The spectrum of somatic single-nucleotide variants in cancer genomes often reflects the signatures of multiple distinct mutational processes, which can provide clinically actionable insights into cancer etiology. Existing software tools for identifying and evaluating these mutational signatures do not scale to large datasets containing thousands of individuals or millions of variants.
We introduce Helmsman, a program designed to perform mutation signature analysis on arbitrarily large sequencing datasets.
Whole-genome sequencing data must undergo extensive quality control to ensure that the variants identified in an individual’s genome are true biological differences and not the result of errors that can occur throughout the many stages of sample preparation and sequencing. Many such errors can be avoided by collecting, storing, transferring, and preparing the biological samples according to established best practices. Human error is inevitable, however, and occasionally a few DNA samples will be degraded, oxidized, or sloshed into another well of the plate.