Next-Generation Sequencing and the Transformation of Cancer Care
Traditionally, DNA sequencing relied on two methods developed independently in the late 1970s: Frederick Sanger's chain-termination method (based on chain-terminating inhibitors of DNA synthesis) and Allan Maxam and Walter Gilbert's chemical degradation method. Both were extremely successful and widely used in genomics and molecular biology, but have undergone only incremental improvements since their introduction. The Sanger method proved robust enough to be automated with capillary technology, and powerful enough to support the sequencing of the first draft human genome, published in 2001.
Since then, a drive for higher sequencing throughput and lower cost, combined with developments in bioinformatics and novel approaches in microfabrication, has paved the way for technology capable of sequencing massive numbers of DNA templates in parallel. Broadly called Next-Generation Sequencing (NGS), it represents one of the major advances in the biological sciences of the last few decades. NGS is a rapidly evolving field: second-generation (2G) platforms rely on clonal amplification of templates, while third-generation (3G) systems employ single-molecule templates and cycle-free chemistry. More recently, fourth-generation (4G) sequencers incorporating nanopore technologies have been under development.
NGS is revolutionizing the field of genome biology, with much faster data generation, increased accuracy, and a dramatic reduction in sequencing costs, by as much as five orders of magnitude. Multiple genomes can now be sequenced in parallel on a single instrument in a matter of days. The technology is enabling global collaborative projects such as The Cancer Genome Atlas, which aims to interpret the sequences of thousands of cancers. In the medical field, NGS is already having an impact on genetic screening and holds great potential in oncology, given the genetic basis of cancer. It has the potential to identify rare but clinically important mutations, classify tumors, assess tumor progression, assist in the design and use of targeted therapeutics, monitor response, and predict outcomes.
Most bioinformatic approaches for processing NGS data were initially designed for the genomes of normal, healthy organisms and tissues. Cancerous cells and tissues, in contrast, present unique challenges, such as an enormous variety of genetic aberrations that make it difficult to rely on standard reference sequences. Large-scale structural chromosomal alterations are common, as are epigenomic changes. An additional level of complexity arises from heterogeneity within tumors. The available NGS bioinformatic tools are still poorly suited to analyzing cancer genomic data, so very close collaboration between bioinformaticians and clinicians is required to process and interpret the NGS output before it can be applied clinically. Some bioinformatic solutions are being actively developed to process cancer-specific NGS data, such as software designed to identify insertions and deletions (indels), copy number variants (CNVs), or somatic mutations using matched samples from tumor and normal tissue.
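The idea behind tumor-normal comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular tool: the names `AlleleCounts` and `call_somatic`, the thresholds, and the toy read counts are all hypothetical, and real somatic callers use statistical models that also account for sequencing error, tumor purity and subclonality.

```python
from typing import Dict, List, NamedTuple, Tuple


class AlleleCounts(NamedTuple):
    """Read support at one genomic site (hypothetical toy structure)."""
    ref: int  # reads supporting the reference allele
    alt: int  # reads supporting the alternate allele

    @property
    def depth(self) -> int:
        return self.ref + self.alt

    @property
    def vaf(self) -> float:
        """Variant allele fraction: share of reads carrying the alternate allele."""
        return self.alt / self.depth if self.depth else 0.0


Site = Tuple[str, int]  # (chromosome, position)


def call_somatic(
    tumor: Dict[Site, AlleleCounts],
    normal: Dict[Site, AlleleCounts],
    min_depth: int = 20,           # assumed threshold, for illustration only
    min_tumor_vaf: float = 0.05,   # assumed threshold, for illustration only
    max_normal_vaf: float = 0.01,  # assumed threshold, for illustration only
) -> List[Site]:
    """Return sites where the variant is seen in the tumor but not in the normal."""
    somatic = []
    for site, t in sorted(tumor.items()):
        n = normal.get(site)
        # Skip sites that are missing or too shallow in either sample.
        if n is None or t.depth < min_depth or n.depth < min_depth:
            continue
        # Present in tumor, (nearly) absent in normal: likely somatic.
        if t.vaf >= min_tumor_vaf and n.vaf <= max_normal_vaf:
            somatic.append(site)
    return somatic


# Invented toy data: positions and counts are not real observations.
TUMOR = {
    ("chr1", 100): AlleleCounts(50, 50),  # also present in normal: germline
    ("chr2", 200): AlleleCounts(70, 30),  # tumor-only variant
    ("chr3", 300): AlleleCounts(5, 5),    # too shallow to call
}
NORMAL = {
    ("chr1", 100): AlleleCounts(48, 52),
    ("chr2", 200): AlleleCounts(100, 0),
    ("chr3", 300): AlleleCounts(40, 0),
}

print(call_somatic(TUMOR, NORMAL))
```

Subtracting the matched normal in this way removes inherited (germline) variants, which is why paired samples are preferred over comparison against a generic reference alone.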
Another critical factor in cancer-specific NGS is the need for longer read lengths. Complete de novo assembly of most genomes is impossible when repeat sequences are longer than the reads, because such repeats cannot be placed unambiguously. This is particularly important in tumors, where variation in copy number or genome structure has a significant impact on data interpretation. In the past few years, longer-read technologies, such as that developed by Nabsys [1], have begun to provide access to more extensive and accurate datasets, generating information about the size, orientation and location of large-scale structural variants. Nabsys has developed the first solid-state measurement of individual DNA molecules. The device resembles the electronic chips found in commercial electronic devices, except that DNA molecules physically flow through the chip and are read electronically as they pass. DNA translocates through each nanochannel at a rate of one million bases per second. The electronic detection relies on markers, or tags, placed at intervals below the diffraction limit of light. Compared with current long-distance mapping technologies, this approach delivers superior resolution and accuracy and a very rapid time-to-results. It also offers a cost-effective way to create genome-wide positional maps, validate contigs assembled from short reads, aid in the sequence assembly of NGS data, and identify and analyze structural abnormalities.
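The effect of repeats on read placement can be illustrated with a toy example. The sketch below is only a simplified model and does not represent the Nabsys method or any real assembler: the sequence `GENOME`, the 7-base repeat, and the function name `ambiguous_fraction` are invented for illustration, and reads are assumed to be error-free substrings of the genome.

```python
from collections import Counter

# Invented toy genome: two copies of a 7-base repeat separated by unique sequence.
REPEAT = "GATTACA"
GENOME = "CCGCG" + REPEAT + "TTGGC" + REPEAT + "AACTG"


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Fraction of all error-free reads of a given length whose sequence
    occurs at more than one position in the genome."""
    if not 1 <= read_len <= len(genome):
        raise ValueError("read_len must be between 1 and len(genome)")
    # Every possible read of this length, one per start position.
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    # A read is ambiguous if an identical sequence starts elsewhere too.
    return sum(1 for r in reads if counts[r] > 1) / len(reads)


for read_len in (4, 7, 8, 12):
    frac = ambiguous_fraction(GENOME, read_len)
    print(f"read length {read_len:2d}: {frac:.2f} of reads are ambiguous")
```

In this toy genome, reads no longer than the repeat can match both copies, whereas every read longer than the repeat includes some unique flanking sequence and can be placed at a single position. Real genomes contain repeats thousands of bases long, which is the motivation for longer reads and long-range maps.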
The Nabsys technology produces reads tens of thousands to hundreds of thousands of bases long, enabling the analysis of DNA variation at all scales, from single-nucleotide variants to large structural changes. Positional sequencing is already being used for genome mapping and assembly and for the detection of structural variants. As the throughput of the technology continues to increase, it promises to provide novel solutions in oncology, such as mapping heterogeneous tumor samples.
NGS is opening up unprecedented opportunities to analyze the genetic determinants of cancer, to categorize tumors, and to guide treatment management. In cancer diagnostics, NGS is rapidly evolving toward greater reliability and ease of use in the clinic, while enabling the analysis of limited amounts of clinical sample with a clinically suitable time-to-results. Although the unique challenges inherent to cancer genomics will take a few years to overcome, improvements are occurring at a rapid pace.