The advent of next-generation sequencing (NGS) technologies has revolutionized the study of the human genome with regard to evolution, cancer research, rare disease studies, and population analysis. The original Sanger sequencing method evolved into short-read NGS methods, most notably the technologies developed by Illumina, which produce highly accurate short reads (< 500 bp) and are the current gold standard for clinical and research sequencing. SNVs, CNVs and indels can all be detected at high throughput. One major shortcoming of short-read technology is the read length: it is too short to capture over 70% of human structural variation. Areas with repeats or high GC content, which are often involved in disease mechanisms, are particularly underrepresented within structural variation studies. Third-generation, or long-read, technology has been developed with the goal of sequencing long DNA molecules to achieve coverage of these hypervariable, difficult-to-sequence regions.
PacBio Technology
How it works
PacBio sequencing machines use single-molecule real-time (SMRT) sequencing to produce full-length reads of DNA molecules. A single-stranded circular DNA template, termed a SMRTbell template, is generated by ligating hairpin adapters to both ends of the double-stranded DNA template. A SMRT Cell is a sequencing chip that contains many small wells called zero-mode waveguides (ZMWs), in each of which a DNA polymerase is immobilised. The PacBio Sequel II system contains 8 million ZMWs per SMRT Cell and the latest Revio platform contains 25 million ZMWs. The polymerase within the ZMW binds to a hairpin adapter and sequences the circular template. A read from a ZMW is termed a “continuous long read” (CLR).
This technology uses sequencing by synthesis, similar to that used by short-read technology, to determine the sequence of the DNA template. Each base is fluorescently labelled and produces a signature light pulse when incorporated by the polymerase, which is recorded. Because the template is circular, the polymerase can continue around it and read the second strand. One complete traversal of the template is known as a “pass”. The polymerase can make multiple passes, and each pass produces a subread for analysis: the adapter sequences are discarded and the sequence of the DNA template is retained. The polymerase has a limit to how many times it can efficiently read the template, so more passes are possible with shorter templates. If multiple subreads exist within the CLR, they can be collapsed into a single-molecule circular consensus sequence (CCS), which is more accurate (> 99.9%) than any individual subread because the random errors in one subread are corrected by the others. HiFi sequencing, available on the Sequel IIe system, produces shorter reads (< 17 kb) than traditional PacBio sequencing (> 17 kb), but with higher accuracy.
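The error-correcting effect of multiple passes can be illustrated with a toy sketch. It is not PacBio's CCS algorithm, which aligns subreads and uses a statistical model; it assumes, purely for illustration, that subreads are already the same length, contain only substitution errors, and alternate between the forward strand (even passes) and the reverse strand (odd passes). All function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orient_subreads(subreads: list[str]) -> list[str]:
    """Put every subread on the forward strand.

    Assumption for this sketch: even-numbered passes read the forward
    strand and odd-numbered passes read the reverse strand.
    """
    return [
        s if i % 2 == 0 else reverse_complement(s)
        for i, s in enumerate(subreads)
    ]


def majority_consensus(subreads: list[str]) -> str:
    """Take the most common base at each position (simple majority vote)."""
    if not subreads:
        raise ValueError("at least one subread is required")
    if len({len(s) for s in subreads}) != 1:
        raise ValueError("this toy example needs equal-length subreads")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*subreads)
    )


# Four passes over the template ACGGTCAT; three contain one random error.
subreads = [
    "ACTGTCAT",  # pass 0, forward strand, error at position 2
    "ATGACCGT",  # pass 1, reverse strand, no error
    "ACGGTAAT",  # pass 2, forward strand, error at position 5
    "ATGACCGA",  # pass 3, reverse strand, error at its last base
]
consensus = majority_consensus(orient_subreads(subreads))
print(consensus)  # ACGGTCAT
```

Because each error falls at a different position, every column still has a clear majority and the consensus recovers the template. This is also why shorter templates, which allow more passes, tend to give more accurate consensus reads.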
Iso-Seq is PacBio’s method for full-length transcript RNA sequencing. It relies on HiFi reads, in which the subreads from a single molecule are collapsed into a CCS, to sequence full-length cDNA molecules up to 10 kb in length and to reconstruct the full exonic structure of genes with no assembly required, regardless of which isoform of the gene is transcribed (1).
Applications
PacBio technology can unravel the sequences of complex and highly repetitive regions of the genome, facilitating the investigation of disease and drug resistance mechanisms and the discovery of regulatory and structural elements. It also allows genome-wide analysis of DNA methylation and other epigenetic modifications. Iso-Seq can detect fusion genes, reveal alternative transcription start and end sites, and characterize splicing events, and it is used in medical and agricultural research, including the discovery of new genes and the study of plant development and of biotic and abiotic stresses. Iso-Seq is also available at single-cell level for the detection of isoforms in basic and disease research. De novo genome and transcriptome assembly are further major applications of PacBio technology.
Oxford Nanopore Technology
How it works
Oxford Nanopore Technology (ONT) is based on the passage of single-stranded DNA or RNA molecules through a staphylococcal α-hemolysin protein pore, known as a nanopore. The membranes used are embedded with thousands of nanopores. Adapters added to the strands to be sequenced facilitate their capture by the pore, and a motor protein ligated to the adapter on the 5’ end, along with an applied ion current, moves the strand through the pore. Each nucleotide produces a characteristic interruption in ion current as it passes through the pore, which is detected by sensors and recorded. The pore chemistry allows long molecules to traverse the pore membrane without interruption, so the main limitation is the extraction and preparation of good-quality high molecular weight (HMW) DNA. The length of the DNA recovered is what distinguishes the generation of long reads from ultra-long reads. The technology can also analyse modifications to DNA or RNA, such as methylation, by detecting the current disruptions characteristic of those modifications (2).
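The idea that each nucleotide leaves a characteristic mark on the ion current can be sketched as a nearest-level lookup. This is a deliberately simplified illustration and not how ONT basecalling works in practice, where the signal depends on several neighbouring bases at once and is decoded by trained models. The current levels below are invented for the example, and the function names are hypothetical.

```python
# Hypothetical mean current level (in pA) for each base.
# These numbers are invented for illustration only.
REFERENCE_LEVELS = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}


def call_base(current: float) -> str:
    """Return the base whose reference level is closest to the measurement."""
    return min(
        REFERENCE_LEVELS,
        key=lambda base: abs(REFERENCE_LEVELS[base] - current),
    )


def call_sequence(events: list[float]) -> str:
    """Call one base per measured current event."""
    return "".join(call_base(current) for current in events)


# Five noisy current measurements, one per nucleotide passing the pore.
events = [81.2, 93.7, 111.5, 124.0, 96.1]
called = call_sequence(events)
print(called)  # ACGTC
```

A modified base such as a methylated cytosine could be pictured in the same scheme as an extra entry with its own characteristic level, which is the intuition behind detecting modifications directly from the signal.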
ONT has three platforms that differ in their flow cell capacity. The MinION device is portable and pocket-sized and runs a single flow cell. The GridION platform runs 5 flow cells, each with 512 channels. At Novogene we have the largest of the three platforms, the PromethION. It is ONT’s industrial-scale production unit and runs 48 flow cells, each with 3,000 channels. Its flow cells have roughly 6 times as many channels as those of the other platforms, can deliver six-fold more throughput per flow cell, and generate 50–100 Gb of long-read data each. Using this platform, Novogene has achieved reads as long as 18.2 kb.
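The capacity figures above can be checked with a little arithmetic. The sketch below assumes the quoted channel counts are per flow cell and that every flow cell on a PromethION yields data in the 50–100 Gb range quoted here; the `Platform` class is a hypothetical helper, not part of any ONT software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Flow cell capacity of a sequencing platform (figures from the text)."""

    name: str
    flow_cells: int
    channels_per_flow_cell: int  # assumption: quoted counts are per flow cell

    @property
    def total_channels(self) -> int:
        return self.flow_cells * self.channels_per_flow_cell


gridion = Platform("GridION", flow_cells=5, channels_per_flow_cell=512)
promethion = Platform("PromethION", flow_cells=48, channels_per_flow_cell=3000)

# Ratio of channels per flow cell, the basis of the "6 times" figure.
channel_ratio = (
    promethion.channels_per_flow_cell / gridion.channels_per_flow_cell
)

# Output range if all 48 PromethION flow cells are run (assumption).
min_gb_per_cell, max_gb_per_cell = 50, 100
full_run_gb = (
    promethion.flow_cells * min_gb_per_cell,
    promethion.flow_cells * max_gb_per_cell,
)

print(f"Channels per flow cell ratio: {channel_ratio:.2f}")
print(f"Total channels: {gridion.total_channels} vs {promethion.total_channels}")
print(f"Full PromethION run: {full_run_gb[0]}-{full_run_gb[1]} Gb")
```

Under these assumptions the per-flow-cell ratio comes to just under six, and a fully loaded PromethION run would fall between 2,400 and 4,800 Gb.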
Applications
The applications of PacBio and Oxford Nanopore technologies have a lot in common. Both sequence long DNA molecules, which means they can detect similar features: structural variants, hypervariable and high-GC regions, splice variants, and fusion genes. Each has advantages and disadvantages relative to the other. PacBio HiFi reads have lower error rates than ONT reads, while ONT platforms provide longer reads than PacBio CLR platforms.
The main feature of ONT that sets it apart from PacBio is the portability of its MinION and GridION platforms. This allows for their application in the field or in rural regions where shipping samples to sequencing centres presents challenges, to provide real-time data to researchers. The MinION platform was most notably used in West Africa to analyse Ebola samples during the viral outbreak. In 2016, a group of scientists used the MinION platform to assemble genomes on the International Space Station and were able to show there was minimal difference in the quality of the sequencing in space compared to that on Earth (3).
In the 40 years since the advent of Sanger sequencing, nucleic acid sequencing technology has developed at an exponential rate. It is becoming increasingly accessible and affordable, and the exciting field applications of the technology are becoming clear. At Novogene we strive to keep delivering the latest NGS technology and innovative tools to our customers to help them reach their research goals. We are excited to have been a part of this rapidly growing industry for the past decade and even more excited for what is to come over the next decade!
References
- Pacific Biosciences. https://www.pacb.com/. Accessed 20/12/2022.
- Oxford Nanopore Technologies. https://nanoporetech.com/. Accessed 20/12/2022.
- Castro-Wallace, S. L. et al. Nanopore DNA Sequencing and Genome Assembly on the International Space Station. bioRxiv (2016).