Sequencing mRNA provides a snapshot of cellular activity, allowing researchers to study the dynamics of cellular processes, compare gene expression across different tissue types, and gain insights into the mechanisms of complex diseases. “mRNA’s central role in the dogma of molecular biology makes it a logical and relevant focus for transcriptomic studies,” stated Sebastian Aguilar Pierlé, Ph.D., Application Development Lead at Inorevia. “One of the major hurdles for mRNA studies is its poor representation in total RNA (3-7%).” Fortunately, the development of specialized library preparation methods for mRNA has vastly improved the ability to target and enrich these molecules, enabling scientists to focus their efforts on this key area of the transcriptome. These refined methods make it possible to explore mRNA with greater accuracy, uncovering its nuanced roles in cellular biology.
Different Methodologies
Emily Leproust, CEO and Co-founder of Twist Bioscience, explained that there are two main approaches to sequencing mRNA. The first is to sequence the RNA directly. Leproust shared that directly sequencing the RNA has several benefits, such as the ability to identify RNA base modifications, but it comes with drawbacks like high expenses, substantial input needs, and reduced accuracy with most present-day tools. An alternative and often preferred approach is to create a cDNA copy and sequence that DNA.
Evan Janzen, Ph.D., NEBNext Development Scientist at New England Biolabs (NEB), pointed out that mRNA library construction can also be categorized by the sequencing platform used. For short-read instruments, for example, the enriched mRNA is fragmented and converted to cDNA, followed by adaptor ligation and indexing PCR to prepare full-length libraries. Long-read sequencing involves reverse transcription of unfragmented transcripts, often using an oligo(dT) primer, to generate full-length cDNA before adding sequencing adaptors. Janzen noted that most users rely on mRNA library prep methods for short-read sequencing platforms, which have the advantages of lower RNA input requirements, higher accuracy, improved transcript detection, and increased throughput. Conversely, he shared that mRNA library prep for long-read sequencing enables full-length transcript sequencing, allowing the characterization of transcription start and stop sites, alternative splicing isoforms, and fusion transcripts.
A typical RNA library prep workflow begins with mRNA enrichment, explained Chen Song, Ph.D., NEBNext Development Scientist at New England Biolabs. This involves using either poly(A) RNA enrichment or ribosomal RNA (rRNA) depletion, followed by library construction from the enriched RNA fraction. Poly(A) RNA enrichment targets the polyadenylated tail of mRNA and is suitable for high-quality RNA samples, providing better coverage of coding transcripts. Song further explained that rRNA depletion removes rRNA to enhance coding transcript representation, which is effective for low-quality and fragmented RNA samples but may result in lower coding region coverage as it retains noncoding transcripts.
Choosing an Approach
Within the various methodologies, there are several recommendations for deciding which approach to use. “It all comes down to the characterization and identity of the sample and the scientific question that needs answering,” explained Pierlé. When characterizing a sample, Pierlé noted that it is essential to implement quality controls that precisely assess the integrity and quantity of the samples. Additionally, he emphasized that understanding its taxonomical identity is important, especially in relation to the chosen enrichment strategies and their compatibility with species that do or do not produce polyadenylated transcripts.
Leproust similarly explained that the choice of library preparation method for RNA sequencing depends largely on the specific goals of the research and the sample type. For studies focused on a limited number of genes, simple pulldowns are adequate, while comprehensive studies of all mRNA, particularly from samples like blood that contain high levels of rRNA and hemoglobin, require the use of specialized kits for rRNA and globin depletion. In addition, Leproust emphasized the importance of using enrichment options that allow users to target their transcripts of interest. These include Twist's own RNA Exome panels, which are particularly useful for broader transcriptomic analyses, as well as targeted viral detection panels, which are capable of identifying known and novel viruses.
Janzen described how to prepare RNA samples, noting that a study starts with evaluating RNA quality using the RNA integrity number (RIN) or DV200 score. High-quality samples can use poly(A) enrichment, while lower-quality, fragmented samples should use rRNA depletion. Janzen also noted that the amount of RNA must fit the library prep method's requirements, and that features that streamline or automate the workflow to increase efficiency and reduce costs should be considered. “To answer more specific biological questions from the study, researchers should also consider whether the library prep method can utilize a unique molecular identifier (UMI) that will improve the accuracy of transcript quantification,” stated Janzen. Furthermore, if the study aims to identify RNA structural features and modifications, Janzen noted that a library prep for long-read sequencing may be more suitable.
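The selection logic described in this section can be sketched as a small decision function. This is only an illustration: the `RnaSample` class, the `choose_enrichment` function, and especially the default cutoffs (`rin_cutoff=7.0`, `dv200_cutoff=70.0`) are assumptions made for the example and are not values given by the interviewees. In practice, cutoffs should come from the chosen kit's documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RnaSample:
    """Hypothetical record of the quality metrics discussed above."""
    name: str
    rin: Optional[float] = None    # RNA integrity number (1-10 scale)
    dv200: Optional[float] = None  # percent of fragments longer than 200 nt
    polyadenylated: bool = True    # False for species without poly(A) transcripts


def choose_enrichment(sample: RnaSample,
                      rin_cutoff: float = 7.0,      # assumed, illustrative
                      dv200_cutoff: float = 70.0):  # assumed, illustrative
    """Return a suggested enrichment strategy for one sample."""
    # Poly(A) selection cannot work if transcripts lack poly(A) tails.
    if not sample.polyadenylated:
        return "rRNA depletion"

    # Prefer RIN when available, otherwise fall back to DV200.
    if sample.rin is not None:
        high_quality = sample.rin >= rin_cutoff
    elif sample.dv200 is not None:
        high_quality = sample.dv200 >= dv200_cutoff
    else:
        raise ValueError(f"{sample.name}: RIN or DV200 is required")

    # High-quality RNA -> poly(A) enrichment; degraded RNA -> rRNA depletion.
    return "poly(A) enrichment" if high_quality else "rRNA depletion"


samples = [
    RnaSample("fresh_tissue", rin=9.2),
    RnaSample("ffpe_block", dv200=40.0),
    RnaSample("non_polyA_species", rin=9.5, polyadenylated=False),
]
for s in samples:
    print(s.name, "->", choose_enrichment(s))
```

Keeping the cutoffs as parameters rather than constants reflects the point made above: the right choice depends on the sample and on the requirements of the library prep method.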
Optimization
When it comes to optimization strategies, Pierlé pointed out that “accurate quantification and integrity estimation are a must.” He favors fluorescent dye-based methods for quantification as they are accessible and provide sufficient sensitivity. This can also be paired with miniaturized agarose gel chips for electrophoresis that allow for integrity calculations. Having an accurate read on these two metrics, Pierlé shared, will orient your choice of the enrichment method, as well as the appropriate number of amplification cycles. Overamplification can lead to artifacts and data distortion, while underamplification results in low yields. To minimize errors, Pierlé advised adhering strictly to cycle number recommendations tailored to secure sufficient sequencing material without overamplification risks. Pierlé also emphasized that regular library quality checks can reveal necessary optimizations and resources like SEQanswers are invaluable for troubleshooting and expert advice.
Optimizing mRNA library prep also involves improving yields and conversion rates at each workflow step, Song noted. For instance, when using a bead-based approach to pull down poly(A) RNA, researchers should choose optimal binding, washing, and elution conditions to maximize the yield. During rRNA depletion steps, Song emphasized using the right probes for the targeted species because this typically involves cleavage of rRNA, and protocols with efficient probes and digestion will maximize library yields. “Use of optimized RT reaction components and incubation times improves the conversion rate from mRNA to cDNA,” advised Song. Furthermore, she noted that adaptor ligation requires tuning adaptor amounts based on RNA input to avoid dimer formation, especially with UMIs, which need more precise cleanup. PCR amplification cycles should be adapted to RNA input for maximum library yields, with optimized cleanup to remove excess primers and dimers. Finally, proper tuning of PCR cycles is important for target enrichment protocols, and avoiding overamplification reduces bias in RNA counting applications.
Sample availability and quality are other important determinants in optimizing experimental results. Leproust recommends using a versatile library preparation kit that can handle a range of input masses and degradation states, which optimizes outcomes for both target enrichment and whole transcriptome workflows. “RNA is a fragile molecule so it’s important that it’s treated with care during mRNA library prep,” she added. Ensuring successful extraction and purification of high-quality RNA without impurities or inhibitors generally leads to better results. For degraded or low-quality samples, such as FFPE samples, Leproust advised utilizing a library prep kit capable of accepting a wide range of input amounts and RNA qualities.
Evolution of mRNA Library Preps
Over the years, mRNA library preparations have progressed significantly. In particular, Janzen noted three areas of major change. First, improved rRNA depletion methods with diverse chemistries and species applicability have led to higher depletion levels. Second, the streamlining of RNA-seq workflows has reduced time, cost, plastic use, and errors, enhancing overall efficiency. Janzen shared that the new NEBNext UltraExpress RNA library prep kit is a good example of this, as it cuts the workflow time in half, and reduces the number of tips and tubes used by 21% and 25%, respectively. Lastly, Janzen explained that the expansion of mRNA library preps for long-read sequencing now offers comprehensive transcript coverage, which is beneficial for de novo transcriptome assembly and identifying splice variants due to the improved gene annotation provided by longer reads.
Building on these advancements, Pierlé shared that enrichment strategies have evolved to better serve various needs. This evolution includes 3’ mRNA-seq, which bypasses the limitations of poly(A) selection on degraded materials, and new depletion methods with diverse, commercially available probe cocktails for various organisms. Another innovation that Pierlé considers ubiquitous is the preservation of strand information. “Without such information, it is difficult if not impossible to accurately quantify gene expression levels for genes with overlapping genomic loci transcribed from opposite strands,” he stated. Strand information also uncovers previously unknown layers of transcriptional regulation and demonstrates the power of transcriptomics for transcript discovery, genome annotation, and expression profiling. Pierlé also highlighted that the incorporation of UMIs is a significant improvement, particularly useful in mRNA studies, as they enable the bioinformatic identification of PCR duplicates, preventing their detrimental impact on downstream data analysis.
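To illustrate how UMIs let software recognize PCR duplicates, the following minimal sketch counts unique molecules per gene by collapsing reads that share the same mapping position and UMI. The function name `count_unique_molecules`, the tuple layout, and the toy reads are assumptions made for this example. It uses exact UMI matching only, whereas dedicated deduplication tools also account for sequencing errors within the UMI itself.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct (position, UMI) pairs per gene.

    reads: iterable of (gene, position, umi) tuples. This layout is a
    simplification chosen for illustration, not a real file format.
    """
    molecules = defaultdict(set)
    for gene, position, umi in reads:
        # Reads with the same gene, position, and UMI are treated as
        # PCR copies of one original molecule and collapse in the set.
        molecules[gene].add((position, umi))
    return {gene: len(keys) for gene, keys in molecules.items()}


# Toy data: four reads, three original molecules.
reads = [
    ("GENE_A", 100, "ACGT"),
    ("GENE_A", 100, "ACGT"),  # PCR duplicate of the read above
    ("GENE_A", 100, "TTAG"),  # same position, different UMI: distinct molecule
    ("GENE_B", 250, "ACGT"),
]

print(count_unique_molecules(reads))
```

Without the UMI, the three GENE_A reads at position 100 would be indistinguishable, so a position-only approach would either keep all three or collapse them to one.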
“Today, there are many flavors of library prep for different applications,” stated Leproust. This diversity enables scientists to access RNA for specialized applications with minimal input. Significant advancements in the enzymes within library prep kits, particularly the improved reverse transcriptase, allow for lower input requirements while achieving higher yields. These improvements increase experimental efficiency while also facilitating the study of rare or low-abundance transcripts. Leproust believes that future advancements in enzymes hold promise to further improve experimental outcomes.
Looking Ahead
With these advances in such a short time, it’s exciting to envision the future of library preparations. Overall, Song predicts an increase in demand for more RNA-based library preparation methods for specific applications, such as healthcare and agriculture. “Depending on the application, modifications may need to be made at each step of the mRNA library prep workflow to suit specific needs,” she added. However, Song also noted that a general protocol meeting high-throughput sequencing needs will be in demand, with further improvements in efficiency, speed, and cost driving broader adoption in various settings.
Leproust anticipates future trends in mRNA library preps will focus on increasing efficiency and reducing hands-on time as sequencing costs decline and the scale of experiments grows. She noted that current library preps can be costly and labor-intensive due to the need for sample quantification and pooling, but that her group is addressing these challenges and developing streamlined solutions. Furthermore, the future of library prep will likely involve novel enzymes tailored to specific needs, such as better ligases for single-cell sequencing with low-input samples and high-conversion efficiencies. These advancements are needed, Leproust noted, as researchers move toward experiments that require high fidelity in RNA sequencing to accurately analyze both mutations and transcript abundance.
Discussing emerging trends in mRNA library preparation, Pierlé observed that the extensive length of many established methods is motivating reagent vendors to streamline protocols and cut experiment times in half. Currently, there is a growing need to reduce the required input material, especially for rare samples and single-cell studies. In addition, methods like Live-seq represent a significant development, enabling ongoing single-cell transcriptome profiling without killing the cell and allowing for time-point analyses in live cells [1]. Pierlé also emphasized the importance of RNA-seq, explaining that Inorevia's Magelia platform offers fully automated, overnight workflows and produces quality-controlled libraries efficiently. He noted that another burgeoning area is epitranscriptomics, where advances in detecting RNA modifications such as m6A are deepening our understanding of mRNA stability and its broader biological impacts [2]. Pierlé concluded that “the continued renewal and expansion of the field are and will be a major player in the understanding of higher-order biological processes.”
References
1. Chen, W. et al. Live-seq enables temporal transcriptomic recording of single cells. Nature 608, 733–740 (2022).
2. Boo, S. H. & Kim, Y. K. The emerging role of RNA modifications in the regulation of mRNA stability. Experimental & Molecular Medicine 52, 400–408 (2020).