  • ritzriya
    Member
    • Jun 2010
    • 49

    Check coverage from RNA-seq data???

    Dear All,

    I have read in many forums and online discussions that coverage cannot be calculated for transcriptome data the way it is for a genome, because transcript levels change from sample to sample. I agree with this.

    All I want to know is: if my client says they have given us 20x coverage in RNA-seq data for a particular organism, how can I confirm that the coverage really is 20x? Is there any way to do this?

    Any insight on this will help me a lot...

    Thanks in advance..
  • ritzriya
    Member
    • Jun 2010
    • 49

    #2
    No answers yet!! 99 viewers but no solution to this problem?

    Please help..

    • Jeremy
      Senior Member
      • Nov 2009
      • 190

      #3
      Is it normalised RNA?

      • ritzriya
        Member
        • Jun 2010
        • 49

        #4
        Dear Jeremy,

        Well, I am not sure about that, as the wet-lab experiment was done by a different group of people. Say I assume it's normalised (I shall ask them about it, though :-( ), then what next?

        Thanks..

        • Jeremy
          Senior Member
          • Nov 2009
          • 190

          #5
          What platform was used? And was it paired-end sequencing?

          You should be able to mine the information that you want out of the data that is available. A good way to gauge how well you have covered the transcriptome is by the number of singletons you obtain.

          • ritzriya
            Member
            • Jun 2010
            • 49

            #6
            Originally posted by Jeremy
            What platform was used? And was it paired-end sequencing?

            You should be able to mine the information that you want out of the data that is available. A good way to gauge how well you have covered the transcriptome is by the number of singletons you obtain.

            It was an Illumina GAIIx, and yes, it was paired-end sequencing.

            That was indeed useful information! Thanks. But while coverage can be calculated for a genome, can it not be calculated for the RNA-seq reads from a particular tissue/cell type of an organism?

            • seidel
              Junior Member
              • Mar 2008
              • 3

              #7
              Wouldn't the way to calculate average coverage for a transcriptome be to just take a description of all the known exons (e.g. from UCSC or Ensembl), and calculate the average depth across the features? Sure there may be novel exons, but as an estimate, average coverage across known features would give an average depth. No?
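A minimal sketch of that calculation, assuming the exon annotation and the read alignments have already been reduced to half-open (start, end) intervals on a single chromosome. The function names and the toy data are illustrative only and do not come from any particular tool.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping exons so that no annotated base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def mean_exon_depth(exons: Iterable[Interval], reads: Iterable[Interval]) -> float:
    """Average per-base depth: aligned read bases inside exons / total exon bases."""
    merged = merge_intervals(exons)
    exon_bases = sum(end - start for start, end in merged)
    if exon_bases == 0:
        raise ValueError("no exon bases to average over")
    covered = 0
    for r_start, r_end in reads:
        for e_start, e_end in merged:
            # Length of the overlap between this read and this exon (0 if none).
            covered += max(0, min(r_end, e_end) - max(r_start, e_start))
    return covered / exon_bases


# Toy data (hypothetical): two 100 bp exons and three 50 bp reads.
exons = [(0, 100), (200, 300)]
reads = [(0, 50), (25, 75), (250, 300)]
print(mean_exon_depth(exons, reads))  # 150 read bases / 200 exon bases = 0.75
```

The nested loop is quadratic and only meant to show the arithmetic; for real data a per-base depth track from an aligner toolkit would be the practical input.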

              • Jeremy
                Senior Member
                • Nov 2009
                • 190

                #8
                Originally posted by seidel
                Wouldn't the way to calculate average coverage for a transcriptome be to just take a description of all the known exons (e.g. from UCSC or Ensembl), and calculate the average depth across the features? Sure there may be novel exons, but as an estimate, average coverage across known features would give an average depth. No?

                If you have normalised RNA, maybe, and even then there is still a fairly large copy number difference between highly expressed and lowly expressed genes. Don't forget, though, that a different set of genes is expressed in each tissue type, so that method would give zero for many genes simply because they are naturally not present.
                If it's not normalised, then you are taking the average across genes whose expression differs several fold, which isn't a very good indicator.
                Last edited by Jeremy; 02-13-2011, 07:11 PM.

                • dariober
                  Senior Member
                  • May 2010
                  • 311

                  #9
                  Hello,
                  I agree that looking for a 'transcriptome coverage' is not sensible since, in contrast to the genome, the transcriptome varies over time and tissue.

                  Maybe a more meaningful statistic for assessing the depth of transcriptome sequencing is the 'transcript detection threshold', i.e. what is the minimal expression level that my sequenced library can detect? So, if a typical (!?) human cell has approximately 300000 mRNA molecules (see http://bionumbers.hms.harvard.edu/bi...r=3&hlid=43015), then with 3 million reads you can assign ~10 reads to a transcript expressed at a level of 1 molecule/cell. (...I'm aware there is a lot of hand waving here).
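The back-of-the-envelope arithmetic can be written out as a rough sketch. It assumes that reads sample mRNA molecules uniformly and ignores transcript length, mapping rate and library bias; the function names and the 5-read threshold are hypothetical.

```python
def reads_per_molecule(total_reads: int, mrna_per_cell: int) -> float:
    """Expected reads for a transcript present at one molecule per cell,
    assuming every mRNA molecule is equally likely to yield a read."""
    if mrna_per_cell <= 0:
        raise ValueError("mrna_per_cell must be positive")
    return total_reads / mrna_per_cell


def detection_threshold(min_reads: float, total_reads: int, mrna_per_cell: int) -> float:
    """Lowest expression level (molecules per cell) expected to yield
    at least min_reads reads under the same uniform-sampling assumption."""
    return min_reads / reads_per_molecule(total_reads, mrna_per_cell)


# Figures from the post: ~300000 mRNA molecules per cell, 3 million reads.
print(reads_per_molecule(3_000_000, 300_000))      # 10.0 reads per molecule/cell
# Hypothetical threshold: call a transcript detected at 5 or more reads.
print(detection_threshold(5, 3_000_000, 300_000))  # 0.5 molecules/cell
```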

                  Does it make sense, at least in principle?

                  My 2p
                  Dario

                  • ritzriya
                    Member
                    • Jun 2010
                    • 49

                    #10
                    Originally posted by Jeremy
                    If you have normalised RNA, maybe, and even then there is still a fairly large copy number difference between highly expressed and lowly expressed genes. Don't forget, though, that a different set of genes is expressed in each tissue type, so that method would give zero for many genes simply because they are naturally not present.
                    If it's not normalised, then you are taking the average across genes whose expression differs several fold, which isn't a very good indicator.

                    I agree with Jeremy. I know what normalization is in general, but I am unclear what it would mean for RNA-Seq reads. Is it filtering out low-quality reads, or simply removing redundant reads from the data? Apologies for the silly question.

                    • Jeremy
                      Senior Member
                      • Nov 2009
                      • 190

                      #11
                      By normalised RNA, I mean the use of a normalised RNA library. Not sure if this is the most appropriate reference, but it gives the idea and a starting point: "Construction and characterization of a normalized cDNA library". I think I remember reading in here somewhere about someone who used that approach (though probably not the exact method from that reference), but I'm not sure.

                      The review "RNA-Seq: a revolutionary tool for transcriptomics" cites some work that investigates coverage. But again, it's not something that can be meaningfully applied to a single sequencing run.

                      • Thorondor
                        Member
                        • Feb 2011
                        • 69

                        #12
                        The coverage won't be homogeneous across your transcriptome.

                        You could cluster your reads, which will give you an idea of the coverage.

                        But how does your client know that they are sending you 20x coverage RNA-seq? It seems more like a wild guess.

                        • ritzriya
                          Member
                          • Jun 2010
                          • 49

                          #13
                          Thanks Jeremy, I have read the articles and I have a better idea of normalization of a library now.

                          @thorondor - If I am not wrong, a sequencer can be set to produce reads at a particular coverage, provided the sample preparation was done with a target of approximately 20x coverage in mind.
                          P.S. Do correct me. I am more into bioinformatics and have less knowledge of the wet lab.

                          And on what basis should I cluster my reads?

                          • Thorondor
                            Member
                            • Feb 2011
                            • 69

                            #14
                            That might be so, but then you imply that the sample is perfectly normalized, which is not the case. ;-) I am also more into bioinformatics. ;-)

                            It is still recommended to trim your reads, and I doubt that your coverage is still around 20x after trimming. Or is it not your job to assemble them?

                            After some more thinking, my new conclusion is that only clustering the reads is not the best option.
                            Better to trim first, then assemble, then map the reads back to your assembled contigs. Or assemble with Velvet; the contig IDs will contain the coverage of each contig, e.g. >NODE_xxxx_length_xxxx.xxxx_cov_xxxxxx.xxxxx
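Pulling the coverage out of such contig IDs could look roughly like this. It is a sketch that assumes headers of the form >NODE_<id>_length_<integer>_cov_<number>; the regular expression, function names and example headers are made up for illustration. As far as I recall, Velvet reports both length and coverage in k-mer units rather than nucleotides, so treat the result as a relative indicator.

```python
import re
from typing import Iterable, List, Tuple

# Assumed header layout: >NODE_<id>_length_<int>_cov_<float>
HEADER = re.compile(r"^>NODE_(\d+)_length_(\d+)_cov_(\d+(?:\.\d+)?)")


def parse_contig_headers(lines: Iterable[str]) -> List[Tuple[int, float]]:
    """Return (length, coverage) for every FASTA header that matches HEADER."""
    contigs: List[Tuple[int, float]] = []
    for line in lines:
        match = HEADER.match(line)
        if match:
            contigs.append((int(match.group(2)), float(match.group(3))))
    return contigs


def weighted_mean_coverage(contigs: List[Tuple[int, float]]) -> float:
    """Length-weighted mean coverage, so long contigs count for more."""
    total_length = sum(length for length, _ in contigs)
    if total_length == 0:
        raise ValueError("no contigs parsed")
    return sum(length * cov for length, cov in contigs) / total_length


# Hypothetical contigs file content (sequence lines are ignored).
fasta_lines = [
    ">NODE_1_length_300_cov_10.000000",
    "ACGTACGT",
    ">NODE_2_length_100_cov_30.000000",
    "ACGTACGT",
]
contigs = parse_contig_headers(fasta_lines)
print(weighted_mean_coverage(contigs))  # (300*10 + 100*30) / 400 = 15.0
```

Looking at the spread of per-contig coverage, and not only the mean, shows how uneven the transcriptome coverage is.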

                            • colindaven
                              Senior Member
                              • Oct 2008
                              • 417

                              #15
                              If I were you I wouldn't worry so much about these coverage claims. As others have suggested, just look at how many reads were aligned to each gene/exon/transcript and at some summary boxplots.

                              I have found the Bioconductor package edgeR and typical R boxplots to be excellent for this. There's an excellent guide for edgeR on this site too.
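For anyone not working in R, the numbers behind such a boxplot can be sketched with the Python standard library. This is not edgeR, only a plain five-number summary of log2(count + 1) per gene; the function name, gene names and counts are invented for illustration.

```python
import math
import statistics
from typing import Dict


def summarise_counts(counts: Dict[str, int]) -> Dict[str, float]:
    """Five-number summary of log2(count + 1) across genes, plus the
    number of genes with at least one aligned read."""
    if len(counts) < 2:
        raise ValueError("need counts for at least two genes")
    values = sorted(math.log2(c + 1) for c in counts.values())
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": values[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": values[-1],
        "genes_detected": sum(1 for c in counts.values() if c > 0),
    }


# Hypothetical per-gene read counts from an alignment.
counts = {"geneA": 0, "geneB": 1, "geneC": 3, "geneD": 7, "geneE": 1023}
summary = summarise_counts(counts)
print(summary["median"], summary["genes_detected"])  # 2.0 4
```

The log transform keeps a handful of very highly expressed genes from dominating the summary.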
