Coverage standards for RNA-sequencing


  • Coverage standards for RNA-sequencing


    As part of my new faculty appointment, I am my department's faculty adviser, helping to get an Illumina sequencing core off the ground for the university.

    While trying to put together some guidelines regarding sequencing coverage, I became quite confused as to what is right.
    Can anyone refer me to the most recent best practices or good papers dealing with this issue?

    The original ENCODE recommendations do not agree much with my experience.
    Apart from the fact that you need at least 3, not 2, biological replicates to do good stats, 30M PE reads do not seem to be enough according to my calculation below:

    Given a human genome size of 3 billion bp, assuming that 80% of the reads will be mapped with high accuracy, and estimating that 10% of the genome is transcribed into polyA RNA (this is the proportion of the genome I usually end up mapping to),
    the average coverage of 30M 100 bp reads (0.03 billion reads) is: (0.03 x 100 x 0.8) / (3 x 0.1) = 8X
    This seems really low. Is my calculation correct?

    Is my mistake in assuming that 10% of the genome gets mapped? (If we assume 2%, then you get 40X coverage, but that is not my experience.)
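
For anyone who wants to rerun the arithmetic with their own numbers, here is a minimal Python sketch of the same back-of-the-envelope calculation. The function name `average_coverage` and its parameter names are made up for illustration; the inputs are the figures quoted in this post (30M reads of 100 bp, 80% mapped, 3 billion bp genome, 10% or 2% of the genome transcribed).

```python
def average_coverage(n_reads, read_length_bp, mapped_fraction,
                     genome_size_bp, transcribed_fraction):
    """Average coverage = mapped bases / bases of the transcribed target."""
    mapped_bases = n_reads * read_length_bp * mapped_fraction
    target_bases = genome_size_bp * transcribed_fraction
    return mapped_bases / target_bases


# Figures from the post: 30M reads x 100 bp, 80% mapped, 3 Gbp genome.
cov_10 = average_coverage(30e6, 100, 0.8, 3e9, 0.10)  # 10% of genome transcribed
cov_02 = average_coverage(30e6, 100, 0.8, 3e9, 0.02)  # 2% of genome transcribed

print(f"10% transcribed: {cov_10:.1f}X")  # 8.0X
print(f" 2% transcribed: {cov_02:.1f}X")  # 40.0X
```

The arithmetic reproduces both figures, so any disagreement with published recommendations would have to come from the assumptions (the transcribed fraction, or treating coverage as uniform), not from the calculation itself.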

    Thanks in advance for the feedback.

  • #2
    Coverage requirements depend upon your experiment. For differential gene expression with a well-annotated genome, biological triplicates at 15M single-end 50bp reads may suffice (e.g., see here). For isoform quantification or transcriptome assembly, 50M PE-100bp reads may be inadequate.


    • #3
      Depth recommendations from GenoHub (use as a guide).


      • #4
        Average coverage is not meaningful for RNA-Seq.
        Coverage is related to the level of expression of the gene.
        More "reads will be captured from highly expressed genes, and few reads will be captured from genes expressed at low levels."
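
To make the point concrete, here is a rough Python sketch of per-gene coverage when reads are split unevenly between genes. Everything numeric in it is a hypothetical illustration, not measured data: the 24M mapped reads (30M x 80%, as in the first post), the 2,000 bp transcript length, and the share of reads assigned to each gene. The function name `gene_coverage` is also invented for this example.

```python
def gene_coverage(mapped_reads, read_fraction, read_length_bp, transcript_length_bp):
    """Coverage of one transcript given the share of mapped reads it receives."""
    reads_on_gene = mapped_reads * read_fraction
    return reads_on_gene * read_length_bp / transcript_length_bp


MAPPED_READS = 24e6  # hypothetical: 30M reads with 80% mapped
READ_LENGTH = 100    # bp
TRANSCRIPT_LEN = 2000  # bp, hypothetical, same for both genes

# Hypothetical read shares: one highly expressed gene, one lowly expressed gene.
high = gene_coverage(MAPPED_READS, 0.01, READ_LENGTH, TRANSCRIPT_LEN)  # 1% of reads
low = gene_coverage(MAPPED_READS, 1e-6, READ_LENGTH, TRANSCRIPT_LEN)   # 0.0001% of reads

print(f"highly expressed gene: {high:.1f}X")
print(f"lowly expressed gene:  {low:.1f}X")
```

With these made-up shares the two genes differ by four orders of magnitude in coverage from the same library, which is why a single genome-wide average says little about whether a given gene is adequately sampled.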


        • #5
          "A transcriptome represents that small percentage of the genetic code that is transcribed into RNA molecules — estimated to be less than 5% of the genome in humans (Frith et al., 2005)."


          • #6
            Coverage and Read Depth by Sequencing Application - a new guide