  • Genohub
    Registered Vendor
    • Mar 2013
    • 210

    Coverage and Read Depth Recommendations by Sequencing Application

    Genohub is developing a coverage and read depth guide, based on references in the field: https://genohub.com/recommended-sequ...y-application/. We'd like to ask this community for feedback and references to improve it.

    - Genohub
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    Whole-genome sequencing:

    I don't see why indel-calling needs 4x the coverage of SNP-calling; 20x per ploidy seems fine to me for indel-calling, as it does for SNP-calling. In fact, I suggest you mention somewhere on the page that the recommendations are for diploid genomes. You state:
    "The coverage values below apply to most organisms while the read recommendations are for mammalian species with genome sizes of ~3Gb"
    but that does not really cover the issue of ploidy.

    For CNVs... "1-8x" coverage seems really low to me. I would reject any data that calls virtually anything at 1x. It's important to mention the difference between amplified and unamplified libraries. I don't think amplified libraries are reliable for CNVs, due to amplification biases and randomness. Most of the time, you will probably see a 2x jump in coverage over a duplicated region using highly-amplified 8-fold coverage data... but I would not stake someone's life on that. The bias is reduced as you decrease the number of amplification cycles, but I don't know of a specific study that has analyzed this effect.

    Whole-exome sequencing:

    Calling a SNP homozygous at 3x coverage will be wrong (purely in terms of hom/het) ~1/8th of the time. I can hardly recommend a process that is wrong 1/8 of the time, though I should mention that when I wrote a variant caller, I got the best results when calling variants as low as 3x coverage. But I still don't recommend it as a guideline for planning things, particularly for exome-capture, which has an inherent ref-bias.
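
    The 1/8 figure can be reproduced with a minimal sketch, assuming an idealized model that is not stated in the post: the site is truly heterozygous, each read samples either allele with probability 0.5, and there are no sequencing errors or reference bias. The function name is hypothetical.

```python
def prob_het_called_hom_alt(depth: int) -> float:
    """Chance that every read at a truly heterozygous site carries the alt allele.

    Assumes unbiased, independent sampling of the two alleles and no
    sequencing errors (an idealization; capture ref-bias would shift this).
    """
    return 0.5 ** depth


if __name__ == "__main__":
    # Show how quickly the hom/het miscall rate falls as depth increases.
    for depth in (3, 5, 10, 20):
        print(f"{depth}x: {prob_het_called_hom_alt(depth):.6f}")
```

    Under these assumptions the rate at 3x is 0.5^3 = 1/8, and it halves with each additional read.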

    I had very good luck in calling indels from exome-capture data (consistent in trio studies, etc.), but I assume it may be highly bait-system dependent. I only know about the ones that were called successfully, not what was missed, and I assume the ref-bias from baits is much more severe on indels than on SNPs. So the recommendation not to select exome-capture with the intention of looking for indels seems appropriate. But I would still strongly encourage people who already have exome-capture data to look for indels.

    Transcriptome Sequencing/RNA-seq:

    If people are interested in differential splicing, you should encourage them to use the longest possible reads (and paired reads). Also - the recommendations you have there are for a number of reads; but what is important is the transcriptome coverage, which varies by genome size and % of genome that is coding. I suggest you make your recommendations in terms of transcriptome coverage rather than a set number of reads (which does not consider read length, genome size, or transcriptome size).
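
    As a rough illustration of why a fixed read count is not the same thing as coverage, here is a minimal sketch that converts between the two. It assumes average coverage is simply total sequenced bases divided by transcriptome size; real transcriptomes are expressed very unevenly, so this is only a mean. The function names and the example numbers (30 million reads, 100 bp, a 60 Mb transcriptome) are hypothetical.

```python
import math


def transcriptome_coverage(n_reads: int, read_length: int,
                           transcriptome_bp: int, paired: bool = False) -> float:
    """Average fold coverage = total sequenced bases / transcriptome size."""
    mates = 2 if paired else 1
    return n_reads * read_length * mates / transcriptome_bp


def reads_for_coverage(target_coverage: float, read_length: int,
                       transcriptome_bp: int, paired: bool = False) -> int:
    """Reads (or read pairs, if paired) needed to reach a target average coverage."""
    mates = 2 if paired else 1
    return math.ceil(target_coverage * transcriptome_bp / (read_length * mates))


if __name__ == "__main__":
    # Illustrative values only; substitute the organism's real transcriptome size.
    print(transcriptome_coverage(30_000_000, 100, 60_000_000))
    print(reads_for_coverage(50, 100, 60_000_000, paired=True))
```

    The same 30 million reads give half the coverage on a transcriptome twice as large, or twice the coverage with paired reads, which is why a recommendation stated in coverage transfers across organisms and read lengths while a raw read count does not.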

    I have not directly used the other categories so I'll defer to those who have.
    Last edited by Brian Bushnell; 05-12-2015, 07:37 PM.
