**Brian Bushnell** · 05-12-2015, 07:34 PM

Whole-genome sequencing:

I don't see why indel-calling needs 4x the coverage of SNP-calling; 20x per ploidy seems fine to me for indel-calling, as it does for SNP-calling. In fact, I suggest you mention somewhere on the page that the recommendations are for diploid genomes; you state:

"The coverage values below apply to most organisms while the read recommendations are for mammalian species with genome sizes of ~3Gb"

but that does not really cover the issue of ploidy.
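To make the ploidy point concrete, here is a rough sketch of the arithmetic. It assumes "20x per ploidy" means 20x per haplotype copy, so a diploid needs 40x measured against the haploid reference. The function names and the 150 bp read length are illustrative assumptions, not anything taken from the page being discussed.

```python
import math

def total_coverage(per_ploidy_coverage: float, ploidy: int) -> float:
    """Coverage against the haploid reference needed to reach the
    requested depth on each haplotype copy (assumed interpretation)."""
    return per_ploidy_coverage * ploidy

def reads_needed(coverage: float, genome_size_bp: int, read_length_bp: int) -> int:
    """Number of single reads needed for a given average coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)

# Diploid, ~3 Gb haploid genome, hypothetical 150 bp reads
cov = total_coverage(20, ploidy=2)
print(cov, reads_needed(cov, 3_000_000_000, 150))
```

The same 20x-per-ploidy target on a tetraploid would double the total, which is why a guideline that never states the ploidy is ambiguous.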

For CNVs... "1-8x" coverage seems really low to me. I would reject any analysis that calls virtually anything at 1x. It's important to mention the difference between amplified and unamplified libraries. I don't think amplified libraries are reliable for CNVs, due to amplification biases and randomness. Most of the time, you will probably see a 2x jump in coverage over a duplicated region using highly amplified 8x coverage data... but I would not stake someone's life on that. The bias is reduced as you decrease the number of amplification cycles, but I don't know of a specific study that has analyzed this effect.

Whole-exome sequencing:

Calling a SNP homozygous at 3x coverage will be wrong (purely in terms of hom/het) ~1/8 of the time. I can hardly recommend a process that is wrong 1/8 of the time, though I should mention that when I wrote a variant caller, I got the best results when calling variants as low as 3x coverage. But I still don't recommend it as a guideline for planning things, particularly for exome-capture, which has an inherent ref-bias.
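One way to arrive at the 1/8 figure is a simple binomial argument. The sketch below assumes a truly heterozygous site, reads that sample each allele independently with probability 0.5, and no sequencing errors or ref-bias. Those are simplifying assumptions for illustration, and the function name is made up.

```python
def prob_het_looks_hom_alt(depth: int) -> float:
    """Chance that every read at a heterozygous site carries the variant
    allele, so the site looks homozygous for the variant.
    Assumes unbiased, independent allele sampling and no errors."""
    return 0.5 ** depth

for depth in (3, 5, 10, 20):
    print(f"{depth}x: {prob_het_looks_hom_alt(depth):.7f}")
```

At 3x this gives 0.125, and it falls below 1 in 1000 by 10x. With ref-bias, as in exome-capture, the two alleles are no longer sampled equally, so the equal-odds assumption here would not hold.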

I had very good luck in calling indels from exome-capture data (consistent in trio studies, etc.) but I assume it may be highly bait-system dependent. I only know about the ones that were called successfully, not what was missed, and I assume the ref-bias from baits is much more severe on indels than SNPs. So the recommendation of not selecting exome-capture with the intention of looking for indels seems appropriate. But I would still highly recommend that people with exome-capture data look for indels.

Transcriptome Sequencing/RNA-seq:

If people are interested in differential splicing, you should encourage them to use the longest possible reads (and paired reads). Also, the recommendations you have there are for a number of reads, but what is important is the transcriptome coverage, which varies by genome size and % of genome that is coding. I suggest you make your recommendations in terms of transcriptome coverage rather than a set number of reads (which does not consider read length, genome size, or transcriptome size).
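As a rough illustration of why a fixed read count is not the same as a coverage target, average transcriptome coverage can be estimated as total sequenced bases divided by transcriptome size. The numbers below are hypothetical, and this is only an average: expression levels are very uneven, so per-transcript coverage will differ widely from it.

```python
def transcriptome_coverage(n_reads: int, read_length_bp: int,
                           transcriptome_size_bp: int) -> float:
    """Average fold-coverage of the transcriptome (ignores expression skew)."""
    return n_reads * read_length_bp / transcriptome_size_bp

# Same hypothetical 30M reads, different read lengths and transcriptome sizes
print(transcriptome_coverage(30_000_000, 100, 60_000_000))
print(transcriptome_coverage(30_000_000, 50, 60_000_000))
print(transcriptome_coverage(30_000_000, 100, 120_000_000))
```

The same 30M reads give anywhere from 25x to 50x average coverage depending on read length and transcriptome size, which is the reason to state the recommendation as coverage.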

I have not directly used the other categories so I'll defer to those who have.


Coverage and Read Depth Recommendations by Sequencing Application
