Coverage calculation w/genome

ccard28

Member

Join Date: Jan 2012

Posts: 20
#1


09-27-2012, 12:12 PM

Hello,
I am trying to calculate depth of coverage for some RNA-Seq data we have in the lab. I have read through earlier posts but still can't get a grasp on what I think should be a fairly simple formula. I have paired-end reads (18,750,000 reads per file, so 37,500,000 in total) at 100 bp and am using the bovine genome (UMD_3.1/btau6), which appears to have a total sequence length of 2,670,422,299 bp. Would the total coverage be (37,500,000 x 100)/2,670,422,299? That gives only about 1.4x coverage, which seems really low. Is this accurate, or am I doing the calculation wrong? Also, what is an acceptable depth of coverage? These are the read counts before alignment; for an accurate depth of coverage, would I need to use (# reads aligned x 100)/2,670,422,299 instead? Thank you in advance for help clarifying this issue.
BAMseek

Senior Member

Join Date: Apr 2011

Posts: 124
#2

09-27-2012, 01:00 PM

Hi ccard28,

Since you are dealing with RNA-Seq data, you might want to look at the average coverage within the transcriptome rather than across the whole genome. It's good to align to the genome, but you are probably interested in the average number of reads within the gene regions. Also, multiplying the number of aligned reads by the read length is a good back-of-the-envelope calculation, but it might not be quite right because of adapter trimming, quality trimming, soft clipping, or reads that overhang your target regions, any of which can leave a read with fewer aligned bases than the nominal read length.
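For reference, the back-of-the-envelope version (reads x read length / reference length) can be sketched in awk like this. The function name naive_coverage is just made up for illustration, and the numbers in the usage line are the ones from the original question, so treat the result as an upper bound on genome-wide depth rather than a measured value.

```shell
# Rough sketch: naive depth = (number of reads * read length) / reference length.
# naive_coverage is a hypothetical helper name, not part of samtools.
naive_coverage() {
    # $1 = number of reads, $2 = read length (bp), $3 = reference length (bp)
    awk -v n="$1" -v l="$2" -v g="$3" 'BEGIN { printf "%.2f\n", n * l / g }'
}

# Numbers from the question: 37,500,000 reads x 100 bp over 2,670,422,299 bp
naive_coverage 37500000 100 2670422299
```

Swapping in the number of aligned reads for the first argument gives the post-alignment version of the same estimate.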

Here would be my suggestion for calculating the total number of bases aligned to a target region:

Code:

samtools depth -b target.bed in.bam | awk '{s=s+$3};END{print s}'

This gives you the sum of the aligned bases within your target (target.bed, which in your case would be the transcripts). You would then divide this sum by the total number of bases in the transcriptome, being careful not to double-count a base that belongs to more than one transcript. One caveat: I think samtools depth has a max depth of 8000, so very deeply covered positions may be undercounted.
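One way to get the denominator without double counting is to merge overlapping intervals before adding up their lengths. Here is a minimal sketch using only sort and awk. It assumes target.bed is a plain BED file (chrom, 0-based start, end in the first three columns, no track or browser header lines), and the function names merged_bed_length and mean_target_coverage are hypothetical, picked just for this example.

```shell
# Total number of distinct bases covered by a BED file.
# Overlapping or touching intervals on the same chromosome are merged
# so that each base is counted only once.
merged_bed_length() {
    sort -k1,1 -k2,2n "$1" | awk '
        $1 != chrom || $2 > end {      # new chromosome or a gap: close the current block
            total += end - start
            chrom = $1; start = $2; end = $3
            next
        }
        $3 > end { end = $3 }          # overlap: extend the current block
        END { total += end - start; print total + 0 }
    '
}

# Mean depth = summed aligned bases / merged target length.
mean_target_coverage() {
    # $1 = sum of aligned bases (e.g. from the samtools depth | awk line)
    # $2 = merged target length in bp
    awk -v s="$1" -v len="$2" 'BEGIN { if (len == 0) exit 1; printf "%.2f\n", s / len }'
}
```

In use, something like mean_target_coverage "$(samtools depth -b target.bed in.bam | awk '{s=s+$3};END{print s}')" "$(merged_bed_length target.bed)" would print the average depth over the target, assuming both commands succeed.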

Justin