Getting mapped read count from a BED file - bedtools coverage?

samhokin

Member

Join Date: Nov 2013

Posts: 20
#1


12-20-2013, 02:22 PM

Hi, all. I've been trying to analyze an experiment that I downloaded from GEO, GSE34241, which has four samples assayed with RNA-Seq (AB SOLiD System). Apart from being interested in the expression of some genes in this experiment, I'm using it as an exercise in dealing with new file formats (a task that never ends).

The authors did not upload any data in the Series Matrix or SOFT files. Instead, they uploaded four BED files. After spending probably way too much time trying to figure out how to extract matches against the TAIR10 genome, I finally downloaded the latest bedtools2 from GitHub, and lo and behold, it has a nice coverage sub-command that works with these files. I've checked the first output number, the number of features in the sample file that overlap each interval in the annotation (GFF) file, and it pans out for some genes I know. The other three outputs are: the number of bases in the interval that had non-zero coverage; the length of the interval; and the fraction of bases in the interval that had non-zero coverage.
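For illustration, the four numbers described above can be pulled out of one output line with a few lines of Python. This is only a sketch under assumptions: it assumes the output is tab-separated, that each line repeats the nine GFF columns and then appends the four coverage values, and the example line uses made-up counts. The names `CoverageRecord` and `parse_coverage_line` are hypothetical.

```python
from typing import NamedTuple


class CoverageRecord(NamedTuple):
    feature_type: str        # GFF column 3, e.g. "gene"
    attributes: str          # GFF column 9, e.g. "ID=AT1G01010;..."
    n_overlaps: int          # sample features overlapping the interval
    bases_covered: int       # bases in the interval with non-zero coverage
    length: int              # length of the interval
    fraction_covered: float  # bases_covered / length, as reported


def parse_coverage_line(line: str) -> CoverageRecord:
    """Parse one tab-separated coverage line (assumed: 9 GFF columns + 4 values)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 13:
        raise ValueError(f"expected at least 13 columns, got {len(fields)}")
    n_overlaps, bases_covered, length, fraction = fields[-4:]
    return CoverageRecord(
        feature_type=fields[2],
        attributes=fields[8],
        n_overlaps=int(n_overlaps),
        bases_covered=int(bases_covered),
        length=int(length),
        fraction_covered=float(fraction),
    )


# Made-up example line: the counts are illustrative, not real data.
example = (
    "Chr1\tTAIR10\tgene\t3631\t5899\t.\t+\t.\t"
    "ID=AT1G01010;Name=AT1G01010\t42\t1500\t2269\t0.6610842"
)
rec = parse_coverage_line(example)
print(rec.feature_type, rec.n_overlaps, rec.fraction_covered)
```

Keeping the fraction next to the count makes it easy to spot intervals where many features overlap but only a small part of the interval is covered.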

So, I'm tempted to use the first number, the count of overlapping features, as my read counts for the usual downstream analysis with DESeq2 (normalization, DE analysis). But are there things I should look out for in the bedtools coverage output, for example when the fraction of bases in an annotated interval that were not covered is large?

The command I used is, for example:

bedtools coverage -a GSM845432_F1DPI_TAIR10.bed -b TAIR10_GFF3_genes.gff > GSM845432_F1DPI_TAIR10.txt
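One possible way to turn per-sample output files like the one above into a gene-by-sample count table is sketched below. It rests on several assumptions: the GFF holds more than one feature type per gene (gene, mRNA, exon and so on), so only rows typed `gene` are kept; the gene identifier sits in an `ID=` attribute; and the count is the fourth column from the end. The sample labels, the counts in the demo lines, and the function names are hypothetical.

```python
import csv
import re
import sys

GENE_ID = re.compile(r"ID=([^;]+)")


def gene_counts(lines):
    """Return {gene_id: overlap count} for rows whose GFF type is 'gene'."""
    counts = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 13 or fields[2] != "gene":
            continue  # skip mRNA, exon, CDS, ... rows so each gene appears once
        match = GENE_ID.search(fields[8])
        if match:
            counts[match.group(1)] = int(fields[-4])
    return counts


def count_matrix(samples):
    """samples maps a sample label to an iterable of coverage output lines."""
    per_sample = {name: gene_counts(lines) for name, lines in samples.items()}
    genes = sorted(set().union(*per_sample.values()))
    header = ["gene_id"] + list(per_sample)
    rows = [[g] + [per_sample[s].get(g, 0) for s in per_sample] for g in genes]
    return header, rows


# In practice each value would be an open file handle; these lines are made up.
demo = {
    "sample_A": [
        "Chr1\tTAIR10\tgene\t3631\t5899\t.\t+\t.\tID=AT1G01010;Name=AT1G01010\t42\t1500\t2269\t0.6610842",
        "Chr1\tTAIR10\tmRNA\t3631\t5899\t.\t+\t.\tID=AT1G01010.1;Parent=AT1G01010\t42\t1500\t2269\t0.6610842",
    ],
    "sample_B": [
        "Chr1\tTAIR10\tgene\t3631\t5899\t.\t+\t.\tID=AT1G01010;Name=AT1G01010\t17\t900\t2269\t0.3966505",
    ],
}
header, rows = count_matrix(demo)
writer = csv.writer(sys.stdout, delimiter="\t")
writer.writerow(header)
writer.writerows(rows)
```

A table in this shape (genes as rows, samples as columns, integer counts) is what DESeq2's DESeqDataSetFromMatrix takes as its count input. Note that a feature overlapping two annotated genes would be counted once for each of them with this approach.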

Thanks for any tips! This is a learning exercise; the RNA-Seq data that my lab generated was read-mapped by DNAnexus, and I'm hoping that they know what they're doing. At least it's a standardized workflow.

Sam Hokin
Computational Scientist, Carnegie and NCGR