Dear Brian Bushnell,
Thanks a lot for the detailed explanation on this. Everything is now very clear.
-
Wow - you had very uneven coverage.
"Percent of scaffolds with any coverage" means that - well... let's assume you had a human reference genome, which has 25 chromosomes: 1-22, X, Y and M.
In that case, if each of those 25 sequences had at least one read hit, then the percentage of scaffolds with coverage would be 100%. You can get more details in the per-scaffold coverage file to see what percent of each scaffold was covered... in general, for a complete genome, "scaffold" means "chromosome".
0.32% refers to the percent of bases across the entire genome that had any coverage, and you can consult the histogram for more details. But essentially, (100% - 0.32%) = 99.68% of the genome had zero coverage. I assume this was a ChIP-seq experiment or similar, where the expectation is that 99.9% of the coverage falls on 0.1% of the genome.
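To make the difference between the two percentages concrete, here is a minimal Python sketch on toy data. It is not BBMap's implementation; the function name `coverage_summary` and the per-base depth lists are assumptions made up for illustration.

```python
def coverage_summary(depths_by_scaffold):
    """Summarize coverage from a dict of scaffold name -> list of per-base depths.

    Hypothetical helper for illustration only; not part of BBMap.
    """
    total_bases = sum(len(d) for d in depths_by_scaffold.values())
    # A base counts as covered if at least one read overlaps it.
    covered_bases = sum(1 for d in depths_by_scaffold.values() for x in d if x > 0)
    # A scaffold counts as covered if any of its bases is covered.
    scaffolds_hit = sum(1 for d in depths_by_scaffold.values() if any(x > 0 for x in d))
    total_depth = sum(sum(d) for d in depths_by_scaffold.values())
    return {
        "average_coverage": total_depth / total_bases,
        "percent_scaffolds_with_any_coverage": 100.0 * scaffolds_hit / len(depths_by_scaffold),
        "percent_bases_covered": 100.0 * covered_bases / total_bases,
    }


# Toy genome: two scaffolds of 10 bases, each with one deeply covered base.
toy = {
    "chr1": [0, 0, 0, 0, 0, 0, 0, 0, 0, 40],
    "chr2": [0, 0, 0, 0, 0, 0, 0, 0, 60, 0],
}
summary = coverage_summary(toy)
print(summary)
```

In this toy case every scaffold has a hit (100%), yet only 2 of 20 bases are covered (10%), and the average depth is still 5 because the few covered bases are very deep. That is the same pattern as in the summary above, just less extreme.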
-
Dear Brian Bushnell,
Thanks a lot for the suggestion. I just ran the BBMap pileup program on one of my sam files, and the summary output looks like this:
Average coverage: 9.75
Percent scaffolds with any coverage: 100.00
Percent of reference bases covered: 0.32
I guess this means only 0.32% of the reference genome was covered by reads at any depth. Then what does "Percent scaffolds with any coverage" mean? Thanks in advance for your help.
-
And there's also...
The BBMap suite's pileup program! It takes sam or bam, sorted or unsorted.
pileup.sh in=mapped.sam out=stats.txt hist=histogram.txt
stats.txt will contain the average depth and percent covered of each reference sequence; the histogram will contain the exact number of bases at each coverage level. You can also get per-base coverage or binned coverage if you want to plot the coverage. It also generates the median, standard deviation, and so forth.
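If you want the breadth of coverage as a single number, it can be recomputed from the histogram. The Python sketch below assumes a two-column, whitespace-separated layout (depth, then number of bases at that depth) with `#`-prefixed header lines; that layout and the function name `breadth_from_histogram` are assumptions, so check the header of your own histogram.txt before relying on it.

```python
def breadth_from_histogram(lines):
    """Return the percent of bases with depth > 0.

    Assumes each data line is 'depth<whitespace>base_count' and that
    header lines start with '#'. Hypothetical helper, not part of BBMap.
    """
    total = 0
    covered = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and headers
        depth_str, count_str = line.split()[:2]
        depth, count = int(depth_str), int(count_str)
        total += count
        if depth > 0:
            covered += count
    return 100.0 * covered / total if total else 0.0


# Made-up histogram: 9968 bases at depth 0, 32 bases with some coverage.
example = ["#Coverage\tnumBases", "0\t9968", "1\t12", "250\t20"]
print(breadth_from_histogram(example))
```

For a real file you would pass an open file handle instead of the list, for example `breadth_from_histogram(open("histogram.txt"))`.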
It's also possible to generate coverage directly from BBMap, without an intermediate sam file, like this:
bbmap.sh in=reads.fq ref=reference.fasta nodisk covstats=stats.txt covhist=histogram.txt
We use this a lot in situations where all you care about is coverage distributions, which is somewhat common in metagenome assemblies. It also supports most of the flags that pileup.sh supports, though the syntax is slightly different to prevent collisions. In each case you can see all the possible flags by running the shellscript with no arguments.
P.S. I put some work into it last week and it is now over 3x as fast as it used to be, and it used to be pretty fast!
Last edited by Brian Bushnell; 01-29-2015, 06:52 PM.
-
Thanks Sergioo. Unfortunately, I don't have access to CLC Genomics Workbench. Could you point me to an alternative, if you know of one? Thanks!
-
Originally posted by morning latte:
Hello,
I have seen many ways to get the depth of reads but haven't found a way to get the coverage of genome length (breadth or width). Could anyone advise on this? Thanks.
If you have CLC Genomics Workbench available, you can generate what they call a "detailed mapping report" for your reads mapped against the reference genome. It will show the fraction of the genome covered by your reads.
Hope it helps
Cheers
-
Length of genome covered by reads by mapping
Hello,
I have generated SAM and BAM files after mapping my Illumina reads to a reference genome. Now I want to know how much of the reference genome is covered (aligned/mapped) by reads (e.g. 50% of the reference genome is covered by reads). I have seen many ways to get the depth of reads but haven't found a way to get the coverage of genome length (breadth or width). Could anyone advise on this? Thanks.
Last edited by morning latte; 01-29-2015, 05:46 PM.