  • morning latte
    replied
    Dear Brian Bushnell,

    Thanks a lot for the detailed explanation on this. Everything is now very clear.


  • Brian Bushnell
    replied
    Wow, you had very uneven coverage.

    "Percent scaffolds with any coverage" means just that. Suppose you had a human reference genome, which has 25 sequences: chromosomes 1-22, X, Y, and M.

    In that case, if each of those 25 sequences had at least one read hit, the percentage of scaffolds with coverage would be 100%. The per-scaffold coverage file gives more detail, showing what percent of each scaffold was covered. In general, for a complete genome, "scaffold" means "chromosome".

    0.32% is the percent of bases across the entire genome that had any coverage; consult the histogram for more detail. In other words, 99.68% (100% - 0.32%) of the genome had zero coverage. I assume this was a ChIP-seq experiment or similar, where the expectation is that 99.9% of the coverage falls on 0.1% of the genome.
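
    To make the difference between the two percentages concrete, here is a minimal Python sketch. It is not BBMap's implementation; the function name `coverage_summary` and the input layout (scaffold name mapped to a list of per-base depths) are hypothetical, and it assumes the average is taken over all reference bases, covered or not.

```python
from typing import Dict, List


def coverage_summary(depth_by_scaffold: Dict[str, List[int]]) -> Dict[str, float]:
    """Summarise per-base depths (hypothetical input: scaffold name -> depth at each base)."""
    total_bases = sum(len(d) for d in depth_by_scaffold.values())
    covered_bases = sum(1 for d in depth_by_scaffold.values() for x in d if x > 0)
    total_depth = sum(sum(d) for d in depth_by_scaffold.values())
    # A scaffold counts as "hit" if at least one of its bases has nonzero depth.
    scaffolds_hit = sum(1 for d in depth_by_scaffold.values() if any(x > 0 for x in d))
    return {
        "average_coverage": total_depth / total_bases,
        "percent_scaffolds_with_any_coverage": 100.0 * scaffolds_hit / len(depth_by_scaffold),
        "percent_of_reference_bases_covered": 100.0 * covered_bases / total_bases,
    }


# Toy reference: three scaffolds of 1000 bases, each with one 2-base peak of depth 500.
toy = {name: [0] * 998 + [500, 500] for name in ("chr1", "chr2", "chr3")}
summary = coverage_summary(toy)
print(summary)

# Rough check on the numbers quoted in the thread (assumes a genome-wide average):
# average depth 9.75 with only 0.32% of bases covered implies a much higher
# mean depth over the covered bases alone.
implied_depth_on_covered = 9.75 / (0.32 / 100.0)
print(round(implied_depth_on_covered))
```

    In the toy data every scaffold has a peak, so 100% of scaffolds have coverage even though only 0.2% of bases do. Under the same assumption, the figures reported in the thread would imply a mean depth of roughly 3000x over the covered bases.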


  • morning latte
    replied
    Dear Brian Bushnell,

    Thanks a lot for the suggestion. I just ran BBMap on one of my SAM files, and the summary output looks like this:

    Average coverage: 9.75
    Percent scaffolds with any coverage: 100.00
    Percent of reference bases covered: 0.32

    I guess this means only 0.32% of the reference genome was covered by reads at any depth. Then what does "Percent scaffolds with any coverage" mean? Thanks in advance for your help.


  • Brian Bushnell
    replied
    And there's also...

    The BBMap suite's pileup program! It takes sam or bam, sorted or unsorted.

    pileup.sh in=mapped.sam out=stats.txt hist=histogram.txt

    stats.txt will contain the average depth and percent covered for each reference sequence; the histogram will contain the exact number of bases at each coverage level. You can also get per-base or binned coverage if you want to plot it. It also reports the median, standard deviation, and so forth.
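
    As an illustration of how a coverage histogram yields breadth, mean, median, and standard deviation, here is a minimal Python sketch. The file layout it parses (a '#'-prefixed header followed by tab-separated depth and base-count columns) is an assumption, not something stated in the thread, so check your own histogram file first. The function names are hypothetical.

```python
import io
import math
from typing import Dict, TextIO


def read_histogram(handle: TextIO) -> Dict[int, int]:
    """Parse 'depth<TAB>count' lines, skipping blank lines and '#' headers (assumed layout)."""
    hist: Dict[int, int] = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        depth, count = line.split("\t")[:2]
        hist[int(depth)] = hist.get(int(depth), 0) + int(count)
    return hist


def histogram_stats(hist: Dict[int, int]) -> Dict[str, float]:
    """Derive breadth and depth statistics from a depth -> number-of-bases mapping."""
    total = sum(hist.values())
    covered = total - hist.get(0, 0)  # bases with depth >= 1
    mean = sum(d * n for d, n in hist.items()) / total
    # Population variance, weighting each depth by the number of bases at that depth.
    variance = sum(n * (d - mean) ** 2 for d, n in hist.items()) / total
    # Median: smallest depth at which the cumulative base count reaches half the total.
    running, median = 0, 0
    for depth in sorted(hist):
        running += hist[depth]
        if running * 2 >= total:
            median = depth
            break
    return {
        "percent_covered": 100.0 * covered / total,
        "mean_depth": mean,
        "median_depth": float(median),
        "std_dev": math.sqrt(variance),
    }


# Made-up histogram for a 100-base reference: 90 bases uncovered, 10 covered.
example = io.StringIO("#Coverage\tnumBases\n0\t90\n1\t4\n2\t4\n10\t2\n")
stats = histogram_stats(read_histogram(example))
print(stats)
```

    With a real file, `read_histogram(open("histogram.txt"))` would replace the in-memory example, provided the layout matches the assumption above.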

    It's also possible to generate coverage directly from BBMap, without an intermediate sam file, like this:

    bbmap.sh in=reads.fq ref=reference.fasta nodisk covstats=stats.txt covhist=histogram.txt

    We use this a lot when all you care about is coverage distributions, which is fairly common with metagenome assemblies. It also supports most of the flags that pileup.sh supports, though the syntax is slightly different to prevent collisions. In either case you can see all available flags by running the shell script with no arguments.

    P.S. I put some work into it last week and it is now over 3x as fast as it used to be, and it used to be pretty fast!
    Last edited by Brian Bushnell; 01-29-2015, 06:52 PM.


  • GenoMax
    replied
    Qualimap generates detailed stats for BAM files.


  • morning latte
    replied
    Thanks, Sergioo. Unfortunately, I don't have access to CLC Genomics Workbench. Could you point me to an alternative if you know of one? Thanks!


  • Sergioo
    replied
    Originally posted by morning latte
    Hello,

    I have seen many ways to get the depth of reads but haven't found a way to get the coverage of genome length (breadth or width). Could anyone offer advice on this? Thanks.

    Hi,
    If you have CLC Genomics Workbench available, you can generate what they call a "detailed mapping report" for your reads mapped against the reference genome. It will show the fraction of the genome covered by your reads.
    Hope it helps.
    Cheers


  • morning latte
    started a topic Length of genome covered by reads by mapping

    Hello,

    I have generated SAM and BAM files after mapping my Illumina reads to a reference genome. Now I want to know how much of the reference genome is covered (aligned/mapped) by reads (e.g. 50% of the reference genome is covered by reads). I have seen many ways to get the depth of reads but haven't found a way to get the coverage of genome length (breadth or width). Could anyone offer advice on this? Thanks.
    Last edited by morning latte; 01-29-2015, 05:46 PM.
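
    The quantity asked about here, breadth of coverage, is the fraction of reference positions touched by at least one alignment. A minimal Python sketch of the idea follows, assuming the alignments have already been reduced to half-open (start, end) intervals on a single reference sequence. The function names are hypothetical, and the sketch ignores gaps inside alignments, which a real tool would handle via the CIGAR string.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one reference sequence


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals so each covered base is counted once."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or abuts) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def breadth_of_coverage(intervals: Iterable[Interval], reference_length: int) -> float:
    """Fraction of the reference covered by at least one interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / reference_length


# Three made-up alignments on a 1000-base reference; the first two overlap.
alignments = [(0, 100), (50, 150), (300, 400)]
merged = merge_intervals(alignments)
breadth = breadth_of_coverage(alignments, 1000)
print(merged, breadth)
```

    Depth, by contrast, counts how many intervals overlap each position, which is why the two measures can diverge so sharply.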
