  • nk
    Member
    • Apr 2012
    • 11

    Calculating max coverage per gene from BAM + GFF

    I have a BAM file with alignments and a GFF file with non-overlapping gene annotation. Using these, I would like to find out what the highest coverage for each gene was.

    In other words, I am looking for something like HTSeq-count (http://www-huber.embl.de/users/ander...doc/count.html), except I want the max coverage of reads to be returned instead of the total count of reads.

    Is there any tool that can do something like this or am I stuck writing my own script?
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    Do you mean that you want the maximum per-base coverage in a given gene? I'm not familiar with any tool specifically for that, but the simplest route to making one would probably involve parsing the output from "samtools mpileup", which can be told to only output regions of interest.
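A minimal sketch of the parsing step suggested above, assuming the tab-separated text output of "samtools mpileup" (chromosome, 1-based position, reference base, depth, then further columns that are ignored here). The function name `max_depth_per_gene` and the `(name, chrom, start, end)` gene tuples are illustrative assumptions, not part of any existing tool; gene coordinates are taken to be 1-based and inclusive, as in GFF, and non-overlapping, as in the original question.

```python
import bisect
from collections import defaultdict


def max_depth_per_gene(pileup_lines, genes):
    """Return {gene_name: max per-base depth} from mpileup-style text lines.

    genes: iterable of (name, chrom, start, end), 1-based inclusive and
    non-overlapping (hypothetical input shape, e.g. parsed from a GFF).
    """
    genes = list(genes)

    # Group gene intervals by chromosome and sort them by start position.
    by_chrom = defaultdict(list)
    for name, chrom, start, end in genes:
        by_chrom[chrom].append((start, end, name))
    for intervals in by_chrom.values():
        intervals.sort()
    starts = {c: [iv[0] for iv in ivs] for c, ivs in by_chrom.items()}

    # Genes with no pileup lines at all keep a maximum of 0.
    best = {name: 0 for name, _, _, _ in genes}

    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, pos, depth = fields[0], int(fields[1]), int(fields[3])
        if chrom not in by_chrom:
            continue
        # Find the last gene on this chromosome starting at or before pos.
        i = bisect.bisect_right(starts[chrom], pos) - 1
        if i >= 0:
            start, end, name = by_chrom[chrom][i]
            if pos <= end and depth > best[name]:
                best[name] = depth
    return best
```

In practice the lines could be read from the standard output of "samtools mpileup" restricted to the gene regions, with the gene tuples parsed from the GFF file.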


    • nk
      Member
      • Apr 2012
      • 11

      #3
      Yes, exactly. Thanks for the suggestion to use mpileup - isn't that usually just used for SNP calling though?


      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        Yeah, but it also gives the per-base depth (the rest can just be ignored). The benefit of this is that it makes filtering by MAPQ and base phred score simple, since samtools will do that part for you. Just make sure to adjust the -d parameter to a large value!
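As a rough illustration of the options mentioned above, the sketch below assembles a "samtools mpileup" command line with a MAPQ threshold (-q), a base quality threshold (-Q), a raised depth cap (-d), and a regions file (-l). The helper name `build_mpileup_command`, the file names, and the threshold values are all assumptions chosen for illustration; pick values that suit your data.

```python
import subprocess


def build_mpileup_command(bam_path, regions_path, min_mapq=20,
                          min_baseq=20, max_depth=1_000_000):
    """Build the argument list for a filtered samtools mpileup run.

    All default values here are illustrative assumptions, not recommendations.
    """
    return [
        "samtools", "mpileup",
        "-l", regions_path,     # restrict output to listed positions/regions
        "-q", str(min_mapq),    # skip alignments with MAPQ below this value
        "-Q", str(min_baseq),   # skip bases with quality below this value
        "-d", str(max_depth),   # raise the per-position depth cap
        bam_path,
    ]


def run_mpileup(bam_path, regions_path):
    """Run samtools (must be installed) and return its output lines."""
    cmd = build_mpileup_command(bam_path, regions_path)
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.splitlines()
```

The depth in the fourth output column can then be tracked per gene. Note that the regions file would need to be in a format mpileup accepts (a positions list or BED), so a GFF would have to be converted first.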


        • nk
          Member
          • Apr 2012
          • 11

          #5
          In case anybody ever stumbles upon this - my solution ended up being to use bedtools genomecov to calculate the per-base genome coverage, bedtools intersect to overlap the genome coverage with my gff file and then awk to find the max coverage per feature.
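The exact commands were not posted, so the following is only a sketch of the final "max per feature" step, written in Python rather than awk. It assumes the intersect output has the nine GFF columns first, followed by four bedGraph-style coverage columns (chromosome, start, end, depth), which is what one would get from intersecting the GFF against "bedtools genomecov -bg" output with "-wa -wb". The function name `max_coverage_per_feature` and the column layout are assumptions.

```python
def max_coverage_per_feature(intersect_lines, n_gff_cols=9):
    """Return {GFF attributes string: max depth} from intersect output lines.

    Assumed layout per line: n_gff_cols GFF columns, then
    chrom, start, end, depth from the coverage file (depth is last).
    """
    best = {}
    for line in intersect_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < n_gff_cols + 4:
            continue  # skip blank or unexpected lines
        feature = fields[n_gff_cols - 1]  # GFF attributes column as the key
        depth = float(fields[-1])         # float also covers scaled coverage
        if feature not in best or depth > best[feature]:
            best[feature] = depth
    return best
```

Keying on the full attributes column keeps the sketch independent of any particular GFF attribute convention; extracting a gene ID from that string would be a natural refinement.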
