  • count duplicates in bam files

    Hi,

    I would like to count duplicates in bam files. I am comparing two mapping tools (CLC and BWA) by looking at their total read counts, which I expected to be the same.
    My assumption is that BWA-MEM counts duplicates.

    What I did:

    1) convert sam -> bam file
    2) sort bam file
    3) use the Picard tool MarkDuplicates.jar
    4) use the BuildBamIndex.jar

    Mapped Reads (CLC) 6,876,285
    Mapped Reads (BWA) 6,375,889
    Unmapped Reads (CLC) 231,927
    Unmapped Reads (BWA) 736,730
    Total count (CLC) 7,108,212
    Total count (BWA) 7,112,619

    What would be the next step? I've tried to use the .bai files ... but do they have information about the number of duplicates?

    Do you have any suggestions?

    Best,
    Flo

  • #2
    If you are just looking to get the total number of duplicates you could run Qualimap on your bam files. You will get all sorts of additional info as well.

    This thread has some other suggestions: http://seqanswers.com/forums/showthread.php?t=23493
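
As a rough alternative for getting the total directly: once MarkDuplicates has run, each duplicate record carries the 0x400 bit in its SAM FLAG field, so the records can simply be counted. The sketch below assumes SAM text is piped in (for example from `samtools view`); the function name and the file names in the usage note are made up for illustration.

```python
import sys

DUPLICATE_FLAG = 0x400  # SAM FLAG bit: "PCR or optical duplicate"


def count_duplicates(sam_lines):
    """Return (total_records, duplicate_records) for an iterable of SAM text lines.

    Header lines (starting with '@') and blank lines are skipped.
    Note: this counts alignment records, not reads. Secondary and
    supplementary alignments are separate records, so totals can
    exceed the number of sequenced reads.
    """
    total = 0
    duplicates = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            duplicates += 1
    return total, duplicates


if __name__ == "__main__":
    total, dups = count_duplicates(sys.stdin)
    print(f"records: {total}, flagged as duplicate: {dups}")
```

Possible usage, with hypothetical file names: `samtools view marked.bam | python count_dups.py`. `samtools flagstat` also prints a duplicates line if only the number is needed.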

    • #3
      Doesn't the metrics file produced by MarkDuplicates.jar already contain the duplicate counts?
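
A minimal sketch of pulling the counts out of that metrics file, assuming the usual layout: a `## METRICS CLASS` line, then a tab-separated header row, then one row per library, ended by a blank line. The column names used here (`UNPAIRED_READ_DUPLICATES`, `READ_PAIR_DUPLICATES`) are the ones I recall from Picard's DuplicationMetrics; check them against the header of your own file. The function names are made up for illustration.

```python
def read_duplication_metrics(lines):
    """Parse the metrics table from MarkDuplicates metrics text.

    Returns a list of dicts, one per library row, keyed by column name.
    """
    rows = []
    header = None
    in_metrics = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("## METRICS CLASS"):
            in_metrics = True
            continue
        if not in_metrics:
            continue
        if not line.strip() or line.startswith("#"):
            if header is not None:
                break  # blank line or next section ends the table
            continue
        cells = line.split("\t")
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def duplicate_reads(row):
    """Duplicate reads for one library: unpaired duplicates plus two reads per duplicate pair."""
    return int(row["UNPAIRED_READ_DUPLICATES"]) + 2 * int(row["READ_PAIR_DUPLICATES"])
```

Possible usage, with a hypothetical file name: `with open("metrics.txt") as fh: rows = read_duplication_metrics(fh)`, then call `duplicate_reads(row)` for each row.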

      • #4
        Doh! A file name is required for the metrics file, so @Flo89 must have provided one when running MarkDuplicates.

        • #5
          Hi guys,

          Thank you very much. I am not familiar with this mapping stuff. So, I guess it was an easy question for you guys.

          Best,
          Flo
