  • m_elena_bioinfo
    Member
    • Oct 2009
    • 99

    samtools rmdup PE

    Hi users,
    Does anyone know what the difference is between:
    >samtools rmdup
    and
    >samtools rmdup -S

    Both remove PCR duplicates from paired-end data, and the tutorial says the -S option is useful for treating PE reads as single-end data.

    I ran both commands. Starting from a 750 MB BAM file, I got a 650 MB BAM file without the -S option and a 350 MB BAM file with the -S option.

    Thanks a lot!
    ME
  • CowGirl
    Junior Member
    • Mar 2013
    • 9

    #2
    RmDup with the -S option

    I have a similar question - what exactly does "-S" do with paired end reads?


    • swbarnes2
      Senior Member
      • May 2008
      • 910

      #3
      Suppose you have one pair of reads where read 1 starts at position 100 and read 2 starts at position 200, and a second pair where read 1 starts at position 100 and read 2 starts at position 250. Those pairs came from different fragments of DNA. You can tell because the read 2 starts differ, even though the read 1 starts are the same.

      When the reads are treated as paired-end, none of them should be deleted as PCR duplicates.

      However, if you run rmdup -S, the software does not check whether read 2 has a different start coordinate, so one of those read 1 reads is treated as a duplicate and deleted.
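
To make the example above concrete, here is a minimal Python sketch of the two ways of keying duplicates. It is a toy model, not samtools' actual implementation: the `Pair` type and both function names are made up for illustration, and it keeps whichever pair it sees first, whereas the real tool also takes chromosome, strand, and mapping quality into account.

```python
from typing import List, NamedTuple


class Pair(NamedTuple):
    """Hypothetical stand-in for one read pair; not a samtools structure."""
    name: str
    read1_start: int
    read2_start: int


def dedup_paired(pairs: List[Pair]) -> List[Pair]:
    """Paired-end view: a duplicate must match on BOTH start coordinates."""
    seen = set()
    kept = []
    for p in pairs:
        key = (p.read1_start, p.read2_start)
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept


def dedup_read1_as_single(pairs: List[Pair]) -> List[Pair]:
    """Single-end view of read 1: the mate's position is never consulted."""
    seen = set()
    kept = []
    for p in pairs:
        if p.read1_start not in seen:
            seen.add(p.read1_start)
            kept.append(p)
    return kept


# The two pairs from the post: same read 1 start, different read 2 start.
pairs = [Pair("pairA", 100, 200), Pair("pairB", 100, 250)]

paired_kept = [p.name for p in dedup_paired(pairs)]
single_kept = [p.name for p in dedup_read1_as_single(pairs)]

print(paired_kept)  # ['pairA', 'pairB']
print(single_kept)  # ['pairA']
```

In the single-end view the second read 1 is dropped even though its mate shows it came from a different fragment, which would fit with the -S output being so much smaller than the default output.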


      • CowGirl
        Junior Member
        • Mar 2013
        • 9

        #4
        Hi swbarnes2, thanks for the reply. That makes more sense than any of the answers I've found so far!
