  • thiNGS
    Member
    • Sep 2014
    • 24

    Remove duplicate reads by 100% sequence identity and genomic coordinates

    Hi everyone. I have to analyze some paired-end reads from an Illumina MiSeq experiment. I want to remove duplicate reads that not only have the same start and end coordinates but also have 100% sequence identity. Is there any tool that can help me do that? I want to work with BAM files, not FASTQ files. Thanks!
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    If sequences have 100% identity, then they should map to the same coordinates, so there's no reason to work with BAM files in this case. I wrote a program that can do this for FASTQ, but not for BAM:

    dedupe.sh in=reads.fq out=deduped.fq ac=f t=1

    There should be tools that can do this on BAM files by sorting by sequence, but I don't know what they are offhand.
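For anyone who does want to stay with alignments, here is a minimal sketch of the idea in Python. It is not an existing tool, and `dedupe_sam` is a hypothetical name. It assumes the BAM has been converted to SAM text (for example with `samtools view -h`), that all lines fit in memory, and that a "duplicate" means every primary alignment of a read pair has the same reference name, start position, strand, CIGAR string, and sequence as an earlier pair. The CIGAR is included because, together with the start position, it fixes the end coordinate.

```python
def dedupe_sam(lines):
    """Return SAM lines with duplicate read pairs removed (first pair wins).

    Assumption: two pairs are duplicates when their primary alignments share
    reference name, start, strand, CIGAR, and SEQ. Which mate is read 1 or
    read 2 is deliberately ignored in this sketch.
    """
    headers = []
    templates = {}  # QNAME -> list of alignment lines, in order of first appearance

    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            headers.append(line)  # header lines are passed through unchanged
        else:
            qname = line.split("\t", 1)[0]
            templates.setdefault(qname, []).append(line)

    seen = set()
    kept = list(headers)
    for records in templates.values():
        key = []
        for rec in records:
            fields = rec.split("\t")
            flag = int(fields[1])
            if flag & 0x900:
                # secondary (0x100) or supplementary (0x800): not part of the key
                continue
            key.append((
                fields[2],          # RNAME: reference name
                int(fields[3]),     # POS: 1-based leftmost start
                bool(flag & 0x10),  # reverse-strand bit
                fields[5],          # CIGAR: fixes the end coordinate
                fields[9],          # SEQ: bases as stored in the SAM record
            ))
        key = tuple(sorted(key))
        if key in seen:
            continue  # same coordinates and same bases as an earlier pair
        seen.add(key)
        kept.extend(records)  # keep every line of this template, including non-primary
    return kept
```

The output groups each pair together in order of first appearance, so a coordinate-sorted input would need to be sorted again (for example with `samtools sort`) after conversion back to BAM.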
