Hi everyone. I have to analyze some paired-end reads from an Illumina MiSeq experiment. What I want to do is remove duplicate reads that not only have the same start and end coordinates but also have 100% sequence identity. Is there any tool that can help me do that? I want to work with BAM files, not FASTQ files. Thanks!
If sequences have 100% identity, they should also have the same mapping coordinates, so there's no reason to work with BAM files in this case. I wrote a program that can do this for FASTQ, but not for BAM:
dedupe.sh in=reads.fq out=deduped.fq ac=f t=1
There should be tools that can do this on BAM files by sorting by sequence, but I don't know what they are offhand.
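If no ready-made tool turns up, the idea can be sketched in a few lines of Python. This is only a rough illustration, not an existing program: the function names `template_key` and `dedupe_sam_lines` are made up for this post, it works on SAM text rather than on BAM directly, and it holds every record in memory, which I am assuming is acceptable for a MiSeq-sized run. A read pair is treated as a duplicate when both mates match an earlier pair in reference name, position, strand, CIGAR and sequence.

```python
from collections import defaultdict

# SAM flag bits (from the SAM specification)
REVERSE = 0x10
MATE_BITS = 0x40 | 0x80              # first / second read in the pair
NOT_PRIMARY = 0x100 | 0x800          # secondary or supplementary alignment


def template_key(records):
    """Build a hashable key from the primary alignments of one read pair."""
    parts = []
    for fields in records:
        flag = int(fields[1])
        if flag & NOT_PRIMARY:
            continue
        parts.append((
            flag & MATE_BITS,        # which mate this record is
            fields[2],               # reference name
            fields[3],               # leftmost mapping position
            bool(flag & REVERSE),    # strand
            fields[5],               # CIGAR, which fixes the end coordinate
            fields[9],               # sequence as stored in the record
        ))
    return tuple(sorted(parts))


def dedupe_sam_lines(lines):
    """Return header lines plus the records of the first pair seen per key."""
    header, order, by_name = [], [], defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            header.append(line)
            continue
        fields = line.split("\t")
        if fields[0] not in by_name:
            order.append(fields[0])  # remember first-seen order of read names
        by_name[fields[0]].append(fields)

    seen, out = set(), list(header)
    for name in order:
        key = template_key(by_name[name])
        if key in seen:
            continue                 # same coordinates and same sequences
        seen.add(key)
        out.extend("\t".join(f) for f in by_name[name])
    return out
```

To use it on a BAM file, one option is to feed it the output of `samtools view -h` and convert the result back with `samtools view -b`. Records come out grouped by read name, so the file would need to be coordinate-sorted again afterwards.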