Hi,
Question: What is the best way to remove reads that map to multiple locations?
What I have done so far:
I have a 76x2 paired end data which I mapped to hg18 using BWA.
I want to get rid of reads that map to multiple locations, so I was
told to use the cutoff of -q 1 in samtools view;
however, to my surprise, I have reads with non-zero quality but appear to map to alternative locations with the same score.
For example, this read maps to chr1 and also to chrM
XA tag is the BWA optional field and if I understand correctly it means that this sequence also mapped to chrM with 100% matches;
so it appears that it is not sufficient to use the -q 1 as the flag to remove multiple hits.
Can someone tell me if I am correct; and also what is the best way to get rid of reads that align non-uniquely on the genome.
thanks
Christoph
Question: What is the best way to remove reads that map to multiple locations?
What I have done so far:
I have a 76x2 paired end data which I mapped to hg18 using BWA.
I want to get rid of reads that map to multiple locations, so I was
told to use the cutoff of -q 1 in samtools view;
however, to my surprise, I have reads with non-zero quality but appear to map to alternative locations with the same score.
For example, this read maps to chr1 and also to chrM
Code:
PF_Y_612GW_I581:100617:1:21:8990:3211#0 89 chr1 557906 23 76M * 0 0 CCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGC HHHHDHH>HBHHHHHHGHHHH>HGHHHHHHHHGFHHHHHFHGGDGGDHHHHFHFHHHHHHHHFHHHEHHHGGDGGG XT:A:U NM:i:0 SM:i:23 AM:i:0 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:76 XA:Z:chrM,-7493,76M,1;
so it appears that it is not sufficient to use the -q 1 as the flag to remove multiple hits.
Can someone tell me if I am correct; and also what is the best way to get rid of reads that align non-uniquely on the genome.
thanks
Christoph
Comment