Hi All,
I am using Bowtie 1 (version 1.0.0 for Mac OS X).
To discard unwanted reads, I map my reads against several reference sequences that I want to remove.
My problem is that Bowtie reports fewer aligned reads when I use more reference sequences.
To be specific:
I want to discard reads matching 21 sequences in total. They fall into three groups of 7 sequences each.
Group A: A1, A2, A3, A4, A5, A6, A7 -> similarity: 53%~99%, seq length: 1550 nt
Group B: B1, B2, B3, B4, B5, B6, B7 -> similarity: 49%~99%, seq length: 2900 nt
Group C: C1, C2, C3, C4, C5, C6, C7 -> similarity: 51%~99%, seq length: 120 nt
====> The major targets are A1 and B1.
Using only the two major sequences, A1 and B1, I built an index and then ran Bowtie 1.
The log file reports that:
10.00% of reads were reported as aligned,
0.01% of reads were reported as suppressed, and
89.99% of reads were reported as failed to align.
After that, I repeated the same process with all 21 sequences: I built an index and ran Bowtie 1.
I expected this run to give more aligned reads than the run with only A1 and B1. However, that was completely wrong!
The log file for the 21-sequence run reports that:
0.20% of reads were reported as aligned,
11.00% of reads were reported as suppressed, and
88.80% of reads were reported as failed to align.
I cannot understand why using more reference sequences gives fewer aligned reads.
It should give at least as many aligned reads as the run with only A1 and B1.
At least the percentages of reads that failed to align are similar in the two runs.
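As a quick sanity check on the two logs, here is a small Python sketch that adds up the percentages quoted above. The dictionary and function names are only for illustration. It suggests that aligned plus suppressed reads together are slightly higher in the 21-sequence run, so the reads seem to have moved from "aligned" to "suppressed" rather than being lost:

```python
# Percentages copied from the two Bowtie log summaries quoted above.
run_two_refs = {"aligned": 10.00, "suppressed": 0.01, "failed": 89.99}
run_all_refs = {"aligned": 0.20, "suppressed": 11.00, "failed": 88.80}


def hit_percent(run):
    """Percent of reads that were either reported as aligned or suppressed."""
    return round(run["aligned"] + run["suppressed"], 2)


for name, run in [("A1+B1", run_two_refs), ("all 21", run_all_refs)]:
    total = round(sum(run.values()), 2)
    print(f"{name}: aligned+suppressed = {hit_percent(run):.2f}%, total = {total:.2f}%")
```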
I used the following options:
bowtie `INDEX` -5 1 -n 0 -n 0 -k 1 -m 1 -l 20 --best --phred33-quals --un `UNMAPPED` -q `INPUT` -S `OUT` 2>> `LOG` -t
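For reference, the Bowtie 1 manual describes `-m <int>` as suppressing all alignments for a read when more than `<int>` reportable alignments exist. Below is a toy Python sketch of that rule only, not of Bowtie itself. It assumes exact substring matching on one strand in place of real alignment, and the sequences and function names are made up for illustration. It shows how adding a near-identical reference could move a read from "aligned" to "suppressed":

```python
def count_hits(read, seq):
    """Count exact (possibly overlapping) occurrences of read in seq."""
    count = 0
    start = seq.find(read)
    while start != -1:
        count += 1
        start = seq.find(read, start + 1)
    return count


def classify(read, references, m=1):
    """Toy version of the -m rule: more than m hits means suppressed."""
    hits = sum(count_hits(read, seq) for seq in references.values())
    if hits == 0:
        return "failed"
    if hits > m:
        return "suppressed"
    return "aligned"


# Hypothetical sequences: A2 differs from A1 by a single base near the end.
refs_small = {"A1": "ACGTACGTTTGACCA"}
refs_large = {"A1": "ACGTACGTTTGACCA", "A2": "ACGTACGTTTGACGA"}

read = "ACGTACGT"
print(classify(read, refs_small))  # one hit in total
print(classify(read, refs_large))  # one hit in each reference
```

If this is what happens with my data, rerunning the 21-sequence index without `-m 1` would be one way to check it.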
Thank you!
Jiyoung