Hi, all.
I am using bowtie to map color-space reads.
While checking the alignments, I found that some read sequences reported in the output file do not match the original read in base space.
For example, read A,
Code:
T010130203200100320123211 (color-space)
TGGTAAGGCTTTGGGCTTGATCAC (base-space)
was mapped as below:
Code:
>bowtie -v 2 -m 1 --best -t -p 10 -y -C hsa_miRBase17_hairpin_c -c T010130203200100320123211
0 + hsa-mir-191 16 AACGGAATCCCAAATCCAGCTG qqqqqqqqqqqqqqqqqqqqqq 0 14:A>T,15:G>C
Note that the reported read sequence is different from the original read in base space.
So I checked the bowtie index files.
Code:
>bowtie-inspect -e hsa_miRBase17_hairpin_c | grep -A 1 hsa-mir-191
>hsa-mir-191
TATGCAGCCGTTAATCACTAGATGAACAAAGTCGTGCCACCGGGACGGGTCTAGACGTGCTTTGACAGTAAGTCGAAAGCTGGGGAGCTAG
Assuming the above sequence was double-encoded, I decoded it back and looked at where the read was mapped.
Code:
TATGCAGCCGTTAATCACTAGATGAACAAAGTCGTGCCACCGGGACGGGTCTAGACGTGCTTTGACAGTAAGTCGAAAGCTGGGGAGCTAG
3032102112330031013020320010002312321101122201222313020123213332010230023120002132222021302
CGGCTGGACAGCGGGCAACGGAATCCCAAAAGCAGCTGTTGTCTCCAGAGCATTCCAGCTGCGCTTGGATTTCGTCCCCTGCTCTCCTGCCT
T010130203200100320123211
(I attached 'C' at the beginning so that decoding starts correctly.)
Ignoring the first color, '10130203200100320123211' matches the reference in color space within two mismatches (-v 2). But apparently this is wrong.
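For anyone who wants to reproduce the decoding I did by hand, here is a minimal Python sketch. It assumes the usual SOLiD two-bit code (A=0, C=1, G=2, T=3, with each color being the XOR of two adjacent bases) and that bowtie-inspect -e prints the colors double-encoded as A/C/G/T. The function names are my own and are not part of bowtie, and the color offset 15 is simply where the read lined up when I compared the strings by eye.

```python
BASES = "ACGT"  # assumed two-bit code: A=0, C=1, G=2, T=3


def double_encoded_to_colors(seq):
    """Map a double-encoded string (A/C/G/T standing for colors 0-3) to digits."""
    return "".join(str(BASES.index(ch)) for ch in seq)


def decode_colors(start_base, colors):
    """Decode color digits into bases; start_base itself is not emitted."""
    prev = BASES.index(start_base)
    out = []
    for c in colors:
        prev ^= int(c)  # next base = previous base XOR color
        out.append(BASES[prev])
    return "".join(out)


def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


read = "T010130203200100320123211"
print(decode_colors(read[0], read[1:]))  # TGGTAAGGCTTTGGGCTTGATCAC

# Sequence printed by bowtie-inspect -e for hsa-mir-191 (double-encoded colors).
ref_de = ("TATGCAGCCGTTAATCACTAGATGAACAAAGTCGTGCCACCGGGACGGGTCTAGACGTGCTTT"
          "GACAGTAAGTCGAAAGCTGGGGAGCTAG")
ref_colors = double_encoded_to_colors(ref_de)
print(ref_colors)
print("C" + decode_colors("C", ref_colors))  # 'C' prepended, as in my post

# Drop the primer base and the first color, then compare in color space.
trimmed = read[2:]
offset = 15  # assumption: the offset found by lining the strings up by hand
window = ref_colors[offset:offset + len(trimmed)]
print(count_mismatches(trimmed, window))  # 2
```

This only checks the color-space comparison; it does not try to reproduce how bowtie itself decodes the aligned colors back into the base-space sequence it reports.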
I tested with other reads, and some reads were mapped correctly and some were not. Did I get something wrong, or is this a bug in bowtie?
Thanks in advance!