Dear all,
I have a question regarding how Bowtie 2 handles the read quality scores in FASTQ files.
If I understand correctly, a mismatch between a read and the genome is penalized severely if the read base has a high quality score, and only mildly if the base has a poor quality score (as stated in the FASTQ file). However, in Bowtie 2 the threshold for reporting a hit depends only on the length L of the read (threshold = L * a + b), not on its qualities. Doesn't this lead to low-quality reads being reported preferentially over high-quality reads, since the same number of mismatches costs them less?
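To make the question concrete, here is a small Python sketch of how I think the scoring works. It is my own illustration, not Bowtie 2 code: the function names are made up, and the numbers are assumptions taken from my reading of the manual (mismatch penalty between MN = 2 and MX = 6 scaled by the Phred quality Q, and a = b = -0.6 for end-to-end mode). It ignores gaps and Ns entirely.

```python
import math

def mismatch_penalty(q, mx=6, mn=2):
    # Assumed formula from the manual's --mp description:
    # MN + floor((MX - MN) * min(Q, 40) / 40)
    return mn + math.floor((mx - mn) * (min(q, 40.0) / 40.0))

def min_score(read_len, a=-0.6, b=-0.6):
    # Threshold = L * a + b; a and b are assumed end-to-end defaults.
    return read_len * a + b

def is_reported(mismatch_quals, read_len):
    # Score of an ungapped alignment: 0 minus one penalty per mismatch.
    score = -sum(mismatch_penalty(q) for q in mismatch_quals)
    return score >= min_score(read_len)

read_len = 50
high_quality = [40] * 6  # six mismatches, all at Q40
low_quality = [2] * 6    # six mismatches, all at Q2

print(is_reported(high_quality, read_len))  # False: score -36 is below -30.6
print(is_reported(low_quality, read_len))   # True: score -12 is above -30.6
```

So with the same six mismatches, the Q2 read passes the threshold while the Q40 read does not, which is the behaviour I am asking about.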
Many thanks for your insight!