Hi all,
I'm working on a project involving SNP detection for various candidate genes. I started by using Maq to align my reads with a mismatch value of 3. For some reason, I've only been able to align around 20% of the reads. I'm wondering if this has something to do with:
1) The format of the quality scores. I'm assuming they are something like Solexa, but I want to double check that this is what Maq "wants". Here is an example read before FASTQ conversion:
@7:1:22:390#0/1
AAATACGATGTAGAAACCACATATTTTGAAACAATATGCAACAACAAACTGTGAATTAAATCAACGCATATGAAA
+7:1:22:390#0/1
ab``\HRZ`\T]`]]aOR_[]][^Q_SZP][YP\W\VWU\LS\Y]_]UUPVNMN]TQYY\MHN^`BBBBBBBBBB
2) The actual quality of the reads. A quick glance at my reads file shows a lot of Bs on the 3' end. I joined this project after the reads were obtained, so I know little about the sample quality. Is there a way to determine average quality?
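One rough way to look at average quality is to compute the mean score at each read position. The sketch below is only an illustration under stated assumptions: the file has exactly four lines per record (no wrapped sequences), and the quality characters use an ASCII offset of 64, which is a guess based on the characters in the example read. The function name `mean_quality_by_position` and the `offset` parameter are made up for this example.

```python
from io import StringIO


def mean_quality_by_position(handle, offset=64):
    """Return the mean quality score at each read position.

    Assumes 4-line FASTQ records and that a quality character's score
    is ord(char) - offset (offset=64 is an assumption, not a given).
    """
    totals = []  # sum of scores seen at each position
    counts = []  # number of reads that reach each position
    for line_number, line in enumerate(handle):
        if line_number % 4 != 3:  # the quality string is the 4th line of a record
            continue
        for pos, char in enumerate(line.rstrip("\n")):
            if pos == len(totals):
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [total / count for total, count in zip(totals, counts)]


# Two made-up records, used only to show the call.
toy_fastq = StringIO("@r1\nACGT\n+\nhhBB\n@r2\nACGT\n+\nhfBB\n")
toy_means = mean_quality_by_position(toy_fastq)
print(toy_means)
```

On a real file, the same function could be called as `mean_quality_by_position(open("reads.fastq"))`, where `reads.fastq` is a placeholder name. A sharp drop in the per-position mean toward the 3' end would support the impression that the read tails are poor.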
Overall, I estimate close to 24x coverage. I know Maq should be able to deal with low quality scores, but I'm wondering if this is one of those special cases requiring trimming. Short of acquiring new reads, is there an appropriate way to approach this issue?
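If trimming turns out to be worth trying, one simple option is to cut the trailing run of B characters off each read before alignment. This is a minimal sketch, not a recommended pipeline: it assumes a trailing run of B marks an unreliable read tail, and the helper name `trim_trailing_b` and its `marker` parameter are hypothetical.

```python
def trim_trailing_b(seq, qual, marker="B"):
    """Drop the trailing run of `marker` quality characters from a read.

    Only the run at the 3' end is removed; a marker character in the
    middle of the read is left alone. Sequence and quality strings are
    cut to the same length.
    """
    keep = len(qual.rstrip(marker))
    return seq[:keep], qual[:keep]


# Made-up read, used only to show the call.
trimmed_seq, trimmed_qual = trim_trailing_b("ACGTACGT", "hhhhhBBB")
print(trimmed_seq, trimmed_qual)
```

Reads that are B over their whole length come back empty, so they would need to be dropped rather than passed to the aligner, and a minimum length after trimming may also be worth enforcing.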