Hi all,
We recently got SOLiD whole-exome data, and I tried to align it with bwa and do the SNP calling with GATK.
After resolving the CS/CQ-tag issue with a Python script that adapts the BAM file for analysis with GATK, I have now run into another problem:
Alignment worked well and SNP calling did not show any error messages, but there seems to be some trouble with the Phred quality of the bases. GATK reported some SNPs with AF=1.00 (so homozygous) which were obviously heterozygous. IGV shows the Phred score for each position, and it seems that every base which matches the reference sequence has Phred score 0, whereas the SNP bases have a relatively high Phred score (>25).
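To check this beyond eyeballing IGV, the per-base qualities could be tallied separately for matching and mismatching bases. Below is a minimal sketch, assuming an ungapped alignment (no indels or clipping) and Phred+33 quality strings as stored in SAM. The function name `phred_by_match` and the toy read are hypothetical and only illustrate the pattern I am seeing:

```python
def phred_by_match(read_seq, ref_seq, qual_str, offset=33):
    """Split per-base Phred scores into reference-matching and mismatching bases.

    Assumes read and reference are already aligned base-for-base (no gaps)
    and that qual_str uses the Phred+33 ASCII encoding found in SAM/BAM text.
    """
    if not (len(read_seq) == len(ref_seq) == len(qual_str)):
        raise ValueError("read, reference and quality string must have equal length")

    match_q, mismatch_q = [], []
    for base, ref, q in zip(read_seq, ref_seq, qual_str):
        score = ord(q) - offset  # decode ASCII character to Phred score
        if base.upper() == ref.upper():
            match_q.append(score)
        else:
            mismatch_q.append(score)
    return match_q, mismatch_q


# Toy read (hypothetical data): one mismatch at the fourth base.
matches, mismatches = phred_by_match("ACGTA", "ACGAA", "!!!;!")
print("matching bases:", matches)
print("mismatching bases:", mismatches)
```

If the matching list is all zeros while the mismatching list holds high scores across many real reads, that would confirm what IGV shows.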
Is it possible that bwa changes the base quality scores when converting the csfasta and qual files to the bwa double-encoded fastq files? Or is there a special option in GATK for SOLiD base-call qualities? Am I doing something completely wrong?
Any help is appreciated...