I am analyzing a targeted enrichment of a HapMap blend: four HapMap samples mixed in the ratio 74:20:5:1.
I am looking to detect variants at low frequencies (down to 0.5%).
I am using GATK for the SNP analysis. I carried out sample-level local realignment around indels and base quality score recalibration.
With Unified Genotyper, I am getting a very good expected:actual ratio for SNPs above 10% expected frequency. However, I am not picking up any of the SNPs in the 0.5 to 10% range. I do see a lot of these SNPs in CLC bio Genomics Workbench, and the CLC bio SNP caller picks them up (along with hundreds of false positives). I understand that many of them may be poor quality or may have been rejected by GATK due to strand bias or other problems. But the fact that not one of the 20 expected calls was picked up makes me wonder if I am doing something wrong.
The coverage at these positions is very high (~2000x). Sequencing was done on Illumina. I used the default quality score settings in GATK. Is there any parameter I can adjust to increase sensitivity on this particular set and pick up these low-frequency SNPs?
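To put rough numbers on the frequencies above, here is a minimal Python sketch of the expected pooled allele fraction and supporting read count for a variant private to one sample in the blend. It assumes diploid samples, reads drawn in proportion to the 74:20:5:1 mix, and a flat 2000x depth. The function names and the `MIN_ALT_READS` threshold are hypothetical and for illustration only; they are not GATK or CLC settings.

```python
from math import comb

# Blend proportions from the post (74:20:5:1), as fractions of the pool.
BLEND = [0.74, 0.20, 0.05, 0.01]
COVERAGE = 2000      # approximate depth stated in the post
MIN_ALT_READS = 5    # hypothetical evidence threshold, illustration only


def expected_allele_fraction(sample_fraction, alt_copies):
    """Pooled allele fraction of a variant private to one diploid sample.

    alt_copies is 1 for a heterozygous site and 2 for a homozygous one.
    """
    return sample_fraction * alt_copies / 2


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), ignoring sequencing error."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


for frac in BLEND:
    for copies, label in ((1, "het"), (2, "hom")):
        af = expected_allele_fraction(frac, copies)
        reads = af * COVERAGE
        p_seen = prob_at_least(MIN_ALT_READS, COVERAGE, af)
        print(f"sample {frac:.0%} {label}: AF={af:.3%}, "
              f"~{reads:.0f} alt reads, P(>= {MIN_ALT_READS} reads)={p_seen:.3f}")
```

Under these assumptions, a heterozygous SNP private to the 1% sample sits at 0.5% allele fraction, which corresponds to about 10 supporting reads at 2000x.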
Any help would be appreciated.
Thanks,
Preethi