Hi,
I'm having problems understanding a GATK output VCF. I have read the VCF standard, but I'm obviously missing something.
I /think/ I understand how SNPs and short indels are represented, but clearly I do not. Below is an excerpt that illustrates sites which I do not understand. I suspect it may be something to do with GATK quality filters that I'm not understanding...
The excerpt below was generated using
GATK -l INFO -I my.bam -R my.fa -T UnifiedGenotyper -S LENIENT -nt 8 --heterozygosity 0.1 -o test.vcf --genotype_likelihoods_model BOTH --min_base_quality_score 10 --output_mode EMIT_ALL_SITES -ploidy 2
Thanks!
Darren
-------------------------------------------------------
Line 1 is a SNP
Lines 14 and 15 are an indel that I do understand
Lines 19 and 20 I do /not/ understand
Lines 21 and 22 I do /not/ understand
---------------------------------------------------
I'm having problems understanding a GATK output VCF. I have read the VCF standard, but I'm obviously missing something.
I /think/ I understand how SNPs and short indels are represented, but clearly I do not. Below is an excerpt that illustrates sites which I do not understand. I suspect it may be something to do with GATK quality filters that I'm not understanding...
The excerpt below was generated using
GATK -l INFO -I my.bam -R my.fa -T UnifiedGenotyper -S LENIENT -nt 8 --heterozygosity 0.1 -o test.vcf --genotype_likelihoods_model BOTH --min_base_quality_score 10 --output_mode EMIT_ALL_SITES -ploidy 2
Thanks!
Darren
-------------------------------------------------------
Code:
CH1 225 . T G 12.71 LowQual AC=1;AF=0.500;AN=2;BaseQRankSum=1.978;DP=59;Dels=0.03;FS=0.000;HaplotypeScore=10.2840;MLEAC=1;MLEAF=0.500;MQ=70.25;MQ0=8;MQRankSum=-5.349;QD=0.22;ReadPosRankSum=-3.188 GT:AD:DP:GQ:PL 0/1:41,16:55:20:20,0,1435 CH1 226 . T . 121.53 . AN=2;DP=59;MQ=70.25;MQ0=8 GT:DP 0/0:43 CH1 227 . A . 121.53 . AN=2;DP=59;MQ=70.25;MQ0=8 GT:DP 0/0:43 CH1 228 . T . 121.53 . AN=2;DP=59;MQ=70.25;MQ0=8 GT:DP 0/0:43 CH1 229 . A . 115.53 . AN=2;DP=57;MQ=69.66;MQ0=8 GT:DP 0/0:38 CH1 230 . C . 115.53 . AN=2;DP=57;MQ=69.66;MQ0=8 GT:DP 0/0:38 CH1 231 . T . 115.53 . AN=2;DP=57;MQ=69.66;MQ0=8 GT:DP 0/0:36 CH1 232 . G . 115.53 . AN=2;DP=57;MQ=69.66;MQ0=8 GT:DP 0/0:36 CH1 233 . C . 115.53 . AN=2;DP=57;MQ=69.66;MQ0=8 GT:DP 0/0:37 CH1 234 . A . 139.53 . AN=2;DP=70;MQ=59.20;MQ0=14 GT:DP 0/0:63 CH1 235 . A . 175.53 . AN=2;DP=84;MQ=51.67;MQ0=15 GT:DP 0/0:79 CH1 236 . A . 175.53 . AN=2;DP=84;MQ=51.67;MQ0=15 GT:DP 0/0:79 CH1 237 . T . 175.53 . AN=2;DP=85;MQ=51.37;MQ0=16 GT:DP 0/0:80 CH1 238 . A . 175.53 . AN=2;DP=102;MQ=46.90;MQ0=28 GT:DP 0/0:97 CH1 238 . A AGAAAGAAAGCTTGTA 83.73 . AC=1;AF=0.500;AN=2;BaseQRankSum=6.172;DP=102;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=46.90;MQ0=0;MQRankSum=-6.190;QD=0.05;ReadPosRankSum=-5.733 GT:AD:DP:GQ:PL 0/1:27,25:57:99:121,0,4853 CH1 239 . A . 175.53 . AN=2;DP=102;MQ=46.90;MQ0=28 GT:DP 0/0:101 CH1 240 . T . 175.53 . AN=2;DP=102;MQ=46.90;MQ0=28 GT:DP 0/0:98 CH1 241 . A . 169.53 . AN=2;DP=108;MQ=44.14;MQ0=29 GT:DP 0/0:107 CH1 242 . T . 169.53 . AN=2;DP=109;MQ=43.94;MQ0=29 GT:DP 0/0:103 CH1 242 . T . 118.27 . AN=2;DP=109;MQ=43.94;MQ0=29 GT:AD:DP 0/0:27:55 CH1 243 . C . 172.53 . AN=2;DP=110;MQ=43.76;MQ0=29 GT:DP 0/0:108 CH1 243 . CTTTT . 118.27 . AN=2;DP=110;MQ=43.76;MQ0=29 GT:AD:DP 0/0:27:56 CH1 244 . T . 91.53 . AN=2;DP=110;MQ=43.76;MQ0=29 GT:DP 0/0:61 CH1 245 . T . 91.53 . AN=2;DP=110;MQ=43.76;MQ0=29 GT:DP 0/0:53 CH1 246 . T . 73.53 . AN=2;DP=110;MQ=43.76;MQ0=29 GT:DP 0/0:41 CH1 247 . T . 91.53 . AN=2;DP=110;MQ=43.76;MQ0=29 GT:DP 0/0:46 CH1 248 . A . 172.53 . AN=2;DP=116;MQ=42.61;MQ0=31 GT:DP 0/0:100 CH1 249 . A . 172.53 . AN=2;DP=116;MQ=42.61;MQ0=31 GT:DP 0/0:100 CH1 250 . T . 172.53 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:DP 0/0:101 CH1 251 . T . 169.53 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:DP 0/0:96 CH1 251 . T . 118.27 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:AD:DP 0/0:27:56 CH1 252 . C . 172.53 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:DP 0/0:113 CH1 253 . C . 172.53 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:DP 0/0:110 CH1 254 . T . 172.53 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:DP 0/0:111 CH1 255 . T . 172.53 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:DP 0/0:111 CH1 256 . T . 172.53 . AN=2;DP=117;MQ=42.43;MQ0=32 GT:DP 0/0:111
Lines 14 and 15 are an indel that I do understand
Lines 19 and 20 I do /not/ understand
Lines 21 and 22 I do /not/ understand
---------------------------------------------------
Comment