Hopefully someone can point out what I'm doing wrong.
I called variants for 4 chickens separately in GATK using HaplotypeCaller, then combined the per-sample VCF files with the CombineVariants tool. The combined VCF file was then converted to binary PED format using the VariantsToBinaryPed tool. No errors were generated at any point.
However, when I use the binary file set in PLINK there appears to be a problem. For instance, when calculating allele frequencies for 77k SNPs, the MAF is NA for every SNP because NCHROBS is 0 for all of them, e.g.
Code:
CHR          SNP   A1   A2   MAF  NCHROBS
 24  Var-24-1055    A    T    NA        0
 24  Var-24-1137    T    A    NA        0
 24  Var-24-1263    G    A    NA        0
Looking at the VCF file that was fed into the VariantsToBinaryPed tool shows that these SNPs are present and genotyped, e.g.
Code:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HA1A22A HB1A16A JA1A17A JB1A25B
24 1055 . T A 484.77 PASS AC=8;AF=1.00;AN=8;DP=129;FS=0.000;MLEAC=2;MLEAF=1.00;MQ0=0;set=Intersection GT:AD:DP:GQ:PL 1/1:0,26:26:54:513,54,0 1/1:0,38:38:99:1138,117,0 1/1:0,35:35:99:977,108,0 1/1:0,30:30:93:890,93,0
24 1137 . A T 181.77 PASS AC=2;AF=0.500;AN=4;DP=65;MLEAC=1;MLEAF=0.500;MQ0=0;set=HA1A22A-JB1A25B GT:AD:DP:GQ:PL 0/1:16,10:26:99:210,0,208 ./. ./. 0/1:19,20:39:99:372,0,261
24 1263 . A G 170.77 PASS AC=2;AF=0.500;AN=4;DP=38;MLEAC=1;MLEAF=0.500;MQ0=0;set=HA1A22A-JB1A25B GT:AD:DP:GQ:PL 0/1:14,10:24:99:199,0,172 ./. ./. 0/1:6,8:14:89:122,0,89
Can anyone advise?
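As a sanity check on the VCF side, here is a minimal sketch of how the number of called alleles per site could be counted from the GT field, which is roughly what I would expect to see as NCHROBS if the genotypes had survived the conversion. This is only an illustration and not part of GATK or PLINK: the function name `count_called_alleles` is made up, and it assumes whitespace-separated records like the three shown above, with GT as the first FORMAT key.

```python
def count_called_alleles(vcf_line):
    """Return (variant_id, n_called_alleles) for one VCF data line.

    Hypothetical helper: the ID is built as Var-<chrom>-<pos> to match
    the SNP names in the PLINK output above.
    """
    fields = vcf_line.split()
    chrom, pos = fields[0], fields[1]
    # FORMAT is column 9; find where GT sits (assumed to be first).
    gt_index = fields[8].split(":").index("GT")
    n_called = 0
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        # Treat phased (|) and unphased (/) genotypes the same way.
        alleles = gt.replace("|", "/").split("/")
        # A "." allele is a missing call and is not counted.
        n_called += sum(1 for allele in alleles if allele != ".")
    return f"Var-{chrom}-{pos}", n_called


# The three records from the VCF excerpt above.
vcf_lines = [
    "24 1055 . T A 484.77 PASS "
    "AC=8;AF=1.00;AN=8;DP=129;FS=0.000;MLEAC=2;MLEAF=1.00;MQ0=0;set=Intersection "
    "GT:AD:DP:GQ:PL 1/1:0,26:26:54:513,54,0 1/1:0,38:38:99:1138,117,0 "
    "1/1:0,35:35:99:977,108,0 1/1:0,30:30:93:890,93,0",
    "24 1137 . A T 181.77 PASS "
    "AC=2;AF=0.500;AN=4;DP=65;MLEAC=1;MLEAF=0.500;MQ0=0;set=HA1A22A-JB1A25B "
    "GT:AD:DP:GQ:PL 0/1:16,10:26:99:210,0,208 ./. ./. 0/1:19,20:39:99:372,0,261",
    "24 1263 . A G 170.77 PASS "
    "AC=2;AF=0.500;AN=4;DP=38;MLEAC=1;MLEAF=0.500;MQ0=0;set=HA1A22A-JB1A25B "
    "GT:AD:DP:GQ:PL 0/1:14,10:24:99:199,0,172 ./. ./. 0/1:6,8:14:89:122,0,89",
]

for line in vcf_lines:
    print(*count_called_alleles(line))
# Var-24-1055 8
# Var-24-1137 4
# Var-24-1263 4
```

These counts agree with the AN values in the INFO column (8, 4 and 4), so the genotypes do look fine in the VCF itself.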
Cheers
D