I have reference-mapped paired-end Illumina reads with BWA and called variants with SAMtools. The resulting VCF was filtered to remove high-coverage SNPs with
Code:
vcfutils.pl varFilter -D30
and then filtered for low-quality SNPs using awk
Code:
'($3=="*"&&$6>=50)||($3!="*"&&$6>=20)'
I graphed the distribution of SNP quality and observed a huge peak at 222. I repeated this with other samples and observed the same peak. Any clues as to why I may be seeing this?
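For anyone wanting to reproduce the check, here is a minimal Python sketch of how the SNP quality distribution could be tallied from a VCF. It is an illustration only, not the code used for the graph above: the function name `qual_histogram` and the example records are made up, and it assumes a tab-delimited VCF with QUAL in the sixth column.

```python
from collections import Counter


def qual_histogram(vcf_lines):
    """Count how often each QUAL value (rounded to an integer) occurs."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Skip malformed records and records with a missing QUAL (".").
        if len(fields) < 6 or fields[5] == ".":
            continue
        counts[round(float(fields[5]))] += 1
    return counts


# Hypothetical records, made up purely to exercise the function.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t222\t.\tDP=20",
    "chr1\t250\t.\tC\tT\t222\t.\tDP=18",
    "chr1\t400\t.\tG\tA\t35.2\t.\tDP=7",
]

hist = qual_histogram(example)
# Print QUAL values from most to least frequent.
for qual, n in hist.most_common():
    print(qual, n)
```

On a real file, the same function can be fed an open file handle, e.g. `qual_histogram(open("sample.vcf"))`, where `sample.vcf` is a placeholder name. A value that dominates the counts across several samples would show up at the top of the output.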