Hello,
I noticed that when I left-normalize my VCF file (as suggested by ANNOVAR at http://annovar.openbioinformatics.or.../articles/VCF/), the number of variants is reduced:
My starting VCF file (ex1.vcf.gz) has 425 variants.
I first split multi-allelic variant calls into separate lines with the command "bcftools norm -m-both -o ex1.step1.vcf ex1.vcf.gz".
The file produced (ex1.step1.vcf) has 425 variants.
Then I perform the actual left-normalization with the command "bcftools norm -f hg19.fa -o ex1.step2.vcf ex1.step1.vcf".
The file produced (ex1.step2.vcf) has 181 variants.
In fact there are some variants in ex1.step1.vcf that can't be found in ex1.step2.vcf.
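In case the way the counts are obtained matters, here is a minimal sketch of counting records as the non-header lines of each file. `count_variants` is a hypothetical helper name, not a bcftools command, and it assumes one record per line with no blank lines in the file.

```shell
# Hypothetical helper (not part of bcftools): count the records in a VCF
# by counting the lines that do not start with '#'.
# gzip -cdf decompresses gzip/bgzip input and passes plain text through unchanged.
count_variants() {
    gzip -cdf "$1" | grep -vc '^#'
}

# Report the count for each file from the steps above, skipping any that are absent.
for f in ex1.vcf.gz ex1.step1.vcf ex1.step2.vcf; do
    if [ -e "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_variants "$f")"
    fi
done
```

Counting lines this way treats every non-header line as one variant, so a multi-allelic site that has not been split counts once.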
I don't understand why the number of variants drops when I perform left-normalization. Are variants also being filtered when I run these commands, and if so, why?
(I'm using bcftools 1.2)
Thanks in advance