Hello,
This is both an observation and a question: does anybody (VarScan developers?) know whether this is really a bug or something else?
I am using the latest VarScan (2.2.11) with the mpileup2snp command. An mpileup file from samtools, with Phred+33 quality values, produces the following VCF file:
##fileformat=VCFv4.0
##source=VarScan2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1
1 565286 C T . PASS DP=13 GT:GQ:DP 1/1:7:13
1 569492 T C . PASS DP=45 GT:GQ:DP 1/1:26:45
There are two problems:
i) First of all, the ID column is missing: although the header line lists "ID", the data lines do not contain that field at all. This breaks the mandatory 8 fixed fields of a VCF file, so the output does not work with Annovar (convert2annovar.pl). I managed to get convert2annovar.pl to accept the VCF file from VarScan only by running it through the following one-liner (basically, it adds "null" as the ID column of each row):
perl -nle 'if ($_ !~ /^#/){@tmp=split(/\t/,$_);$,="\t";print @tmp[0..1],"null",@tmp[2..(scalar(@tmp)-1)]} else {print}' result.vcf > result_fixed.vcf
Does anybody know of another solution, or the reason for this output?
ii) The QUAL column contains a dot for every variant. Shouldn't there be a number? (This doesn't seem to bother Annovar, but is it potentially a problematic bug?)
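For problem i), the same workaround can be sketched a bit more defensively in Python. This is only a sketch under some assumptions of mine: the file is tab-separated, a data line is treated as lacking the ID field when it has exactly one column fewer than the #CHROM header line, and "." (the VCF missing-value marker) is used as the placeholder instead of "null". The function name add_missing_id is hypothetical and not part of VarScan or Annovar.

```python
def add_missing_id(lines, placeholder="."):
    """Yield VCF lines, inserting `placeholder` as the third column (ID)
    on data lines that have one column fewer than the #CHROM header."""
    n_header_cols = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines pass through unchanged.
            yield line
        elif line.startswith("#"):
            # The #CHROM header defines the expected number of columns.
            n_header_cols = len(line.split("\t"))
            yield line
        else:
            fields = line.split("\t")
            # Assumption: only patch lines that are exactly one column short,
            # so already well-formed lines are left alone.
            if n_header_cols is not None and len(fields) == n_header_cols - 1:
                fields.insert(2, placeholder)
            yield "\t".join(fields)
```

It could be applied to a file by opening result.vcf, passing the file object to add_missing_id, and writing each yielded line (plus a newline) to result_fixed.vcf. Unlike the one-liner, lines that already contain an ID column are not shifted.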