Hi,
I'm using the latest version of VarScan.
Does the somaticFilter command support output in .vcf format?
I ran it on the SNP and indel .vcf files produced by the somatic command, using this command line:
java -Xmx6g -jar VarScan.v2.3.6.jar somaticFilter 34-C-S.varScan.output.snp.vcf --min-strands2 2 --min-avg-qual 25 --min-var-freq 0.3 --p-value 0.05 --min-strands2 2 --min-reads2 3 --indel-file 34-C-S.varScan.output.indel.vcf --output-vcf 1 34-C-S.vcf
The software starts:
Reading input from /Users/mac2/Documents/trasferimento/Napoli_2013/bam_bgi_pileup/34-C-S.varScan.output.snp.vcf
1927 cluster SNPs identified
Reading input from /Users/mac2/Documents/trasferimento/Napoli_2013/bam_bgi_pileup/34-C-S.varScan.output.snp.vcf
97671 variants in input stream
395 failed to meet coverage requirement
969 failed to meet reads2 requirement
5504 failed to meet varfreq requirement
3510 failed to meet p-value requirement
1340 in SNP clusters were removed
329 were removed near indels
85624 passed filters
but no output file has been created
Many thanks,
Francesco
-
Uma,
Thank you for providing the files, which helped me track down the issue. As I'd suspected, the VCF file for SNVs contained slightly more entries than the native SNV output file.
This is because we output "indelError" calls (positions where normal shows a SNV but tumor shows an indel or vice-versa) to the VCF for the sake of completeness. Their filter status is "indelError" to indicate that these are likely artifactual calls. We don't output them to the native output format for that reason.
If I remove "indelError" positions from your SNP VCF and then apply somaticFilter, the results are identical to running it on the native output files. That being said, you might wish to use the filtering results from the unmodified VCF, because SNVs clustering around "indelError" calls should probably be removed. After that, any "indelError" calls that passed somaticFilter can be removed using grep.
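A minimal sketch of that last clean-up step, assuming the VCF is tab-delimited and carries the filter status in its seventh (FILTER) column, as in standard VCF. It uses awk instead of a plain `grep -v indelError` so that a header line that happens to mention "indelError" is kept; the function name and file names are hypothetical.

```shell
# Hypothetical helper: print a VCF without records whose FILTER column
# (7th tab-separated field) is exactly "indelError".
# Header lines (starting with "#") are always passed through.
remove_indel_error() {
    awk -F'\t' '/^#/ || $7 != "indelError"' "$1"
}
```

For example, `remove_indel_error filtered.vcf > filtered.noIndelError.vcf`, run on the somaticFilter output rather than on its input, so that SNVs clustering around "indelError" calls have already been filtered.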
Thank you for your help on this!
Yours,
Dan Koboldt
-
Hi Dan,
There is a difference in the results, and I have sent you sample data.
Many thanks for your help
Uma
-
Just to be sure, I can intersect the native and VCF results by chromosome and position (after running only the somatic command).
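A comparison by chromosome and position could be sketched roughly as follows. This assumes both files are tab-delimited with chromosome and position in the first two columns, that VCF header lines start with "#", and that the native file has a header row whose first column is "chrom"; the function names are hypothetical.

```shell
# Print sorted, de-duplicated "chrom:position" keys from a tab-delimited
# VarScan file (native or VCF). Skips VCF header lines (starting with "#")
# and, by assumption, a native header row whose first column is "chrom".
position_keys() {
    awk -F'\t' '!/^#/ && NF >= 2 && $1 != "chrom" { print $1 ":" $2 }' "$1" | LC_ALL=C sort -u
}

# List positions present in the first file but absent from the second.
only_in_first() {
    keys_a=$(mktemp)
    keys_b=$(mktemp)
    position_keys "$1" > "$keys_a"
    position_keys "$2" > "$keys_b"
    LC_ALL=C comm -23 "$keys_a" "$keys_b"
    rm -f "$keys_a" "$keys_b"
}
```

Running `only_in_first` in both directions (native against VCF, then VCF against native) lists the positions unique to each file. It compares positions only, so it would not catch records that share a position but differ in alleles or filter status.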
Uma
-
The difference in results (native output vs. VCF output) occurs after the somaticFilter step. In general, I use something like the following arguments:
java -jar ~/tools/VarScan.v2.3.1.jar somaticFilter ./varscan-out/65_varscan13output.snps --min-var-freq 0.5 --indel-file ./varscan-out/65_varscan13output.indel --output ./varscan-out/65_varscan13output.snps.filtered
java -jar ~/tools/VarScan.v2.3.1.jar somaticFilter ./varscan-out/65_varscan13output.snps --min-var-freq 0.5 --indel-file ./varscan-out/65_varscan13output.indel.vcf --output-vcf > ./varscan-out/65_varscan13output.snps.filtered.vcf
For the somatic results (java -jar ~/tools/VarScan.v2.3.1.jar somatic), the native and VCF outputs have about the same number of rows, so I reckon the difference does not arise there.
Also, if it were a parsing error and the above somaticFilter command line is correct, then I would expect the native output to at least be a subset of the VCF output, but instead I get entries that are unique to it.
I can send you 1000 lines of the somaticFilter output.
Thanks
Uma
-
That's a curious result, and it could reflect an error in the new VCF parsing code. Would you be able to send me your 4 files (SNP and indel, original and VCF) or at least the first 1,000 lines or so? Send it to dkoboldt (at) genome [dot] wustl (dot) edu.
Thanks,
Dan Koboldt
-
VarScan somaticFilter results
Hi
Using VarScan (v2.3.1) with default options, I get different results for the same dataset depending on whether I filter the snp.vcf file with the indel.vcf file or filter the native snp file with the native indel file. In theory, this is just a file-format variation and the statistics should remain the same. However, I see a difference, mostly in the p-value filter.
time java -jar ~/tools/VarScan.v2.3.1.jar somaticFilter 65_varscan-out.snp -min-var-freq 0.5 --indel-file 65_varscan-out.indel --output-file 65-filtered
Window size: 10
Window SNPs: 3
Indel margin: 3
Reading input from 65_varscan-out.snp
2962 cluster SNPs identified
Reading input from 65_varscan-out.snp
88168 variants in input stream
13612 failed to meet coverage requirement
5579 failed to meet reads2 requirement
24230 failed to meet varfreq requirement
40748 failed to meet p-value requirement
45 in SNP clusters were removed
1 were removed near indels
3953 passed filters
real 0m3.532s
user 0m2.268s
sys 0m0.188s
time java -jar ~/tools/VarScan.v2.3.1.jar somaticFilter 65_varscan-out.snp.vcf -min-var-freq 0.5 --indel-file 65_varscan-out.indel.vcf --output-file 65-filtered.vcf
Window size: 10
Window SNPs: 3
Indel margin: 3
Reading input from 65_varscan-out.snp.vcf
2972 cluster SNPs identified
Reading input from 65_varscan-out.snp.vcf
88177 variants in input stream
13615 failed to meet coverage requirement
5583 failed to meet reads2 requirement
24231 failed to meet varfreq requirement
2862 failed to meet p-value requirement
367 in SNP clusters were removed
39 were removed near indels
41480 passed filters
real 0m1.506s
user 0m2.504s
sys 0m0.272s
Is there any particular reason for these differences? Please note that in a quick comparison between the *.vcf files and their corresponding snp and indel files, there were no differences when comparing by chromosome and position.
Many Thanks