  • VariantRecalibrator input file error

    Hi everyone,
    I'm having some trouble with the variant quality score recalibrator.
    I get the following error message:

    MESSAGE: Bad input: Values for HaplotypeScore annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations.

    I've used the following command line:
    java -jar GenomeAnalysisTK.jar
    -T VariantRecalibrator
    -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf
    -resource:omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf
    -resource:dbsnp,known=true,training=false,truth=false,prior=8.0 dbsnp_132.b37.vcf
    -an HaplotypeScore
    -nt 3
    -input realigned.vcf
    -R chr01.fa
    -recalFile output.recal
    -tranchesFile output.tranches
    -rscriptFile output.plots.R

    The realigned.vcf file looks like

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Campione
    1 565006 . T C 10.43 LowQual AC=2;AF=1.00;AN=2;DP=1;Dels=0.00;HRun=2;HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=10.43 GT:AD:DP:GQ:PL 1/1:0,1:1:3.01:40,3,0
    1 566933 . A G 11.01 LowQual AC=2;AF=1.00;AN=2;DP=1;Dels=0.00;HRun=2;HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=11.01 GT:AD:DP:GQ:PL 1/1:0,1:1:3.01:41,3,0
    1 566960 . T C 11.01 LowQual AC=2;AF=1.00;AN=2;DP=1;Dels=0.00;HRun=0;HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=11.01 GT:AD:DP:GQ:PL 1/1:0,1:1:3.01:41,3,0
    1 567002 . T C 68.76 PASS AC=2;AF=1.00;AN=2;DP=3;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=22.92 GT:AD:DP:GQ:PL 1/1:0,3:3:9.01:101,9,0
    1 567033 . T C 38.66 PASS AC=2;AF=1.00;AN=2;DP=2;Dels=0.00;FS=0.000;HRun=1;HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=19.33 GT:AD:DP:GQ:PL 1/1:0,2:2:6.02:70,6,0

    It is already annotated.
    So where is the problem?
    Do I have to annotate the "resource" files? I downloaded these callsets from the GATK resource bundle. Aren't they already annotated?
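
One way to confirm that the records in a callset really carry an annotation is to count how many data lines list the key in their INFO column. The sketch below is a minimal Python check and is not part of GATK; the function name `count_annotation` is made up for illustration. It assumes the standard VCF layout with INFO as the eighth column, and splits on tabs when present or on any whitespace otherwise (as in the pasted excerpt above).

```python
def count_annotation(vcf_lines, key):
    """Return (records_with_key, total_records) for one INFO key."""
    with_key = 0
    total = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines (## meta lines and #CHROM).
        if not line or line.startswith("#"):
            continue
        # Real VCFs are tab-separated; the forum paste uses spaces.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 8:
            continue  # not a complete record
        total += 1
        # INFO is column 8: semicolon-separated KEY=VALUE pairs or bare flags.
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        if key in info_keys:
            with_key += 1
    return with_key, total


# Two records copied from the excerpt in this post.
sample = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Campione",
    "1 565006 . T C 10.43 LowQual AC=2;AF=1.00;AN=2;DP=1;Dels=0.00;HRun=2;"
    "HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=10.43 GT:AD:DP:GQ:PL 1/1:0,1:1:3.01:40,3,0",
    "1 567002 . T C 68.76 PASS AC=2;AF=1.00;AN=2;DP=3;Dels=0.00;FS=0.000;HRun=0;"
    "HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=22.92 GT:AD:DP:GQ:PL 1/1:0,3:3:9.01:101,9,0",
]

print(count_annotation(sample, "HaplotypeScore"))  # (2, 2)
print(count_annotation(sample, "FS"))              # (1, 2)
```

For a file on disk, the same function can be fed an open file handle, for example `count_annotation(open("realigned.vcf"), "HaplotypeScore")`.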

  • #2
    Hi there,
    Yeah, I have the same error, and so do a lot of other people.
    It seems that the VCF files we pass to the command should have these keys in their INFO fields: "QD", "HaplotypeScore", "MQRankSum", "ReadPosRankSum", "FS" and "MQ". In your case, of course, only HaplotypeScore. The nice thing is that the VCFs provided by GATK, which are supposedly the ones we should pass, do not contain that information at all.

    For example, in your case you pass:
    -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf

    so this file should have "HaplotypeScore" in the INFO field, which I'm sure it doesn't. It isn't in my VCF.

    In my case I pass these three VCFs. They should contain the corresponding keys, but they do not:
    dbsnp_135.hg19.vcf
    hapmap_3.3.hg19.sites.vcf
    1000G_omni2.5.hg19.sites.vcf

    The only VCF file that contains all of the annotations is:
    NA12878.HiSeq.WGS.bwa.cleaned.raw.subset.hg19.vcf

    But then how do I specify that I want to use that one? For the other three there are the tags hapmap, omni and dbsnp. But what about NA12878? No idea.

    So I'm stuck on that stupid part.

    Any help is appreciated.
    Thank you
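
To see which of the six annotation keys named in this reply actually occur in a given VCF, a small script can collect every INFO key used in the data lines and report the ones that never appear. This is only a sketch in Python, not a GATK feature; `missing_annotations` and `REQUIRED` are hypothetical names, and the parsing assumes INFO is the eighth whitespace-separated column.

```python
# The six keys listed in this reply.
REQUIRED = ["QD", "HaplotypeScore", "MQRankSum", "ReadPosRankSum", "FS", "MQ"]


def missing_annotations(vcf_lines, required=REQUIRED):
    """Return the required INFO keys that no record in vcf_lines carries."""
    seen = set()
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 8:
            continue  # not a complete record
        for entry in fields[7].split(";"):
            # Keys are compared exactly, so "MQ0" does not count as "MQ".
            seen.add(entry.split("=", 1)[0])
    return [key for key in required if key not in seen]


# One record copied from the first post's excerpt.
records = [
    "1 567002 . T C 68.76 PASS AC=2;AF=1.00;AN=2;DP=3;Dels=0.00;FS=0.000;HRun=0;"
    "HaplotypeScore=0.0000;MQ=60.00;MQ0=0;QD=22.92 GT:AD:DP:GQ:PL 1/1:0,3:3:9.01:101,9,0",
]

print(missing_annotations(records))  # ['MQRankSum', 'ReadPosRankSum']
```

Running it over each file in turn (the input callset and each resource VCF) shows directly which files carry which annotations.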
