  • Varscan v2.3.5 headers missing

    I am running VarScan to call variants on a single file set. I have run it twice now, and both times the header lines are missing from the output VCF file. All 11 other samples I analyzed with VarScan this week have the header. What might cause the header to be left out?

    I used bwa to align my illumina fastq files.
    I am using samtools v0.1.19 for the mpileup.
    My command line:
    samtools mpileup -q 1 -C50 -DSf /ref/human_g1k_v37.fasta /myData/S12_sorted.bam | java -jar VarScan.v2.3.5.jar mpileup2snp --min-coverage 8 --min-reads2 2 --min-var-freq 0.01 --min-avg-qual 15 --p-value 0.01 --strand-filter 0 --output-vcf 1 --variants 0 > /myData/S12_snp.vcf

    Here is the header on my .bam file:
    -bash-3.2$ samtools view -H /myData/S12_sorted.bam
    @SQ SN:1 LN:249250621
    @SQ SN:2 LN:243199373
    @SQ SN:3 LN:198022430
    @SQ SN:4 LN:191154276
    @SQ SN:5 LN:180915260
    @SQ SN:6 LN:171115067
    @SQ SN:7 LN:159138663
    @SQ SN:8 LN:146364022
    @SQ SN:9 LN:141213431
    @SQ SN:10 LN:135534747
    @SQ SN:11 LN:135006516
    @SQ SN:12 LN:133851895
    @SQ SN:13 LN:115169878
    @SQ SN:14 LN:107349540
    @SQ SN:15 LN:102531392
    @SQ SN:16 LN:90354753
    @SQ SN:17 LN:81195210
    @SQ SN:18 LN:78077248
    @SQ SN:19 LN:59128983
    @SQ SN:20 LN:63025520
    @SQ SN:21 LN:48129895
    @SQ SN:22 LN:51304566
    @SQ SN:X LN:155270560
    @SQ SN:Y LN:59373566
    @SQ SN:MT LN:16569
    @SQ SN:GL000207.1 LN:4262
    @SQ SN:GL000226.1 LN:15008
    @SQ SN:GL000229.1 LN:19913
    @SQ SN:GL000231.1 LN:27386
    @SQ SN:GL000210.1 LN:27682
    @SQ SN:GL000239.1 LN:33824
    @SQ SN:GL000235.1 LN:34474
    @SQ SN:GL000201.1 LN:36148
    @SQ SN:GL000247.1 LN:36422
    @SQ SN:GL000245.1 LN:36651
    @SQ SN:GL000197.1 LN:37175
    @SQ SN:GL000203.1 LN:37498
    @SQ SN:GL000246.1 LN:38154
    @SQ SN:GL000249.1 LN:38502
    @SQ SN:GL000196.1 LN:38914
    @SQ SN:GL000248.1 LN:39786
    @SQ SN:GL000244.1 LN:39929
    @SQ SN:GL000238.1 LN:39939
    @SQ SN:GL000202.1 LN:40103
    @SQ SN:GL000234.1 LN:40531
    @SQ SN:GL000232.1 LN:40652
    @SQ SN:GL000206.1 LN:41001
    @SQ SN:GL000240.1 LN:41933
    @SQ SN:GL000236.1 LN:41934
    @SQ SN:GL000241.1 LN:42152
    @SQ SN:GL000243.1 LN:43341
    @SQ SN:GL000242.1 LN:43523
    @SQ SN:GL000230.1 LN:43691
    @SQ SN:GL000237.1 LN:45867
    @SQ SN:GL000233.1 LN:45941
    @SQ SN:GL000204.1 LN:81310
    @SQ SN:GL000198.1 LN:90085
    @SQ SN:GL000208.1 LN:92689
    @SQ SN:GL000191.1 LN:106433
    @SQ SN:GL000227.1 LN:128374
    @SQ SN:GL000228.1 LN:129120
    @SQ SN:GL000214.1 LN:137718
    @SQ SN:GL000221.1 LN:155397
    @SQ SN:GL000209.1 LN:159169
    @SQ SN:GL000218.1 LN:161147
    @SQ SN:GL000220.1 LN:161802
    @SQ SN:GL000213.1 LN:164239
    @SQ SN:GL000211.1 LN:166566
    @SQ SN:GL000199.1 LN:169874
    @SQ SN:GL000217.1 LN:172149
    @SQ SN:GL000216.1 LN:172294
    @SQ SN:GL000215.1 LN:172545
    @SQ SN:GL000205.1 LN:174588
    @SQ SN:GL000219.1 LN:179198
    @SQ SN:GL000224.1 LN:179693
    @SQ SN:GL000223.1 LN:180455
    @SQ SN:GL000195.1 LN:182896
    @SQ SN:GL000212.1 LN:186858
    @SQ SN:GL000222.1 LN:186861
    @SQ SN:GL000200.1 LN:187035
    @SQ SN:GL000193.1 LN:189789
    @SQ SN:GL000194.1 LN:191469
    @SQ SN:GL000225.1 LN:211173
    @SQ SN:GL000192.1 LN:547496
    @RG ID:work SM:S12 PL:Illumina PU:S12
    @PG ID:bwa PN:bwa VN:0.5.9-r16

    Thanks for any advice.

  • #2
    Hello, and thanks for posting. Do you mean that the VCF header lines are missing? That's strange behavior and I'm happy to help you investigate.

    Would you mind letting me know what your output file (/myData/S12_snp.vcf) looked like?



    • #3
      Thanks for your help. Here is what the first 10 lines look like:
      1 10061 . T G . PASS ADP=268;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:270:268:250:11:4.12%:4.385E-4:33:29:210:40:4:7
      1 10067 . T G . PASS ADP=270;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:24:274:270:260:8:2.96%:3.7047E-3:33:31:207:53:7:1
      1 10079 . T G . PASS ADP=266;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:268:266:254:11:4.14%:4.3923E-4:32:34:196:58:11:0
      1 10083 . C G . PASS ADP=284;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:71:284:284:261:23:8.1%:7.496E-8:34:32:188:73:23:0
      1 10097 . T G . PASS ADP=236;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:55:240:236:213:18:7.66%:2.7036E-6:30:29:149:64:17:1
      1 10108 . C T . PASS ADP=188;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:49:189:188:170:16:8.51%:1.0897E-5:32:33:111:59:4:12
      1 10109 . A T . PASS ADP=183;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:145:184:183:139:44:24.04%:2.9982E-15:31:30:81:58:29:15
      1 10114 . T G . PASS ADP=172;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:37:173:172:155:12:7.14%:1.9895E-4:31:25:88:67:5:7
      1 10147 . C A . PASS ADP=49;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:39:49:49:37:12:24.49%:1.1358E-4:33:27:12:25:11:1
      1 10177 . A C . PASS ADP=39;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDPP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:39:39:20:10:25.64%:3.9851E-4:27:32:4:16:4:6



      • #4
        My apologies, I forgot to disable the smilies in the text:
        1 10061 . T G . PASS ADP=268;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:270:268:250:11:4.12%:4.385E-4:33:29:210:40:4:7
        1 10067 . T G . PASS ADP=270;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:24:274:270:260:8:2.96%:3.7047E-3:33:31:207:53:7:1
        1 10079 . T G . PASS ADP=266;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:268:266:254:11:4.14%:4.3923E-4:32:34:196:58:11:0
        1 10083 . C G . PASS ADP=284;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:71:284:284:261:23:8.1%:7.496E-8:34:32:188:73:23:0
        1 10097 . T G . PASS ADP=236;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:55:240:236:213:18:7.66%:2.7036E-6:30:29:149:64:17:1
        1 10108 . C T . PASS ADP=188;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:49:189:188:170:16:8.51%:1.0897E-5:32:33:111:59:4:12
        1 10109 . A T . PASS ADP=183;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:145:184:183:139:44:24.04%:2.9982E-15:31:30:81:58:29:15
        1 10114 . T G . PASS ADP=172;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:37:173:172:155:12:7.14%:1.9895E-4:31:25:88:67:5:7
        1 10147 . C A . PASS ADP=49;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:39:49:49:37:12:24.49%:1.1358E-4:33:27:12:25:11:1
        1 10177 . A C . PASS ADP=39;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:39:39:20:10:25.64%:3.9851E-4:27:32:4:16:4:6
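
The sample column in records like these pairs one value with each colon-separated FORMAT key, which gives a quick sanity check on a pasted line. A minimal sketch, assuming the 14-key FORMAT string that VarScan writes in VCF mode (GT:GQ:SDP:DP:...); `parse_sample` is a hypothetical helper name, not a VarScan function:

```python
def parse_sample(format_field: str, sample_field: str) -> dict:
    """Pair FORMAT keys with the values of one sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    if len(keys) != len(values):
        # A lost separator in either column shows up as a count mismatch.
        raise ValueError(
            f"{len(keys)} FORMAT keys but {len(values)} sample values"
        )
    return dict(zip(keys, values))


# Assumed FORMAT string; the sample values are copied from the first record.
FORMAT = "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR"
SAMPLE = "0/1:33:270:268:250:11:4.12%:4.385E-4:33:29:210:40:4:7"

record = parse_sample(FORMAT, SAMPLE)
print(record["DP"], record["RD"], record["AD"], record["FREQ"])
```

A key list that has lost a separator, such as one reading SDPP instead of SDP:DP, has 13 keys against 14 values and raises `ValueError`.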



        • #5
          Should I provide a different type of data to help troubleshoot? Thanks.



          • #6
            Hi, I am having a similar problem with VarScan 2.3.6. Is there any workaround for this? I can provide sample input/output if you need it.



            • #7
              Has anyone fixed this problem? I am seeing the same issue: the VCF header is missing from some output files but present in others. I don't know why this happens.



              • #8
                I have not resolved this issue yet and am working on it today. I will let you know if I find anything.



                • #9
                  Has anyone else had this issue? I am testing this software again. I've run the same exome sample with three different aligners. All of the BAM files have headers, but only 2 of the 6 analyses I've run produced output with headers. I'm still stumped as to why this is occurring. Thanks.



                  • #10
                    Yep. Me too. Exactly the same situation: illumina fastq to bam via bwa; samtools mpileup to vcf via varscan --output-vcf. Resulting vcf has no header.
                    Latest Ubuntu, latest samtools, latest varscan as of March 2015.



                    • #11
                      I still have not resolved this issue. Please keep us posted if you find a solution. Thank you



                      • #12
                        The problem is that --output-vcf 1 doesn't work for pileup2snp; it only works for mpileup2snp. The lack of documentation for this flag, plus the lack of error checking for command-line arguments, makes this hard to figure out.
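
Since the header goes missing for some runs and not others, it can help to check each output file before passing it downstream. A minimal sketch, assuming only the VCF convention that a file opens with a `##fileformat=` line and that the column header line starts with `#CHROM`; `vcf_header_status` is a hypothetical helper, not part of VarScan, and the header text in the example is illustrative:

```python
import io


def vcf_header_status(lines):
    """Return (has_fileformat, has_chrom_line) for an iterable of VCF lines."""
    has_fileformat = False
    has_chrom_line = False
    for index, line in enumerate(lines):
        if not line.startswith("#"):
            break  # first data record reached: the header section is over
        if index == 0 and line.startswith("##fileformat="):
            has_fileformat = True
        if line.startswith("#CHROM"):
            has_chrom_line = True
    return has_fileformat, has_chrom_line


# Headerless output: the file starts directly with a data record.
headerless = io.StringIO("1\t10061\t.\tT\tG\t.\tPASS\tADP=268\n")

# Illustrative header followed by the same record.
with_header = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample1\n"
    "1\t10061\t.\tT\tG\t.\tPASS\tADP=268\n"
)

headerless_status = vcf_header_status(headerless)
with_header_status = vcf_header_status(with_header)
print(headerless_status, with_header_status)
```

For a file on disk, pass the open handle instead, for example `with open("/myData/S12_snp.vcf") as fh: vcf_header_status(fh)`.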
