  • VarScan.v2.3.1 vs VarScan.v2.2.*

    Hello,

    Yesterday I noticed the new release of VarScan2. Naturally, I tried it on two samples, changing only the version.
    I wonder what the changes are. I read "VarScan v2.3.1 released with bug fixes", but which ones?
    I also read about the change in VCF compatibility. For my purposes, I do not use the VCF format; I am more interested in the bug fixes.

    Here are the differences I got between the two versions on the same two samples:

    java -Xmx15g -jar VarScan.v2.3.1.jar somatic /data/cd3.mpileup /data/159cd14.mpileup --output-snp /data/VarScan2/varscan.snp --output-indel /data/VarScan2/varscan.indel --min-coverage 10 --min-var-freq 0.2 --min-freq-for-hom 0.75 --normal-purity 0.8 --tumor-purity 1 --p-value 0.05 --somatic-p-value 0.05 --strand-filter 0 --min-avg-qual 20 --min-strands2 0 --min-reads2 0
    134948529 positions in tumor
    134724809 positions shared in normal
    87521347 had sufficient coverage for comparison
    87408822 were called Reference
    178 were mixed SNP-indel calls and filtered
    11041 were removed by the strand filter
    84099 were called Germline
    8338 were called LOH
    8284 were called Somatic
    585 were called Unknown
    0 were called Variant
    java -Xmx15g -jar VarScan.v2.2.10.jar somatic /data/cd3.mpileup /data/159cd14.mpileup --output-snp /data/VarScan2/varscan.snp --output-indel /data/VarScan2/varscan.indel --min-coverage 10 --min-var-freq 0.2 --min-freq-for-hom 0.75 --normal-purity 0.8 --tumor-purity 1 --p-value 0.05 --somatic-p-value 0.05 --strand-filter 0 --min-avg-qual 20 --min-strands2 0 --min-reads2 0
    134948529 positions in tumor
    134724809 positions shared in normal
    87521679 had sufficient coverage for comparison
    87411857 were called Reference
    117 were mixed SNP-indel calls and filtered
    10460 were removed by the strand filter
    82914 were called Germline
    7907 were called LOH
    8066 were called Somatic
    358 were called Unknown
    0 were called Variant
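To make the comparison between the two runs easier to read, the summary counts can be subtracted label by label. The following is only a minimal sketch: the function names `parse_summary` and `diff_summaries` are hypothetical helpers, not part of VarScan, and the input is simply the summary text copied from the two runs above.

```python
import re

# Summary lines copied from the two runs above.
SUMMARY_V231 = """\
87521347 had sufficient coverage for comparison
87408822 were called Reference
178 were mixed SNP-indel calls and filtered
11041 were removed by the strand filter
84099 were called Germline
8338 were called LOH
8284 were called Somatic
585 were called Unknown
"""

SUMMARY_V2210 = """\
87521679 had sufficient coverage for comparison
87411857 were called Reference
117 were mixed SNP-indel calls and filtered
10460 were removed by the strand filter
82914 were called Germline
7907 were called LOH
8066 were called Somatic
358 were called Unknown
"""


def parse_summary(text):
    """Map each summary label (hypothetical helper) to its integer count."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def diff_summaries(new, old):
    """Return new minus old for every label present in both summaries."""
    return {label: new[label] - old[label] for label in new if label in old}


deltas = diff_summaries(parse_summary(SUMMARY_V231), parse_summary(SUMMARY_V2210))
for label, delta in deltas.items():
    print(f"{delta:+8d}  {label}")
```

With these numbers, v2.3.1 reports 332 fewer positions with sufficient coverage, 581 more positions removed by the strand filter, and 218 more Somatic calls than v2.2.10.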
    When I saw that the number of positions with sufficient coverage had decreased, I thought that the bug mentioned here: seqanswers.com/forums/showthread.php?t=20791 (coverage criteria not satisfied) had been fixed. Unfortunately, that doesn't seem to be the case: with the new version, I still get results such as:

    chr1 24334459 A C 6 0 0% A 3 1 25% M Somatic 1.0 0.39999999999999963 0 3 0 1
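A post-hoc filter is one way to handle rows like the one above. The sketch below assumes that coverage can be approximated as reads1 + reads2 for each sample, which may not match how VarScan counts coverage internally, and that the columns follow the order of the somatic output header shown later in this post. `passes_coverage` is a hypothetical helper name.

```python
MIN_COVERAGE = 10  # same value as --min-coverage in the commands above


def passes_coverage(row, min_coverage=MIN_COVERAGE):
    """Return True if both samples reach min_coverage (hypothetical helper).

    Assumption: depth is approximated as reads1 + reads2 per sample.
    Column order: chrom position ref var normal_reads1 normal_reads2
    normal_var_freq normal_gt tumor_reads1 tumor_reads2 ...
    """
    fields = row.split()
    normal_depth = int(fields[4]) + int(fields[5])
    tumor_depth = int(fields[8]) + int(fields[9])
    return normal_depth >= min_coverage and tumor_depth >= min_coverage


row = "chr1 24334459 A C 6 0 0% A 3 1 25% M Somatic 1.0 0.39999999999999963 0 3 0 1"
print(passes_coverage(row))  # normal depth 6, tumor depth 4
```

For this row the check fails for both samples (6 and 4 supporting reads against a threshold of 10), so the position would be dropped by such a filter.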
    Then I wanted to see whether the strand filter is applied in this new version regardless of the parameter value. As you can see, I did not ask for the strand filter (--strand-filter 0) because I don't want it; nevertheless, 11041 positions were removed by the strand filter.

    Finally, there is still this issue: for indels, tumor_reads1 differs from tumor_reads1_plus + tumor_reads1_minus:
    chrom position ref var normal_reads1 normal_reads2 normal_var_freq normal_gt tumor_reads1 tumor_reads2 tumor_var_freq tumor_gt somatic_status variant_p_value somatic_p_value tumor_reads1_plus tumor_reads1_minus tumor_reads2_plus tumor_reads2_minus
    chr1 3801133 A -T 12 0 0% A 8 3 27,27% */-T Somatic 1.0 0.09316770186335391 0 11 0 3
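The inconsistency can be detected automatically by comparing each tumor read count with the sum of its per-strand counts. This is a minimal sketch using the header line above; `strand_mismatches` is a hypothetical helper, and the rows are split on whitespace as they appear in this post.

```python
HEADER = (
    "chrom position ref var normal_reads1 normal_reads2 normal_var_freq "
    "normal_gt tumor_reads1 tumor_reads2 tumor_var_freq tumor_gt "
    "somatic_status variant_p_value somatic_p_value tumor_reads1_plus "
    "tumor_reads1_minus tumor_reads2_plus tumor_reads2_minus"
).split()


def strand_mismatches(row):
    """List (column, total, plus + minus) where the two disagree (hypothetical helper)."""
    record = dict(zip(HEADER, row.split()))
    problems = []
    for allele in ("reads1", "reads2"):
        total = int(record[f"tumor_{allele}"])
        by_strand = (int(record[f"tumor_{allele}_plus"])
                     + int(record[f"tumor_{allele}_minus"]))
        if total != by_strand:
            problems.append((f"tumor_{allele}", total, by_strand))
    return problems


indel_row = ("chr1 3801133 A -T 12 0 0% A 8 3 27,27% */-T Somatic 1.0 "
             "0.09316770186335391 0 11 0 3")
print(strand_mismatches(indel_row))
```

For the indel row this reports tumor_reads1 as 8 against a per-strand sum of 11. In this particular row, 11 also equals tumor_reads1 + tumor_reads2, which may or may not be a coincidence.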
    My questions are:
    1. Which bugs have been fixed in this new release?
    2. Do you know a way to avoid the problems I am encountering? (Especially the strand filter, which is a big problem for me; the coverage criterion is easy to handle.)
    3. For Dan Koboldt: do you intend to fix some of these issues? Or am I doing something wrong to get such results?


    Thank you in advance for your help, and thank you, Dan Koboldt, for maintaining your tool, which has plenty of advantages, even though I have only mentioned issues here.
    Jane
    Last edited by Jane M; 08-17-2012, 01:24 AM.

  • #2
    Any ideas? Any suggestions?



    • #3
      Jane,

      I'm sorry for my delay in replying - I just came across your post. Thank you for being a longtime and active VarScan user!

      I'm trying to include release notes with new VarScan releases that precisely detail what's been changed. The version you asked about (v2.3.1) included that. As you're aware, there were a number of improvements to VCF compatibility, but there were a few specific bug fixes that might affect your work:
      1.) I corrected a bug in the indel-filtering functionality of the "filter" command.
      2.) I made a global fix for "locale parsing" errors encountered when floating-point numbers are represented with a comma (3,1415926) instead of a period (3.1415926); this occasionally happens in European locales.
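Downstream scripts that read VarScan output can hit the same comma-versus-period problem, for example with a variant frequency written as 27,27%. The sketch below shows one way to parse such a field in Python; it is not VarScan's own fix (VarScan is a Java program), and `parse_var_freq` is a hypothetical helper name.

```python
def parse_var_freq(text):
    """Convert a percentage such as '27,27%' or '27.27%' to a fraction.

    Assumption: the comma is only ever used as a decimal separator,
    never as a thousands separator.
    """
    cleaned = text.strip().rstrip("%").replace(",", ".")
    return float(cleaned) / 100.0


for raw in ("27,27%", "27.27%", "0%"):
    print(raw, "->", parse_var_freq(raw))
```

Normalizing the separator before calling `float` keeps the parsing independent of the locale in which the output file was written.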

      In v2.3.2, which was released shortly afterward, I also corrected an issue with the base-quality parsing in reads containing indels.

      Please try to post to the VarScan Help forum if you encounter future issues, as I get e-mailed immediately when issues are posted there:

      http://sourceforge.net/projects/vars.../forum/1073559



      • #4
        Dear Dan,
        Thank you for your answer! I will use the VarScan Help forum in the future to get answers faster.
