
  • Help with a VarScan somatic bug report and interpreting mpileup2cns results

    I'm getting the following fatal error while running VarScan somatic.
    Bug report:
    # A fatal error has been detected by the Java Runtime Environment:
    #  SIGSEGV (0xb) at pc=0x00007f4a3bf04fe8, pid=21559, tid=139956786239248
    # JRE version: 6.0_17-b17
    # Java VM: OpenJDK 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 )
    # Derivative: IcedTea6 1.7.4
    # Distribution: Custom build (Thu Jul 29 16:49:18 EDT 2010)
    # Problematic frame:
    # V  []
    # If you would like to submit a bug report, please include
    # instructions how to reproduce the bug and visit:
    ---------------  T H R E A D  ---------------
    Current thread (0x00007f4a34012000):  GCTaskThread [stack: 0x00007f4a3a771000,0x00007f4a3a872000] [id=21561]
    siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000
    7fff101ff000-7fff10200000 r-xp 00000000 00:00 0                          [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
    VM Arguments:
    java_command: java -jar VarScan.jar somatic normal_tissue.mpileup infected_tissue.mpileup normal_infected_comparison --mpileup 1 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --output-vcf 1
    Launcher Type: SUN_STANDARD
    Environment Variables:
    log file:
    Min coverage:	8x for Normal, 6x for Tumor
    Min reads2:	2
    Min strands2:	1
    Min var freq:	0.08
    Min freq for hom:	0.75
    Normal purity:	1.0
    Tumor purity:	1.0
    Min avg qual:	15
    P-value thresh:
    Somatic p-value:	0.05
    Reading input from normal_tissue.mpileup
    Reading mpileup input...
    Parsing Exception on line:
    normal_tissue_seq1_630	286	A	40	^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.	?@CCCCCB@@C<@CC@@?CCCCC@@@@CCCCC<@@C@@<@
    The commands I ran were:
    samtools mpileup -f reference.fasta normal.bam > normal_tissue.mpileup
    samtools mpileup -f reference.fasta infected.bam > infected_tissue.mpileup
    java -jar VarScan.jar somatic normal_tissue.mpileup infected_tissue.mpileup normal_infected_comparison --mpileup 1 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --output-vcf 1

    Apart from that, below is part of the output from running this command:
    samtools mpileup -f reference.fasta normalA.bam infectedA.bam normalB.bam infectedB.bam | java -jar VarScan.jar mpileup2cns --min-var-freq 0.08 --p-value 0.05 --output-vcf 1 >cross-sample.varScan.vcf
    ##FORMAT=<ID=ADR,Number=1,Type=Integer,Description="Depth of variant-supporting bases on reverse strand (reads2minus)">
    #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample1 Sample2 Sample3 Sample4
    normal_tissue_seq1_630     101     .       A       .       .       PASS    ADP=0;WT=0;HET=0;HOM=0;NC=4     GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:0 ./.:.:1 ./.:.:0 ./.:.:0
    normal_tissue_seq5_580      532     .       A       .       .       PASS    ADP=1548;WT=4;HET=0;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:2147483647:1957:1820:1817:2:0.11%:5E-1:33:23:923:894:0:2    0/0:2147483647:1987:1894:1893:1:0.05%:7.5007E-1:34:17:1189:704:0:1
    normal_tissue_seq10_950      533     .       C       T       .       PASS    ADP=1611;WT=3;HET=1;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:303:1969:1843:1820:23:1.25%:1.3987E-6:33:24:880:940:4:19    0/0:2147483647:1981:1916:1908:8:0.42%:1.9421E-2:35:23:1162:746:2:6
    I'm not sure how to interpret the mpileup2cns output.
    Thanks for any advice.

  • #2
    Which version of VarScan are you using?
    I never noticed this option --mpileup. What is it for?


    • #3
      I used the latest version of VarScan.
      mpileup has now replaced pileup.
      I'm able to run VarScan now.
      The above error was caused by a problem with my Java version.

      Apart from that, below is one of the output records from VarScan somatic:
      read9786_577      111     .       G       A       .       PASS    DP=951;SS=3;SSC=32;GPV=1E0;SPV=5.8927E-4        GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:8:4:2:33.33%:3,1,1,1      0/0:.:943:859:4:0.46%:611,248,2,2
      As far as I know, 8 is the read depth, 4 is the number of reference-supporting reads, and 2 is the number of variant-supporting reads.
      I'm wondering why the sum of the reference and variant read counts is less than the total depth.
      Is it because only 6 of the bases passed the base quality threshold?
      This pattern appears quite frequently in my data set.

      Also, would you mind sharing a simple example of how to interpret the genotype in the output?
      As far as I know, 0/0 = homozygous reference, 1/1 = homozygous alternate, 0/1 = heterozygous, and ./. = no call.
      But I'm still a bit unclear on how to distinguish the first three cases, especially "1/1".


      • #4

        I'm glad you figured out the Java JRE issue behind that exception. As for your second question, the differences in read depth are because of the minimum base quality requirement. DP reflects the SAMtools depth (no base quality requirement), but RD/AD are VarScan's readcounts (by default, qual>15).
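        To make the arithmetic concrete, here is a minimal Python sketch that splits the two sample columns from the record quoted in post #3 and compares DP with RD + AD. The function name `summarize` and the key `not_counted` are made up for illustration. The sketch assumes the gap between DP and RD + AD consists of bases VarScan did not count (for example, bases below the quality cutoff), and that FREQ is AD / (RD + AD), which is consistent with the values in that record.

```python
# FORMAT string and sample columns copied from the VarScan somatic record in post #3.
FORMAT = "GT:GQ:DP:RD:AD:FREQ:DP4"
SAMPLES = {
    "sample1": "0/1:.:8:4:2:33.33%:3,1,1,1",
    "sample2": "0/0:.:943:859:4:0.46%:611,248,2,2",
}


def summarize(format_str, sample_str):
    """Pair FORMAT keys with one sample's values and compare DP with RD + AD.

    `summarize` and `not_counted` are illustrative names, not VarScan terms.
    """
    fields = dict(zip(format_str.split(":"), sample_str.split(":")))
    dp, rd, ad = int(fields["DP"]), int(fields["RD"]), int(fields["AD"])
    counted = rd + ad  # bases VarScan counted as reference or variant
    return {
        "DP": dp,
        "RD": rd,
        "AD": ad,
        # Bases in the raw depth that appear in neither RD nor AD.
        "not_counted": dp - counted,
        # Assumption: FREQ = AD / (RD + AD), expressed as a percentage.
        "freq_percent": round(100 * ad / counted, 2) if counted else None,
    }


for name, column in SAMPLES.items():
    print(name, summarize(FORMAT, column))
```

        For sample 1 this gives 8 - (4 + 2) = 2 uncounted bases and a frequency of 2 / 6 = 33.33%, which matches the FREQ value in the record.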

        I'm confused by your question about the genotype... its interpretation is spelled out quite clearly in the VCF specification. In your example:

        Sample 1 is 0/1, or heterozygous-variant, with genotype GA.
        Sample 2 is 0/0, or wildtype, with genotype GG.

        If there were a third sample that was 1/1, its genotype would be AA.
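        The same mapping can be sketched in a few lines of Python. The helper `decode_genotype` is a hypothetical name, not part of VarScan or any VCF library, and the sketch only handles single-base alleles like the G/A example above: each number in GT is an index into the list [REF, ALT1, ALT2, ...], and "." means no call.

```python
def decode_genotype(gt, ref, alt):
    """Translate a VCF GT string such as '0/1' into allele bases.

    Illustrative helper for single-base alleles; index 0 is REF and
    indices 1, 2, ... are the comma-separated ALT alleles.
    """
    alleles = [ref] + [a for a in alt.split(",") if a != "."]
    indices = gt.replace("|", "/").split("/")
    if "." in indices:
        return None  # no call, written ./. in VCF
    return "".join(alleles[int(i)] for i in indices)


# REF=G and ALT=A, as in the record from post #3.
for gt in ("0/1", "0/0", "1/1", "./."):
    print(gt, decode_genotype(gt, "G", "A"))
```

        With REF=G and ALT=A this returns GA for 0/1, GG for 0/0, AA for 1/1, and None for ./.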


        • #5
          Hi Edge,

          In relation to the genotypes, I am using VarScan v2.3.6. I found several lines in which the genotypes are marked as 1/1 while both samples are equal to the reference (0/0).

          Here are a few examples:
          chr1 721668 . C . PASS DP=168;SS=0;SSC=0;GPV=1E0;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:78:78:0:0%:34,44,0,0 1/1:.:90:90:0:0%:38,52,0,0

          REFERENCE: chr1 721687 . C . PASS DP=139;SS=0;SSC=0;GPV=1E0;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:71:71:0:0%:22,49,0,0 1/1:.:68:67:0:0%:22,45,0,0

          Do you know why these genotypes have been classified as 1/1?
          Thank you in advance,

