  • VarScan pileup2snp VarFreq calculation

    I have a question about how VarFreq is calculated in pileup2snp.

    I’ve got these results:

    Chrom 1
    Position 260300
    Ref G
    Cons R
    Reads1 3
    Reads2 6
    VarFreq 50%
    Strands1 1
    Strands2 2
    Qual1 42
    Qual2 22
    Pvalue 0.004525
    MapQual1 1
    MapQual2 1
    Reads1Plus 3
    Reads1Minus 0
    Reads2Plus 4
    Reads2Minus 2
    VarAllele A


    Shouldn’t VarFreq be #Reads2/(#Reads1 + #Reads2) = 6/(3+6) = 2/3?

    and at another position I've got:

    Reads1 0
    Reads2 8
    VarFreq 80%

    What am I missing here?
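The arithmetic in the question can be checked with a minimal sketch. It assumes the naive formula Reads2 / (Reads1 + Reads2) proposed above; it is not VarScan's documented calculation. The function names `naive_var_freq` and `implied_depth` are hypothetical. The second function only asks what denominator would reproduce the reported VarFreq if VarFreq were Reads2 divided by some depth, which is itself an assumption.

```python
def naive_var_freq(reads1: int, reads2: int) -> float:
    """Variant frequency in percent using only the two read counts.

    This is the formula proposed in the question, not necessarily
    the one VarScan applies.
    """
    total = reads1 + reads2
    if total == 0:
        raise ValueError("no reads at this position")
    return 100.0 * reads2 / total


def implied_depth(reads2: int, var_freq_pct: float) -> float:
    """Denominator that would yield the reported VarFreq,
    ASSUMING VarFreq = Reads2 / depth (an assumption, not a documented fact).
    """
    if var_freq_pct <= 0:
        raise ValueError("VarFreq must be positive")
    return reads2 * 100.0 / var_freq_pct


# The two positions from the question: (Reads1, Reads2, reported VarFreq in %)
positions = [(3, 6, 50.0), (0, 8, 80.0)]

for reads1, reads2, reported in positions:
    naive = naive_var_freq(reads1, reads2)
    depth = implied_depth(reads2, reported)
    print(f"Reads1={reads1} Reads2={reads2} naive={naive:.2f}% "
          f"reported={reported:.0f}% implied_depth={depth:.0f}")
```

Under these assumptions the naive frequencies are about 66.67% and 100%, while the reported values of 50% and 80% would correspond to denominators of 12 and 10 reads, that is, more reads than Reads1 + Reads2 at both positions. This does not show where the extra reads would come from.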

    Also, when I look at the BAM file in Tablet there are more than 100 reads. Why are there only 9 reads in the output from pileup2snp?

    Kind Regards
    Petter

  • #2
    snp freq calculation

    Can anyone comment on or answer my question above? What is the formula for VarFreq? It must involve measurements other than read counts alone (e.g. quality scores); otherwise the output is slightly wrong for many of the variants.

    regards Petter



    • #3
      I do not use the pileup2snp command, so I cannot help with your VarFreq problem.
      But concerning this part of your question:
      "Also, when I look at the BAM file in Tablet there are more than 100 reads. Why are there only 9 reads in the output from pileup2snp?"
      Did you run mpileup with the -A and especially the -B option to generate your pileups? (You can find several topics about this issue on SEQanswers.)
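For readers unfamiliar with these flags: in `samtools mpileup`, `-A` counts anomalous read pairs (orphans) and `-B` disables BAQ (per-base alignment quality) recalculation. The sketch below only builds such a command line. The file names `sample.bam` and `reference.fa` and the helper `build_mpileup_command` are hypothetical, and passing a reference with `-f` is an assumption about the setup.

```python
import shlex


def build_mpileup_command(bam_path: str, reference_path: str) -> list[str]:
    """Build a samtools mpileup command with -A and -B set.

    -A: do not skip anomalous read pairs
    -B: disable BAQ computation
    -f: reference FASTA (assumed to be available and indexed)
    """
    return ["samtools", "mpileup", "-A", "-B", "-f", reference_path, bam_path]


# Hypothetical file names, for illustration only.
cmd = build_mpileup_command("sample.bam", "reference.fa")

# Print the shell-quoted command; redirect its output to a pileup file
# and run it yourself, or pass `cmd` to subprocess.run.
print(shlex.join(cmd))
```

The command is only printed, not executed, so it can be inspected first and its output then redirected to the pileup file that is given to VarScan.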



      • #4
        mpileup2snp

        Thanks Jane,

        The -B and -A options for mpileup helped a lot.

        I still can't understand how VarFreq (the Freq column in mpileup2snp) is calculated, though. Can anyone answer this?
