  • obk
    Member
    • Dec 2012
    • 12

    Why not variant call before pileup?

    (I'm very new to this particular field.)

    From what I've read, it seems all variant-calling workflows do: alignment -> pileup -> variant call -> filtering/etc., regardless of whether the data is from whole-genome or exome sequencing, and I understand this approach is valid for some applications/diseases. But for cancer applications, where a tumor (or tumors) may be heterogeneous and carry multiple mutation profiles (e.g. within an exon), is it not more valid to call variants on each read (cluster) and then do a 'pileup' on the calls?

    For example:

    Code:
    ref: ...AACGTG...
    
         ...AACGTG... 800x *clusters* had this sequence
         ...AACGAG... 100x *clusters* had this sequence
         ...ATCGTG... 100x *clusters* had this sequence
    
    The above data (assuming 100% confidence in the base calls) would be interpreted as:
         ...AACGTG... 90% wildtype
         ...ATCGAG... 10% mutant with two mutations
    	 
    ... when, in fact, it is two separate mutations at 10% each.
    If anyone can point me to any papers/etc that discuss this, it is much appreciated.
    Last edited by obk; 01-23-2014, 03:40 PM.
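To make the example in the post above concrete, here is a minimal Python sketch that tallies the same three read clusters in two ways: per site (what a pileup sees) and per read (what a haplotype-level tally sees). The variable names are illustrative only, and the sketch assumes gapless reads covering the whole 6-base window with perfect base calls, as the post does.

```python
from collections import Counter

REF = "AACGTG"
# Read clusters from the example: sequence -> number of clusters
clusters = {"AACGTG": 800, "AACGAG": 100, "ATCGTG": 100}
total = sum(clusters.values())

def variants_in(seq, ref=REF):
    """Return the mismatches of one read as (position, ref_base, alt_base) tuples."""
    return tuple((pos, r, b) for pos, (r, b) in enumerate(zip(ref, seq)) if r != b)

# Per-site (pileup-style) view: each variant is counted independently of the others.
site_counts = Counter()
for seq, n in clusters.items():
    for variant in variants_in(seq):
        site_counts[variant] += n
site_vaf = {variant: count / total for variant, count in site_counts.items()}

# Per-read view: the full set of variants on each read is kept together.
hap_counts = Counter()
for seq, n in clusters.items():
    hap_counts[variants_in(seq)] += n

double_mutant = ((1, "A", "T"), (4, "T", "A"))
print(site_vaf)                   # two sites, each at a 10% variant fraction
print(hap_counts[double_mutant])  # 0: no read carries both variants
```

The per-site fractions alone cannot tell "one clone with two mutations" apart from "two clones with one mutation each"; the per-read tally can, because it keeps the variants of each read together.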
  • TiborNagy
    Senior Member
    • Mar 2010
    • 329

    #2
    No, because if you call variants from a single read, you cannot distinguish read errors from real variants.
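A rough way to see why one read is not enough, and why many agreeing reads are different, is a binomial calculation. The sketch below is only an illustration under loud assumptions that are not from the thread: a per-base error rate of 1%, errors independent between reads, and errors spread evenly over the three wrong bases. Systematic or PCR-introduced errors violate the independence assumption, so real data will look worse than this.

```python
from math import exp, lgamma, log, log1p

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1, summed via log-space terms."""
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log1p(-p))
        total += exp(log_pmf)
    return total

per_base_error = 0.01            # assumed error rate, not a measured value
p_specific = per_base_error / 3  # chance of one particular wrong base at one site

# One read: the same mismatch is seen with probability p_specific by error alone.
p_single = binom_tail(1, 1, p_specific)

# 1000 reads: chance that at least 100 show the same mismatch by error alone.
p_hundred = binom_tail(100, 1000, p_specific)

print(p_single, p_hundred)
```

Under these assumptions a lone mismatch is entirely plausible as an error, while 100 matching mismatches out of 1000 reads is vanishingly unlikely to arise from independent errors.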


    • SNPsaurus
      Registered Vendor
      • May 2013
      • 525

      #3
      I did see a talk at PAG XXII where the person called variants from a pileup and then went back to the individual reads to fit the variants into haplotypes enforced by the reads. Of course, I can't recall the talk, or if it was even a new thing! But that would give you the results you want.
      Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com


      • obk
        Member
        • Dec 2012
        • 12

        #4
        Dear TiborNagy
        I understand I wouldn't want to make a call based on a single read, but in the simple example, you'd call the two mutations 100 times each, which I think would give me some confidence that they are not erroneous reads... I think what I'm wondering is: if you have enough confidence in the base calling technology (or have enough coverage per unique read (like in the example)), what is the difference between:
        1) pileup reads to get consensus read -> variant call -> filter: is it real? -> real SNVs
        2) variant call individual reads -> 'pileup' variant calls -> filter: is it real? -> real SNVs
        (this question may be specific to amplicon sequencing...)
        Thanks.


        • SNPsaurus
          Registered Vendor
          • May 2013
          • 525

          #5
          You can do that. When we do genotyping of populations, we get reads along the lines of what you describe (mixed haplotypes). So one way we look at it is to align reads, track the variants of each read, then filter. The one difference is that our reads are all in sync (a stack of 100 reads at position 100,000, then a stack of 100 reads at position 200,000, etc.). You would have some reads that end in between variants, leading to a little more work interpreting that.
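The "track the variants of each read" idea, including reads that end between variants, can be sketched as follows. This is a toy illustration, not the poster's pipeline: the `phase_reads` helper, the gapless `(start, sequence)` read format, and the two variant sites (taken from the example earlier in the thread) are all assumptions.

```python
from collections import Counter

# Candidate variant sites, e.g. from a pileup-based caller: position -> alt base
SITES = {1: "T", 4: "A"}

def phase_reads(reads, sites=SITES):
    """Tally the bases each read shows at the candidate sites.

    reads: iterable of (start, seq) gapless alignments, 0-based start.
    Returns (full, partial): counters keyed by tuples of (position, base).
    'full' holds reads covering every site, 'partial' reads covering only some.
    """
    full, partial = Counter(), Counter()
    for start, seq in reads:
        end = start + len(seq)
        observed = tuple(
            (pos, seq[pos - start]) for pos in sorted(sites) if start <= pos < end
        )
        if len(observed) == len(sites):
            full[observed] += 1
        elif observed:
            partial[observed] += 1
    return full, partial

reads = [
    (0, "AACGTG"),  # spans both sites, reference at both
    (0, "ATCGTG"),  # spans both sites, variant at position 1 only
    (0, "AACGAG"),  # spans both sites, variant at position 4 only
    (0, "ATC"),     # ends between the sites: informative for position 1 only
    (3, "GAG"),     # starts between the sites: informative for position 4 only
]
full, partial = phase_reads(reads)
print(full)
print(partial)
```

Only the reads in `full` can say whether two variants sit on the same molecule; the `partial` reads still support individual sites but carry no phase information, which is the extra interpretation work mentioned above.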


          • obk
            Member
            • Dec 2012
            • 12

            #6
            Thanks for your comments SNPsaurus.
            Do you have strategies to do any quantitative analysis based on the stack of reads? If it's amplicon sequencing that you're doing, then I imagine it is difficult to account for PCR duplicates.


            • SNPsaurus
              Registered Vendor
              • May 2013
              • 525

              #7
              PCR duplicates are an issue, since we can't use different start and stop locations as a way to distinguish independent events. We were mostly looking for the presence of haplotypes in the populations so the precise level wasn't a concern. I was impressed by the "call from pileup then use the reads for phasing" approach I saw at the meeting because it did allow the use of common pipelines up until the last step and I think using common tools is increasingly important.
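The point about start and stop locations can be illustrated with a deliberately naive duplicate filter. The sketch below assumes reads are given as hypothetical `(start, end, sequence)` tuples and treats any two reads with the same coordinates as duplicates; real duplicate-marking tools use more information than this.

```python
from collections import Counter

def group_by_coordinates(reads):
    """Count reads per (start, end) pair, the key a naive duplicate filter would use."""
    return Counter((start, end) for start, end, _seq in reads)

# Amplicon data: every read starts and stops at the primer-defined positions.
amplicon_reads = [(100, 106, "AACGTG")] * 8 + [(100, 106, "AACGAG")] * 2

# Shotgun-style data: random fragmentation gives varied start and stop positions.
shotgun_reads = [(97, 140, "..."), (100, 151, "..."), (103, 149, "...")]

amplicon_groups = group_by_coordinates(amplicon_reads)
shotgun_groups = group_by_coordinates(shotgun_reads)

print(len(amplicon_groups))  # 1: all ten reads look like copies of one fragment
print(len(shotgun_groups))   # 3: each read looks like an independent fragment
```

With amplicons, coordinate-based filtering would collapse the whole stack to a single read and discard the minority haplotype along with the true duplicates, so it cannot be used to estimate how many independent molecules support each haplotype.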
