  • shuang
    Senior Member
    • Jul 2011
    • 100

    SNP base calling

    My SNP data are from Sanger sequencing. Multiple samples cover varied regions, not necessarily the same fragments. I performed the alignment with bwasw and the pileup with samtools.

    What would be the differences between
    1. running pileup on all samples together, and
    2. running pileup on one sample at a time?

    Also, would the QUAL score, DP, and AC be affected dramatically?
  • volks
    Member
    • Jun 2010
    • 80

    #2
    mpileup doesn't give you any output at positions where there is no coverage, so if you want to compare different samples it might be more convenient to generate the pileup for all of them together.

    • shuang
      Senior Member
      • Jul 2011
      • 100

      #3
      I actually do not need to know the position when a sample doesn't cover that base.

      Other than that, is there anything else that would make a difference?

      • swbarnes2
        Senior Member
        • May 2008
        • 910

        #4
        If a SNP is called in one sample, and not another, it is helpful to look at the other sample, to determine if that other sample really is wt, or if coverage was just too low for it to make the same SNP call, and doing mpileup together helps for that. Unfortunately, what's really helpful is the DP4 values, and mpileup combines them all, which can make it harder to assess the likelihood of each sample. Yes, you get a GQ, but the coverage is helpful as well.

        • shuang
          Senior Member
          • Jul 2011
          • 100

          #5
          I also noticed that the QUAL score tends to be much lower in single-sample analysis than in multi-sample analysis. Why is that?

          • lh3
            Senior Member
            • Feb 2008
            • 686

            #6
            If you want to compare between samples, always pool samples together (i.e. generate mpileup across all samples). Mpileup skips only those sites where there is no coverage across ALL samples. On the other hand, swbarnes2 has a point that DP4 is combined across samples. If you need that information, you can pool samples first to find the sites you are interested in and then run single-sample pileup to get DP4 for each sample.
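
To make the DP4 step concrete, here is a minimal Python sketch of reading DP4 from the INFO column of a single-sample VCF record. In samtools/bcftools output, DP4 holds four counts of high-quality bases: reference forward, reference reverse, alternate forward, alternate reverse. The names `DP4` and `parse_dp4` and the example INFO string are made up for illustration and are not part of samtools.

```python
from typing import NamedTuple, Optional


class DP4(NamedTuple):
    """High-quality base counts: ref fwd, ref rev, alt fwd, alt rev."""
    ref_fwd: int
    ref_rev: int
    alt_fwd: int
    alt_rev: int

    @property
    def depth(self) -> int:
        # Total high-quality depth is the sum of the four counts.
        return sum(self)

    @property
    def alt_fraction(self) -> float:
        # Fraction of high-quality bases supporting the alternate allele.
        total = self.depth
        return (self.alt_fwd + self.alt_rev) / total if total else 0.0


def parse_dp4(info: str) -> Optional[DP4]:
    """Return the DP4 counts from a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "DP4":
            parts = value.split(",")
            if len(parts) == 4:
                return DP4(*(int(p) for p in parts))
    return None


# Hypothetical INFO string, not taken from real data.
example_info = "DP=12;DP4=1,0,6,5;MQ=60"
dp4 = parse_dp4(example_info)
print(dp4, dp4.depth, round(dp4.alt_fraction, 2))
```

Running this on each sample's own VCF at the sites found in the pooled call would give per-sample counts, which is the information the combined DP4 hides.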

            • shuang
              Senior Member
              • Jul 2011
              • 100

              #7
              When I tried to pile up multiple samples together, even a sample whose sequence/read did not cover the SNP base was reported as het (1/0). That confused our conclusions.

              How do I avoid that? Or how can I tell whether a het call is real or just reflects no coverage?

              Also, how do I use the DP4 value?
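
One possible way to separate real het calls from uncovered samples in a multi-sample VCF is to check each sample's own depth before trusting its genotype. The sketch below assumes the FORMAT column carries both GT and a per-sample DP field, which depends on how the VCF was generated. The function name `classify_samples`, the sample names, and the example record are hypothetical.

```python
from typing import Dict, List


def classify_samples(vcf_line: str, sample_names: List[str]) -> Dict[str, str]:
    """Map each sample to its genotype, or 'no coverage' when depth is zero.

    Assumes a tab-separated VCF data line whose FORMAT column (column 9)
    includes GT and DP; samples start at column 10.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    format_keys = cols[8].split(":")
    calls = {}
    for name, sample_field in zip(sample_names, cols[9:]):
        values = dict(zip(format_keys, sample_field.split(":")))
        depth = values.get("DP", ".")
        genotype = values.get("GT", ".")
        # A genotype without supporting reads is not evidence of a het.
        if depth in (".", "0") or genotype.startswith("."):
            calls[name] = "no coverage"
        else:
            calls[name] = genotype
    return calls


# Hypothetical record: sample_b has a het genotype but zero depth.
record = "chr1\t100\t.\tA\tG\t50\t.\tDP=10;DP4=2,2,3,3\tGT:DP\t0/1:10\t0/1:0"
result = classify_samples(record, ["sample_a", "sample_b"])
print(result)
```

If the VCF has no per-sample DP, the same check can be done by running a single-sample pileup at the site and looking at that sample's DP4 total instead.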

              • aslihan
                Member
                • Jun 2011
                • 23

                #8
                How do I use DP4 values?
