  • kweber2
    Member
    • Aug 2010
    • 10

    Getting map qualities with samtools mpileup?

    I've been using the -s option of samtools "pileup" to output the mapping qualities for each read at each position as a last column in the pileup file.

    I know that the "pileup" command is deprecated, and I need to switch to the "mpileup" command, but "mpileup" doesn't seem to have an option for including the map qualities. Does anyone have any advice? I know that the map quality is still reported once at the beginning of each read, but it seems like it will be tricky to map that quality to the correct base in each subsequent position. I'm hoping there's an easier alternative...

    Thanks for any help!

    Kris Weber
    Research Assistant
    Noble Lab, Department of Genome Sciences
    University of Washington
  • FCAlive
    Junior Member
    • Jun 2011
    • 2

    #2
    I have the same question

    bump


    • kweber2
      Member
      • Aug 2010
      • 10

      #3
      And I still have the same question too, if anyone has any answers...


      • FCAlive
        Junior Member
        • Jun 2011
        • 2

        #4
        Bumpity bump bump bump bump


        • earonesty
          Member
          • Mar 2011
          • 52

          #5
          mpileup, pileup, Bio::DB::Sam

          mpileup is *supposed* to handle all the problems with mapping quality internally with the BAQ algorithm. I think the resulting entries in the pileup have been quality-adjusted so that the phred scores should reflect the BAQ-adjusted values, right? Thus mpileup's output should be "usable without needing the mapping quality".

          Am I reading the mpileup help page incorrectly? It skips bases with BAQ quality < 13, etc. The final pileup does not reflect the original reads exactly, but rather a BAQ-corrected version.



          I've generally found good results, better than trying to handle mapping-quality issues on my own ... but it seems odd to handle all that, and then *remove* the option that would make verification easier.

          Bio::DB::Sam's pileup() call is a great tool for walking a pileup while having access to ALL possible information, and it's much faster than running samtools pileup and then parsing the results.
          Last edited by earonesty; 06-29-2011, 08:26 AM. Reason: add a title


          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            Originally posted by earonesty:
            mpileup is *supposed* to handle all the problems with mapping quality internally with the BAQ algorithm.
            BAQ adjusts the base qualities not the mapping qualities. The mapping qualities are set by the aligner/mapper.


            • earonesty
              Member
              • Mar 2011
              • 52

              #7
              I thought the algorithm adjusted base qualities based on mapping qualities, i.e., it does it so you don't have to.


              • nilshomer
                Nils Homer
                • Nov 2008
                • 1283

                #8
                Check out the paper: http://bioinformatics.oxfordjournals...s.btr076.short

                Anyhow, it uses the base qualities, not the mapping qualities. It resolves and weights the local ambiguities. Smith-Waterman can be modeled as an HMM, and BAQ runs the forward/backward algorithms on it to compute the posterior probability that a query base and a target base are aligned, taking all possible local alignments into account. This posterior is then converted to a base quality.
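The last step described above, turning a posterior into a quality, can be sketched as follows. This is a minimal illustration and not the samtools implementation: the function names, the cap of 60, and the rule of never raising a base above its original quality are assumptions made for the example. The forward/backward pass that produces the posterior is not shown.

```python
import math


def posterior_to_phred(p_aligned: float, cap: int = 60) -> int:
    """Phred-scale the probability that a base is MISaligned (1 - posterior).

    `cap` is a hypothetical upper bound so that p_aligned == 1.0 does not
    produce an infinite quality.
    """
    p_error = 1.0 - p_aligned
    if p_error <= 0.0:
        return cap
    q = -10.0 * math.log10(p_error)
    return min(cap, int(q + 0.5))  # round to the nearest integer


def adjusted_base_quality(base_qual: int, p_aligned: float) -> int:
    """Assumed rule: the adjusted quality never exceeds the original one."""
    return min(base_qual, posterior_to_phred(p_aligned))


# A confidently aligned base keeps its original quality ...
print(adjusted_base_quality(30, 0.9999))  # 30
# ... while an ambiguously aligned base is pulled down.
print(adjusted_base_quality(30, 0.90))    # 10
```

Note that the mapping quality of the read never appears in this calculation, which is the point made in the post above.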


                • ninjapanda57
                  Junior Member
                  • May 2011
                  • 3

                  #9
                  Solution

                  The MQ field in the INFO column of the VCF output does give the mapping quality (as a per-site value). This is how I extract it in Perl to work with the BED file.

                  For each line ($snp), all you do is:

                  my @snp_array = $snp =~ /^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+\S+\s+(\S+)\s+/;
                  my $mpileup_check = $snp_array[2];   # third column; '.' is the ID field of mpileup VCF output
                  my ($chr, $pos, $id, $ref, $var, $cns_qual, $rd_depth, $map_qual, $snp_qual, @extra);
                  if ($mpileup_check eq '.') {
                      # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
                      ($chr, $pos, $id, $ref, $var, $cns_qual, @extra) = split("\t", $snp);
                      # $extra[1] is the INFO column, e.g. "DP=14;...;MQ=37"
                      ($rd_depth) = $extra[1] =~ /\bDP=(\d+)/;
                      ($map_qual) = $extra[1] =~ /\bMQ=(\d+)/;
                  } else {
                      # other input formats are handled here (not shown)
                  }

                  That should work; that's just in Perl, though.
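For readers not working in Perl, the same extraction can be sketched in Python. This is only an illustration: the function names and the example VCF line are made up, and the code assumes tab-separated VCF with INFO in the eighth column and `DP`/`MQ` written as `KEY=VALUE` pairs.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string such as 'DP=14;MQ=37;INDEL' into a dict.

    Flag entries without '=' (e.g. INDEL) are stored as True.
    """
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def depth_and_mapq(vcf_line: str):
    """Return (chrom, pos, depth, mapping quality) for one VCF data line.

    Depth or mapping quality is None when the INFO column lacks DP or MQ.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    info = parse_info(cols[7])  # INFO is the eighth VCF column
    depth = int(info["DP"]) if "DP" in info else None
    mapq = float(info["MQ"]) if "MQ" in info else None
    return cols[0], int(cols[1]), depth, mapq


# Made-up example line, not real data.
example = "chr1\t1042\t.\tA\tG\t57.0\t.\tDP=14;AF1=1;MQ=37\tGT:PL\t1/1:90,42,0"
print(depth_and_mapq(example))  # ('chr1', 1042, 14, 37.0)
```

Splitting INFO on `;` and `=` rather than pattern-matching avoids accidentally matching a longer key that merely contains `DP` or `MQ`.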

                  Hope this helps.


                  • earonesty
                    Member
                    • Mar 2011
                    • 52

                    #10
                    Originally posted by nilshomer:
                    Check out the paper: http://bioinformatics.oxfordjournals...s.btr076.short

                    Anyhow, it uses the base qualities, not the mapping qualities. It resolves and weights the local ambiguities. Smith-Waterman can be modeled as an HMM, and BAQ runs the forward/backward algorithms on it to compute the posterior probability that a query base and a target base are aligned, taking all possible local alignments into account. This posterior is then converted to a base quality.
                    Alignment quality is, effectively, summarized in the "mapping quality" (MAPQ) statistic. BAQ is a per-base alignment quality rather than a single per-read statistic. The HMM in BAQ is used to adjust base qualities, but the reason for the adjustment is the alignment quality of each base.

                    The resulting pileup has lowered qualities on those bases whose alignment was ambiguous.

                    Using both MAPQ and BAQ is thus somewhat redundant.

                    That's all I was trying to point out.


                    • hstehr
                      Junior Member
                      • Apr 2011
                      • 1

                      #11
                      -s option for mpileup does exist in samtools 0.1.17

                      In samtools 0.1.17 the mpileup command does indeed have the -s option, even though it is not mentioned in the manpage. It is listed under output options when typing 'samtools mpileup'. In version 0.1.16, the -s option is not available for mpileup, but only for pileup.

                      I hope this answers the original question.

                      However, I do not understand the encoding of the mapping qualities in the -s output. It does not seem to be in the usual Sanger/Phred encoding. Can anyone point me to where this is documented?
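On the encoding question, one way to check is to decode the extra column and compare the numbers with the MAPQ values in the BAM file. The sketch below assumes that each character in the mapping-quality column is the read's MAPQ plus 33, written as one ASCII character per read (so a MAPQ of 60 appears as `]`); treat that as an assumption to verify against your own data. It also assumes single-sample output with seven tab-separated columns, as produced by something like `samtools mpileup -s -f ref.fa aln.bam` (file names hypothetical). The function names and the example line are made up.

```python
def decode_quals(encoded: str, offset: int = 33) -> list:
    """Turn a string of ASCII-encoded qualities into integers (assumed offset 33)."""
    return [ord(ch) - offset for ch in encoded]


def parse_mpileup_line(line: str) -> dict:
    """Parse one single-sample line of `mpileup -s` output.

    Assumed column order: chrom, pos, ref base, depth, read bases,
    base qualities, mapping qualities.
    """
    cols = line.rstrip("\n").split("\t")
    chrom, pos, ref, depth, bases, base_quals, map_quals = cols[:7]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "depth": int(depth),
        "bases": bases,
        "base_quals": decode_quals(base_quals),
        "map_quals": decode_quals(map_quals),
    }


# Made-up example line: three reads covering chr1:100.
example = "chr1\t100\tA\t3\t.,.\tIIH\t]F!"
record = parse_mpileup_line(example)
print(record["base_quals"])  # [40, 40, 39]
print(record["map_quals"])   # [60, 37, 0]
```

The i-th mapping quality belongs to the same read as the i-th base quality, so the two lists can be zipped to pair each base with the mapping quality of its read.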
