  • metheuse
    Member
    • Jan 2013
    • 84

    How to do unique mapping in bowtie2

    Bowtie1 has a handy parameter, -m, which (as -m 1) suppresses all alignments for a read if more than one hit is found.
    But bowtie2 doesn't seem to have an equivalent parameter. Instead, it reports one alignment when there are multiple hits, which is not the strict definition of "unique" mapping.
    Did I miss anything?
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    Bowtie2 is sort of like bowtie1 with the -k option. If you want unique hits (which I take to mean alignments where the next best alignment isn't as good), then just keep any alignment with a MAPQ > 1. bowtie2 will give reads with more than one equally good alignment a score of 1 or 0, depending on whether there are any mismatches or not.

    • metheuse
      Member
      • Jan 2013
      • 84

      #3
      Originally posted by dpryan:
      Bowtie2 is sort of like bowtie1 with the -k option. If you want unique hits (which I take to mean alignments where the next best alignment isn't as good), then just keep any alignment with a MAPQ > 1. bowtie2 will give reads with more than one equally good alignment a score of 1 or 0, depending on whether there are any mismatches or not.
      Thanks for replying. I think what bowtie1 -m 1 does is discard a read if it can be matched to more than one location. But in bowtie2, -k 1 means it only searches for up to 1 hit. I think they are different.
      Do you mean taking the bowtie2 (with -k 1) output and discarding alignments with MAPQ<=1? That's doable, but it would be more convenient if bowtie2 had this option itself.
      Last edited by metheuse; 11-05-2013, 01:41 PM.

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        Originally posted by metheuse:
        Thanks for replying. I think what bowtie1 -m 1 does is discard a read if it can be matched to more than one location. But in bowtie2, -k 1 means it only searches for up to 1 hit. I think they are different.
        If you really want an "-m 1" equivalent, then just remove any reads with an "XS" auxiliary tag. It's rather odd to do that rather than just filter based on MAPQ, though.
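
The XS-based filter described above could be sketched with awk roughly as follows. This is only an illustration, assuming uncompressed SAM text on stdin (what bowtie2 writes by default) and relying on bowtie2 adding the XS:i tag only when it found more than one alignment for a read. The function name drop_xs_reads is made up for this example, and dropping unaligned reads (FLAG bit 0x4) is an added assumption, since they carry no XS tag and would otherwise pass through.

```shell
# Hypothetical helper: keep header lines plus aligned reads that lack an XS:i tag.
# Reads SAM text on stdin, writes SAM text to stdout.
drop_xs_reads() {
  awk -F '\t' '
    /^@/ { print; next }                 # pass SAM header lines through unchanged
    int($2 / 4) % 2 == 1 { next }        # FLAG bit 0x4 set: read is unaligned, skip it
    {
      for (i = 12; i <= NF; i++)         # optional fields start at column 12
        if ($i ~ /^XS:i:/) next          # a second alignment was found, skip the read
      print
    }
  '
}

# Example (index and read file are placeholders, as elsewhere in this thread):
# bowtie2 -x <index> -U <allReads> | drop_xs_reads > no_xs.sam
```

Note that for paired-end data this treats each mate independently, so one mate of a pair can be dropped while the other is kept.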

        • gringer
          David Eccles (gringer)
          • May 2011
          • 845

          #5
          Originally posted by metheuse:
          Do you mean taking the bowtie2 (with -k 1) output and discarding alignments with MAPQ<=1? That's doable, but it would be more convenient if bowtie2 had this option itself.
          No, this works with the standard Bowtie2 options, regardless of how many alignments per read are reported (including one). The filtering happens at the next step in the pipeline. FWIW, samtools view lets you filter on MAPQ:

          Code:
          bowtie2 -x <index> {-1 <leftReads> -2 <rightReads> | -U <allReads>} | \
            samtools view -S -q 2

          • metheuse
            Member
            • Jan 2013
            • 84

            #6
            Originally posted by dpryan:
            If you really want an "-m 1" equivalent, then just remove any reads with an "XS" auxiliary tag. It's rather odd to do that rather than just filter based on MAPQ, though.
            Thanks. Yes, I understand filtering by MAPQ is a good idea.
            Where can I find the meaning of the different MAPQ values? It seems that bowtie2 (or just tophat2?) changed its MAPQ scale at some point. I've always felt uncertain about this, but couldn't find an explicit one-to-one table.

            • metheuse
              Member
              • Jan 2013
              • 84

              #7
              Originally posted by gringer:
              No, this works with the standard Bowtie2 options, regardless of how many alignments per read are reported (including one). The filtering happens at the next step in the pipeline. FWIW, samtools view lets you filter on MAPQ:

              Code:
              bowtie2 -x <index> {-1 <leftReads> -2 <rightReads> | -U <allReads>} | \
                samtools view -S -q 2
              Thanks. In this post someone explains the meaning of each MAPQ value: http://seqanswers.com/forums/showthr...highlight=mapq
              But perhaps it's not up-to-date. Is there an "official" place that explains the meaning of MAPQ values like this?
              255 = unique mapping
              3 = maps to 2 locations in the target
              2 = maps to 3 locations
              1 = maps to 4-9 locations
              0 = maps to 10 or more locations

              Does MAPQ > 1 mean unique mapping? Or is there a more stringent threshold?

              • dpryan
                Devon Ryan
                • Jul 2011
                • 3478

                #8
                Originally posted by metheuse:
                Thanks. Yes, I understand filtering by MAPQ is a good idea.
                Where can I find the meaning of the different MAPQ values? It seems that bowtie2 (or just tophat2?) changed its MAPQ scale at some point. I've always felt uncertain about this, but couldn't find an explicit one-to-one table.
                For tophat2, I recall the only MAPQ values you'll ever see are 0, 1, 2, 3 and 50 (previously 255). 50 means a unique hit, while 3 means 2 equal hits, 2 means 3 equal hits, etc. For bowtie2, there's no simple explanation. Somewhere on here (or maybe biostars), I've posted the C version of bowtie2's MAPQ calculator that I use in bison (a bisulfite aligner for compute clusters). You can probably search around for that if you really want the gory details (otherwise, it's in the source for bison on sourceforge).
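
One way to see which MAPQ values actually occur in your own output is to tally column 5 of the SAM records. The sketch below is an illustration only: it assumes SAM text on stdin, the function name mapq_histogram is made up, and skipping unaligned reads (FLAG bit 0x4) is an added choice so they don't inflate the count for MAPQ 0.

```shell
# Hypothetical helper: count aligned reads per MAPQ value (column 5 of a SAM record).
# Prints "MAPQ<TAB>count" lines, sorted by MAPQ.
mapq_histogram() {
  awk -F '\t' '
    /^@/ { next }                        # skip SAM header lines
    int($2 / 4) % 2 == 1 { next }        # skip unaligned reads (FLAG bit 0x4)
    { count[$5]++ }                      # tally each MAPQ value
    END { for (q in count) print q "\t" count[q] }
  ' | sort -n
}

# Example (aln.sam is a placeholder file name):
# mapq_histogram < aln.sam
```

For TopHat2 output this should show only the handful of values listed above, while bowtie2 output will typically show a wider spread.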

                • dpryan
                  Devon Ryan
                  • Jul 2011
                  • 3478

                  #9
                  I should qualify the bowtie2 part of my most recent reply. In bowtie2 a MAPQ>1 means a unique hit (i.e., the reported hit is better than the next best hit). Beyond that, the actual calculation of the MAPQ is pretty messy.

                  • metheuse
                    Member
                    • Jan 2013
                    • 84

                    #10
                    Originally posted by dpryan:
                    I should qualify the bowtie2 part of my most recent reply. In bowtie2 a MAPQ>1 means a unique hit (i.e., the reported hit is better than the next best hit). Beyond that, the actual calculation of the MAPQ is pretty messy.
                    Thanks! Sorry, I didn't mean to ignore your previous reply. I was just wondering if there is a more complete resource or something. But that's fine! I appreciate your answer!
