  • hollandorange
    Member
    • May 2010
    • 11

    max mismatches in Bowtie2

    Hello,

    Does anyone know how to set the parameters to align the reads with no more than 2 mismatches in bowtie2?

    In Bowtie 1, the -v parameter does this, and the command line looks like the following:
    bowtie ref -a -v 2 -f read.fa output.sam

    How can I report all the reads with no more than 2 mismatches in Bowtie2?

    Thanks,
    Yanju
  • mgogol
    Senior Member
    • Mar 2008
    • 197

    #2
    I haven't actually *run* it yet, but I'm talking about bowtie 2 at journal club and I don't think this is actually possible at this point. You'd have to do it by filtering the sam file, I think. MD tag?

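    The -N parameter controls the number of mismatches allowed per seed (0 or 1), but Bowtie 2 now uses multiple overlapping seeds extracted at intervals along the read, so -N does not cap the number of mismatches in the whole alignment.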

    • Dario1984
      Senior Member
      • Jun 2011
      • 166

      #3
      I've been working with it this week. You can't set it as a parameter. The cutoff is based on a minimum score threshold (--score-min), and the score is a function of the number of mismatches and gaps and their associated penalties (plus match bonuses in --local mode).
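
To make the scoring point concrete, here is a minimal Python sketch of how an end-to-end score threshold translates into a mismatch budget. It assumes the defaults documented in the Bowtie 2 manual (--score-min L,-0.6,-0.6 and a maximum mismatch penalty of 6 from --mp 6,2), which are worth checking against your own version. The function names are made up for illustration. Gaps, Ns and low-quality mismatches (which cost less) also draw on the same budget, so this is a rough bound and not a prediction of what bowtie2 will report.

```python
import math


def min_score_end_to_end(read_len, intercept=-0.6, slope=-0.6):
    """Linear threshold f(L) = intercept + slope * L.

    The default arguments mirror the assumed end-to-end default L,-0.6,-0.6.
    """
    return intercept + slope * read_len


def max_full_penalty_mismatches(read_len, mismatch_penalty=6):
    """Number of maximum-penalty mismatches that still clear the threshold.

    Ignores gaps and Ns; rounding guards against floating-point noise.
    """
    budget = -min_score_end_to_end(read_len)
    return math.floor(round(budget / mismatch_penalty, 9))


for read_len in (36, 50, 100):
    print(read_len, max_full_penalty_mismatches(read_len))
```

Under these assumed defaults a 50 bp read can carry up to five full-penalty mismatches and still be reported, which is why a cap of 2 has to be applied as a separate filtering step.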

      • mgogol
        Senior Member
        • Mar 2008
        • 197

        #4
        Oh, also look at the SAM optional field XM:i:<N>, which tells you the number of mismatches in the alignment. (XO and XG give the number of gap opens and gap extensions, and NM is the edit distance.)
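
As a rough illustration of where these fields live, the Python sketch below pulls the optional TAG:TYPE:VALUE fields out of a single SAM record. The helper name sam_tags and the example record are invented for this sketch; the only thing it relies on is that a SAM line has 11 mandatory tab-separated columns followed by the optional fields.

```python
def sam_tags(line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM record as a dict."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 columns are mandatory
        tag, typ, value = field.split(":", 2)
        # Integer-typed tags (type "i") are converted; everything else stays a string.
        tags[tag] = int(value) if typ == "i" else value
    return tags


# Made-up record, for illustration only.
record = (
    "read1\t0\tchr1\t100\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\t"
    "AS:i:-6\tXM:i:1\tXO:i:0\tXG:i:0\tNM:i:1\tMD:Z:4A5\tYT:Z:UU"
)
tags = sam_tags(record)
print(tags["XM"], tags["XO"], tags["XG"], tags["NM"])  # prints: 1 0 0 1
```

Looking tags up by name like this avoids having to know which column a given tag ends up in.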

        • mihuzx
          Member
          • Apr 2013
          • 20

          #5
          Originally posted by hollandorange
          Does anyone know how to set the parameters to align the reads with no more than 2 mismatches in bowtie2?

          Hi,
          Has your problem been solved? I have now run into the same problem. Could you please tell me how you extracted the reads with no more than 2 mismatches?
          Thanks a lot.

          • mgogol
            Senior Member
            • Mar 2008
            • 197

            #6
            I think you could do something by filtering on the field XM:i:0 and XM:i:1 and XM:i:2 from the sam file.

            Probably something like:

            samtools view | cut whatever column it is | grep "XM:i:0" > zero_mismatch.sam

            and then do that for XM:i:1 and XM:i:2, then combine?

            • gringer
              David Eccles (gringer)
              • May 2011
              • 845

              #7
              Originally posted by mgogol
              Code:
              samtools view | cut whatever column it is | grep "XM:i:0" > zero_mismatch.sam
              I'm pretty sure that the cut in there will mean that only that column is included in the output. The problem is also trickier because the optional fields are tab separated and not necessarily always in the same column. However, if you don't care about the string 'XM:i:X' appearing in the read name, then a regular expression filter should still work fine:

              Code:
              samtools view -Sh - | grep -e "^@" -e "XM:i:[012][^0-9]" > low_mismatch.sam
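
For anyone who would rather not rely on a regular expression, the same filter can be sketched in Python by splitting each line on tabs and only inspecting the optional fields. This is an illustrative sketch and not a drop-in replacement: keep_record, filter_sam and fake_record are made-up names, the demo records are invented, and reads without an XM tag (for example unaligned reads) are simply dropped.

```python
def keep_record(line, max_mismatches=2):
    """True for header lines and for alignments whose XM tag is <= max_mismatches."""
    if line.startswith("@"):
        return True
    # Only look at optional fields, so a read name containing "XM:i:" is ignored.
    for field in line.rstrip("\n").split("\t")[11:]:
        if field.startswith("XM:i:"):
            return int(field[5:]) <= max_mismatches
    return False  # no XM tag found, e.g. an unaligned read


def filter_sam(lines, max_mismatches=2):
    """Return the lines that pass keep_record, in their original order."""
    return [line for line in lines if keep_record(line, max_mismatches)]


def fake_record(name, xm):
    """Build a made-up SAM line for the demo; XM is deliberately the last field."""
    return f"{name}\t0\tchr1\t1\t42\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tXM:i:{xm}\n"


demo = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    fake_record("r1", 0),
    fake_record("r2", 2),
    fake_record("XM:i:1_r3", 12),  # misleading read name, 12 mismatches
    fake_record("r4", 3),
]
print("".join(filter_sam(demo)), end="")
```

On real data, keep_record could be applied line by line to sys.stdin and the surviving lines written to sys.stdout, so the script can sit in the same pipe position as the grep. Because it compares the whole integer, it does not depend on which column XM lands in or on whether another field follows it.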

              • mgogol
                Senior Member
                • Mar 2008
                • 197

                #8
                Thanks for improving on my hasty and incorrect answer...

                • mihuzx
                  Member
                  • Apr 2013
                  • 20

                  #9
                  Thanks a lot. Based on that answer, I came up with another solution using Perl. The code is:
                  perl -ne "print if /XM:i:[0-2]/;" raw.sam >cleaned.sam

                  • gringer
                    David Eccles (gringer)
                    • May 2011
                    • 845

                    #10
                    Originally posted by mihuzx
                    Thanks a lot. Based on that answer, I came up with another solution using Perl. The code is:
                    Code:
                    perl -ne "print if /XM:i:[0-2]/;" raw.sam >cleaned.sam
                    You missed out the headers and haven't considered >9 mismatches (unlikely, but it could happen). The perl equivalent (using your syntax) of what I wrote is as follows:

                    Code:
                    perl -ne "print if((/XM:i:[0-2][^0-9]/) || (/^@/));" raw.sam >cleaned.sam
                    But if you're always going to use that filter you might as well just pipe straight from bowtie2 without making the intermediate 'raw.sam' file, as in my previous example.
                    Last edited by gringer; 11-15-2013, 01:17 PM.

                    • mihuzx
                      Member
                      • Apr 2013
                      • 20

                      #11
                      Thanks for your quick and well-thought-out answer.
                      I had been trying to work out how to get low_mismatch.sam directly, but failed. Now I know there are so many things for me to learn.
                      Thanks again for your guidance.
