**mgogol** · 05-02-2012, 01:22 PM

I haven't actually *run* it yet, but I'm presenting Bowtie 2 at journal club, and I don't think this is possible at this point. You'd have to do it by filtering the SAM file, I think. Maybe using the MD tag?

The -N parameter controls the number of mismatches allowed per seed, but now we have overlapping seeds spaced at intervals.

**Dario1984** · 05-02-2012, 04:00 PM

I've been working with it this week. You can't set it as a parameter. The cutoff is based on a minimum score threshold, and the score is a function of the number of mismatches and gaps and their associated penalties.
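To make the score threshold concrete, here is a rough sketch of the arithmetic. It assumes the end-to-end defaults as described in the Bowtie 2 manual (`--score-min L,-0.6,-0.6` and a maximum mismatch penalty of 6 from `--mp 6,2`), ignores gaps and Ns, and treats every mismatch as falling on a high-quality base. `max_mismatches_for_len` is a made-up helper name, not part of Bowtie 2.

```shell
# How many high-quality mismatches still clear the minimum score,
# ASSUMING Bowtie 2 end-to-end defaults: --score-min L,-0.6,-0.6 and --mp 6,2
# (each high-quality mismatch costs 6), with no gaps or Ns in the alignment.
max_mismatches_for_len() {
  # $1 = read length, $2 = penalty per mismatch (assumed default: 6)
  awk -v len="$1" -v mx="${2:-6}" 'BEGIN {
    min_score = -0.6 + (-0.6) * len   # linear function L,-0.6,-0.6
    printf "%d\n", int(-min_score / mx)
  }'
}

max_mismatches_for_len 50    # min score -30.6, so 5 mismatches fit
max_mismatches_for_len 100   # min score -60.6, so 10 mismatches fit
```

Mismatches on low-quality bases are penalized less, so real alignments can carry more mismatches than this estimate; that is why the limit cannot be expressed as a single fixed count.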

**mgogol** · 05-03-2012, 10:19 AM

Oh, also look at the SAM optional field XM:i:<N>, which tells you the number of mismatches. (XO and XG give the number of gap opens and gap extensions, and NM is the edit distance.)
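One quick way to inspect that field is to tally the XM values across a SAM file. A minimal sketch, assuming plain-text SAM on standard input; `xm_histogram` is a hypothetical helper name, and records without an XM tag (for example unaligned reads) are simply skipped:

```shell
# Print "mismatches<TAB>read count" for every XM value seen in a SAM stream.
xm_histogram() {
  awk -F'\t' '
    /^@/ { next }                       # skip header lines
    {
      for (i = 12; i <= NF; i++)        # optional fields start at column 12
        if ($i ~ /^XM:i:[0-9]+$/) { count[substr($i, 6) + 0]++; break }
    }
    END { for (n in count) printf "%d\t%d\n", n, count[n] }
  ' | sort -n
}

# Example call (hypothetical file name):
# xm_histogram < aligned.sam
```

Looking at the distribution first can help decide whether a cutoff of 2 mismatches would discard a large share of the alignments.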

**mihuzx** · 11-12-2013, 04:58 PM

Originally posted by hollandorange

Hello,

Does anyone know how to set the parameters to align the reads with no more than 2 mismatches in bowtie2?

In Bowtie, the command line (-v is the parameter) is like the following:
bowtie -a -v 2 -f -S ref read.fa output.sam

How can I report all the reads with no more than 2 mismatches in Bowtie2?

Thanks,
Yanju

Hi,
Has your problem been solved? I'm now running into the same problem. Could you please tell me how you extracted the reads with no more than 2 mismatches?
Thanks a lot.

**mgogol** · 11-13-2013, 02:05 PM

I think you could do it by filtering the SAM file on the fields XM:i:0, XM:i:1 and XM:i:2.

Probably something like:

samtools view | cut whatever column it is | grep "XM:i:0" > zero_mismatch.sam

and then do that for XM:i:1 and XM:i:2, then combine?

**gringer** · 11-13-2013, 03:17 PM

Originally posted by mgogol

Code:

samtools view | cut whatever column it is | grep "XM:i:0" > zero_mismatch.sam

I'm pretty sure that the cut in there will mean that only that column is included in the output. The problem is also trickier because the optional fields are tab separated and not necessarily always in the same column. However, if you don't care about the string 'XM:i:X' appearing in the read name, then a regular expression filter should still work fine:

Code:

samtools view -Sh - | grep -e "^@" -e "XM:i:[012][^0-9]" > low_mismatch.sam
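If the read-name caveat matters, or if XM could ever be the last field on a line (the `[^0-9]` in the grep pattern needs one more character after the digit), an awk filter that only inspects the optional fields sidesteps both issues. This is a sketch under those assumptions, with `filter_max_xm` as a made-up name; it reads plain-text SAM on standard input:

```shell
# Keep header lines plus alignments whose XM tag is <= the given maximum.
filter_max_xm() {
  # $1 = maximum number of mismatches to keep (default 2)
  awk -F'\t' -v max="${1:-2}" '
    /^@/ { print; next }                 # keep header lines
    {
      for (i = 12; i <= NF; i++)         # optional fields start at column 12
        if ($i ~ /^XM:i:[0-9]+$/) {
          if (substr($i, 6) + 0 <= max) print
          next
        }
    }                                    # records without an XM tag are dropped
  '
}

# Example call (hypothetical file names):
# filter_max_xm 2 < raw.sam > low_mismatch.sam
```

Because the comparison is numeric, changing the cutoff is just a matter of passing a different number instead of editing a character class.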

**mgogol** · 11-14-2013, 08:18 AM

Thanks for improving on my hasty and incorrect answer...

**mihuzx** · 11-15-2013, 03:29 AM

Thanks a lot. Based on that answer, I came up with another solution using Perl. The code is:
perl -ne "print if /XM:i:[0-2]/;" raw.sam >cleaned.sam

**gringer** · 11-15-2013, 01:14 PM

Originally posted by mihuzx

Thanks a lot. Based on that answer, I came up with another solution using Perl. The code is:

Code:

perl -ne "print if /XM:i:[0-2]/;" raw.sam >cleaned.sam

You missed out the headers and haven't considered >9 mismatches (unlikely, but it could happen). The perl equivalent (using your syntax) of what I wrote is as follows:

Code:

perl -ne "print if((/XM:i:[0-2][^0-9]/) || (/^@/));" raw.sam >cleaned.sam

But if you're always going to use that filter you might as well just pipe straight from bowtie2 without making the intermediate 'raw.sam' file, as in my previous example.
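Piping straight from the aligner might look something like the following. It is only a sketch: `align_low_mismatch`, the index prefix and the file names are placeholders, and it assumes bowtie2 writes SAM to standard output when no `-S` file is given and that the reads are FASTA (`-f`), as in the original question. Since that output is already text SAM, grep can read it directly.

```shell
# Align with bowtie2 and keep only headers and alignments with XM of 0, 1 or 2.
align_low_mismatch() {
  # $1 = bowtie2 index prefix, $2 = FASTA reads, $3 = filtered SAM output
  bowtie2 -x "$1" -f -U "$2" |
    grep -e "^@" -e "XM:i:[012][^0-9]" > "$3"
}

# Example call (hypothetical names):
# align_low_mismatch ref read.fa low_mismatch.sam
```

This avoids writing the intermediate raw.sam at all, at the cost of having to realign if a different cutoff is wanted later.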

**mihuzx** · 11-15-2013, 05:13 PM

Thanks for your quick and well-thought-out answer.
I kept trying to work out how to get low_mismatch.sam directly, but failed. Now I know there are still many things for me to learn.
Thanks again for your guidance.

max mismatches in Bowtie2
