
**Xi Wang** · 10-29-2009, 11:48 PM

Hi Ben,

I am confused about how Bowtie deals with quality scores when counting mismatches.

I noticed that there are two parameters related to this issue. First, -n/--seedmms <int> sets the maximum number of mismatches in the seed, meaning that a hit with more mismatches than the cutoff will not be reported by Bowtie. Second, -e/--maqerr <int> sets the maximum sum of quality scores allowed at the mismatched bases (is that right?). However, I don't know whether the two criteria are the same or complementary.

Further, both measurements of mismatches are counted in the seed region. Even though users can specify the seed length, I am wondering where the seed is located: at the leftmost end of a query (read) or in a random region of the query.

Besides, there is another parameter, -v <int>, which takes care of end-to-end mismatches but does not consider the quality scores. Is it possible to make this consider the quality scores?

Best regards!
Xi

**lindseyjane** · 11-02-2009, 02:01 AM

Question regarding bwt paired end alignment

I am currently trying to align paired-end Illumina reads using bowtie and I want to compare the results to those from maq.

I cannot see an option for reporting an alignment for a read when its mate does not map. Is this possible?

The maq software still reports alignments for a read even if its mate does not map, and I wanted to do the same thing with bowtie. If this is not possible, a lot of pairs end up unaligned (significantly more than with maq).

If anyone knows how to do this I would really appreciate it, thanks.

**Ben Langmead** · 11-02-2009, 05:43 AM

Hi Xi,

Originally posted by Xi Wang

I noticed that there are two parameters related to this issue. First, -n/--seedmms <int> sets the maximum number of mismatches in the seed, meaning that a hit with more mismatches than the cutoff will not be reported by Bowtie. Second, -e/--maqerr <int> sets the maximum sum of quality scores allowed at the mismatched bases (is that right?). However, I don't know whether the two criteria are the same or complementary.

They're complementary. If either limit is exceeded, the alignment is invalid.

Originally posted by Xi Wang

Further, both measurements of mismatches are counted in the seed region. Even though users can specify the seed length, I am wondering where the seed is located: at the leftmost end of a query (read) or in a random region of the query.

From the leftmost end of the read. -e applies to the entire alignment, not just the seed, exactly as in Maq.
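As a rough illustration of how the two limits combine (a sketch, not Bowtie's actual code), the check can be written in Python. The function name and arguments are hypothetical; the defaults of a 28-base seed, 2 seed mismatches, and a quality sum of 70 are the default values mentioned elsewhere in this thread. The sketch assumes an ungapped alignment, uses raw Phred values, and ignores any rounding of qualities the aligner may apply.

```python
def is_valid_alignment(read, ref, quals, seed_len=28, max_seed_mms=2, max_qual_sum=70):
    """Return True if an ungapped alignment passes both the -n and the -e style limits.

    read, ref: equal-length strings (read and the reference bases it aligns to).
    quals: Phred quality of each read base, same length as read.
    """
    if not (len(read) == len(ref) == len(quals)):
        raise ValueError("read, ref and quals must have the same length")

    # Positions where the read disagrees with the reference.
    mismatches = [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]

    # -n style limit: only mismatches in the seed (the leftmost seed_len bases) count.
    seed_mms = sum(1 for i in mismatches if i < seed_len)

    # -e style limit: qualities are summed over every mismatch in the whole read.
    qual_sum = sum(quals[i] for i in mismatches)

    # Exceeding either limit makes the alignment invalid.
    return seed_mms <= max_seed_mms and qual_sum <= max_qual_sum
```

Note that three high-quality mismatches outside the seed can still fail the check through the quality sum, even though none of them counts against the seed limit.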

Originally posted by Xi Wang

Besides, there is another parameter, -v <int>, which takes care of end-to-end mismatches but does not consider the quality scores. Is it possible to make this consider the quality scores?

No; to consider qualities, use -n/-l/-e.

Thanks,
Ben

**Ben Langmead** · 11-02-2009, 05:45 AM

Originally posted by lindseyjane

I cannot see an option for reporting an alignment for a read when its mate does not map. Is this possible?

Your best bet is to run Bowtie in paired-end mode while using --un to dump unaligned reads to files. Then run again in unpaired mode using the unaligned reads as input.
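The two passes can be sketched as a small Python helper that only builds the two command lines (it does not run them). All file names here are hypothetical placeholders. The sketch assumes, as I understand the Bowtie manual, that in paired-end mode --un writes the two mates to separate files with _1 and _2 inserted before the extension, and that unpaired input files can be given as a comma-separated list; please verify both against your Bowtie version.

```python
import os


def two_pass_commands(index, mate1, mate2, unaligned="unaligned.fq",
                      paired_out="paired.map", single_out="single.map"):
    """Build argument lists for a paired-end pass followed by an unpaired pass."""
    # Pass 1: paired-end mode; pairs that fail to align are dumped via --un.
    pass1 = ["bowtie", index, "-1", mate1, "-2", mate2,
             "--un", unaligned, paired_out]

    # Assumption: for paired input, _1 and _2 are inserted into the --un file name.
    root, ext = os.path.splitext(unaligned)
    un1, un2 = f"{root}_1{ext}", f"{root}_2{ext}"

    # Pass 2: unpaired mode, using both leftover files as input.
    pass2 = ["bowtie", index, f"{un1},{un2}", single_out]
    return pass1, pass2


if __name__ == "__main__":
    for cmd in two_pass_commands("h_sapiens_37_asm", "reads_1.fq", "reads_2.fq"):
        print(" ".join(cmd))
```

The resulting lists can be passed to `subprocess.run` once the paths have been checked.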

Let me know if that doesn't solve your problem.

Thanks,
Ben

**Layla** · 11-02-2009, 09:16 AM

comparable parameters with maq

Hi Ben,

Excellent work with Bowtie - looking forward to cutting down data processing time. I am working on a project in which I have used maq, but for subsequent paired-end MeDIP-seq with 45-base reads I want to use Bowtie with parameters as close to maq as possible.

Using maq I eliminate reads with a maq quality < 10 (the same read mapped to >1 location and hence ambiguous) and output them to another file.
I also keep only reads with flags 18 and 130 (correctly paired reads).
Using an ad hoc script I keep only one hit if the same read is mapped to the same start and stop location multiple times (PCR bias).

I'd like to apply the same criteria using bowtie. Could you advise me? To begin with, the default in bowtie is good: 2 mismatches in a 28-base seed region, with a maximum quality sum (-e) of 70.

thank you

Layla

**Xi Wang** · 11-02-2009, 07:24 PM

Originally posted by Ben Langmead

Hi Xi,

to consider qualities, use -n/-l/-e.

Thanks, Ben.
I am still wondering whether the seed region is defined only for counting the mismatches or not. If I want to just use the quality score criterion, and set -l equal to 0, does it work?

Best wishes,
Xi

**Ben Langmead** · 11-02-2009, 07:29 PM

Originally posted by Xi Wang

I am still wondering whether the seed region is defined only for counting the mismatches or not.

Yes. The setting for -l matters for the -n limit but not for the -e limit.

Originally posted by Xi Wang

If I want to just use the quality score criterion, and set -l equal to 0, does it work?

No, -l must be set to 5 or greater.

Ben

**ramouz87** · 11-03-2009, 07:15 AM

Hi,
I'm new to the field of NGS (I was working mainly on microarray data analysis) and I'm starting to investigate common tools for sequence analysis.
I have human data (paired reads, 75 bases) and used Bowtie for the alignment.
I used the standard parameters for alignment:
bowtie -t -p 8 h_sapiens_37_asm ./s_8_1_sequence.fq ./s_8_1_sequence.fq.bowtie.align
bowtie -t -p 8 h_sapiens_37_asm ./s_8_2_sequence.fq ./s_8_2_sequence.fq.bowtie.align
bowtie -t -p 8 h_sapiens_37_asm -1 ./s_8_1_sequence.fq -2 ./s_8_2_sequence.fq ./s_8_sequence.fq.bowtie.align

and I get respectively the following results:
# reads processed: 6660511
# reads with at least one reported alignment: 4615451 (69.30%)
# reads that failed to align: 2045060 (30.70%)
# reads with at least one reported alignment: 5050548 (75.83%)
# reads that failed to align: 1609963 (24.17%)
# reads with at least one reported alignment: 13371 (0.20%)
# reads that failed to align: 6647140 (99.80%)

The data quality is not optimal, but I guess that getting almost no alignments in paired-end mode is not due to that, and that the parameters probably need to be tuned.
Could anyone give me some insight into the optimal settings for paired-end alignment?
Thanks in advance,
Best,
ramzi

**liu3zhen** · 11-04-2009, 12:58 PM

A question about the number of mismatches. I cannot set -v 4 (error: -v arg must be at most 3). Does that mean Bowtie allows at most 3 mismatches regardless of read length? Thanks.

**liu3zhen** · 11-04-2009, 01:28 PM

Another question:

I'm reading the manual for -k -a and --best.

I'm confused about what happens when -k or -a is combined with --best. I thought that if a read has several "best" alignments, these should all have "equal" alignment scores. But the manual says that if -k >1 or -a is specified together with --best, only the best alignments will be reported and they appear in best-to-worst order, which means that the best alignments are not "equally best".

Hope to get your help soon, thanks.

**Ben Langmead** · 11-04-2009, 01:32 PM

Originally posted by ramouz87

The data quality is not optimal, but I guess that getting almost no alignments in paired-end mode is not due to that, and that the parameters probably need to be tuned.
Could anyone give me some insight into the optimal settings for paired-end alignment?
Thanks in advance,
Best,
ramzi

Hi Ramzi,

The options you're looking for are almost certainly -I/-X and --ff/--fr/--rf. You need to have a reasonably good idea of the expected insert size and specify an appropriate range with -I/-X. You should also confirm that your paired-end protocol produces pairs in the fw/rev orientation. This is the typical configuration for Illumina. If your paired-end data has a different orientation, change it with --ff or --rf.
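To make the role of -I/-X and the orientation flags concrete, here is a simplified Python sketch of the test a mate pair has to pass. It is not Bowtie's implementation: the function and argument names are hypothetical, positions are 0-based leftmost coordinates on the reference, the insert size is taken to span both mates end to end, and the defaults (minimum 0, maximum 250, fw/rev orientation) are what I recall from the Bowtie manual, so check them for your version.

```python
def pair_is_concordant(left_start, left_len, left_strand,
                       right_start, right_len, right_strand,
                       min_ins=0, max_ins=250, orientation="fr"):
    """Check insert size (-I/-X) and relative orientation (--fr/--rf/--ff) of a pair.

    'left' is the mate with the smaller reference coordinate, 'right' the other one.
    Strands are '+' or '-'.
    """
    # Insert size: from the first base of the left mate to the last base of the right mate.
    insert = (right_start + right_len) - left_start
    if not (min_ins <= insert <= max_ins):
        return False

    if orientation == "fr":   # mates point towards each other (typical Illumina paired-end)
        return (left_strand, right_strand) == ("+", "-")
    if orientation == "rf":   # mates point away from each other
        return (left_strand, right_strand) == ("-", "+")
    if orientation == "ff":   # both mates on the same strand
        return left_strand == right_strand
    raise ValueError("orientation must be 'fr', 'rf' or 'ff'")
```

For example, two 75-base mates from a 300-base fragment fail this check with a maximum of 250 but pass once the maximum is raised to 400, which is one way a paired-end run can report almost no alignments while the same reads align well in unpaired mode.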

Hope that helps,
Ben

**Ben Langmead** · 11-04-2009, 01:33 PM

Originally posted by liu3zhen

A question about the number of mismatches. I cannot set -v 4 (error: -v arg must be at most 3). Does that mean Bowtie allows at most 3 mismatches regardless of read length? Thanks.

Hi liu3zhen,

To allow more than 3 mismatches in the alignment, use the Maq-like options: -n/-l/-e instead of -v.

Thanks,
Ben

**ecabot** · 11-04-2009, 01:34 PM

Are pairs considered separately wrt mismatches and uniqueness with the SOAP-like policy?

I have a couple of questions about how Bowtie deals with mismatches in a paired-end run (using -v 1 and -m 1). I have my guesses as to how things work, but I am hoping that someone knowledgeable (e.g. Ben) will chime in with the correct information.

1) Is it possible to obtain an alignment for a read pair where one read uniquely maps but the other doesn't? (my guess: no)

2) Does the mismatch setting apply to each read separately, or to both taken together? In other words, if 1 mismatch is specified, can both members of a pair each have 1 mismatch? (my guess: yes)

**Ben Langmead** · 11-04-2009, 01:35 PM

Originally posted by liu3zhen

But the manual says that if -k >1 or -a is specified together with --best, only the best alignments will be reported and they appear in best-to-worst order, which means that the best alignments are not "equally best".

That's right; --best does not limit the number of alignments Bowtie reports. If you ask for 1 alignment (default), --best guarantees it's the best. If you ask for -k 4, --best guarantees they're the 4 best, reported in best-to-worst order. If you ask for -a, --best guarantees that you'll get all of them in best-to-worst order.
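The interaction between --best and -k/-a can be pictured with a toy Python sketch. This is only a model of the reporting behaviour described above, not Bowtie's search algorithm: the function name is hypothetical, each candidate is a made-up (mismatches, quality sum, position) tuple, and "better" is assumed to mean fewer mismatches first, then a lower quality sum at the mismatched positions.

```python
def report_alignments(candidates, k=1, report_all=False):
    """Return alignments in best-to-worst order, truncated to k unless report_all is set.

    candidates: list of (mismatches, qual_sum, position) tuples for one read.
    """
    # Rank best-to-worst: fewer mismatches first, ties broken by lower quality sum.
    ranked = sorted(candidates, key=lambda aln: (aln[0], aln[1]))
    # -a style: everything, still ordered; -k style: only the k best.
    return ranked if report_all else ranked[:k]


if __name__ == "__main__":
    hits = [(2, 40, "chr1:100"), (0, 0, "chr2:50"), (1, 30, "chr3:7"), (1, 20, "chr4:9")]
    print(report_alignments(hits))                   # the single best hit
    print(report_alignments(hits, k=3))              # the 3 best, best first
    print(report_alignments(hits, report_all=True))  # all hits, best first
```

The point is that the ordering and the count are independent: the count comes from -k or -a, while --best only decides which alignments fill those slots and in what order.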

Thanks
Ben

**Ben Langmead** · 11-04-2009, 01:39 PM

Originally posted by ecabot

1) Is it possible to obtain an alignment for a read pair where one read uniquely maps but the other doesn't? (my guess: no)

Definitely yes! That's exactly where paired-end sequencing pays off. If either read aligns uniquely, that alignment will be used as an anchor to look for the mate's alignment and, if it's found, that paired-end alignment will be reported.

Originally posted by ecabot

2) Does the mismatch setting apply to each read separately, or to both taken together? In other words, if 1 mismatch is specified, can both members of a pair each have 1 mismatch? (my guess: yes)

The mismatch setting applies to each read. So, yes, if -v 1 is specified, *both* mates are allowed to have a mismatch.
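In code terms, the per-mate rule looks like the following Python sketch. The helper names are hypothetical, the alignments are assumed to be ungapped and end-to-end, and the 0 to 3 range check simply mirrors the "-v arg must be at most 3" error quoted earlier in this thread.

```python
def count_mismatches(read, ref):
    """Number of positions where an ungapped read differs from the reference."""
    if len(read) != len(ref):
        raise ValueError("read and ref must have the same length")
    return sum(a != b for a, b in zip(read, ref))


def pair_passes_v(mate1, ref1, mate2, ref2, v=1):
    """Apply a -v style end-to-end mismatch limit to each mate independently."""
    if not 0 <= v <= 3:
        raise ValueError("-v style limit must be between 0 and 3")
    # The budget is per mate, not shared: each mate may use up to v mismatches.
    return count_mismatches(mate1, ref1) <= v and count_mismatches(mate2, ref2) <= v
```

So with a limit of 1, a pair with one mismatch in each mate (two in total) is still acceptable, while a pair with two mismatches in a single mate is not.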

Hope that helps,
Ben
