**florian** · 01-07-2009, 09:42 AM

hi there

as of version 0.9.8 bowtie comes with the --unfa <filename> / --unfq <filename> options, which should be doing exactly what you are looking for

Bowtie: Manual

http://bowtie-bio.sourceforge.net/manual.html#algn_unfa

cheers,

florian

**xwu** · 01-07-2009, 09:48 AM

Florian, thanks a lot for your reply. I was using v0.9.7, and did not know there is a new version available. It is a very helpful function. Thanks.

**florian** · 01-07-2009, 09:53 AM

Yeah, I agree. However, there seems to be a problem with multi-threading, so be advised NOT to use the -p option in conjunction with it. Ben (the main developer) has told me he is already working on a solution, though, so this will only be a temporary drawback.

**Ben Langmead** · 01-07-2009, 10:35 AM

Thanks Florian! - Indeed, version 0.9.8.1 of Bowtie was just released minutes ago and it fixes all known issues with the --unfa and --unfq options. It's highly recommended that you use that version.

Thanks,
Ben

**Ben Langmead** · 01-07-2009, 10:50 AM

Hello Joachim,

Sorry for the delay in replying! Please see responses below.

Originally posted by joa_ds:

I don't quite understand what the 'nostrata' flag does

Without the --nostrata flag, Bowtie will only report alignments for the best "stratum" where alignments were found. By "best stratum", I mean the best category of alignment, categorized by number of mismatches in the seed region. Say you use -k 3 and a given read aligns once with 1 mismatch in the seed and twice with 2 mismatches in the seed. If you do *not* specify --nostrata (the default), then Bowtie will only report the single 1-mismatch hit. If you do specify --nostrata, Bowtie will report all 3 hits.

I'll make a note to add a clear example in the documentation for future releases.
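In the meantime, the stratum rule described above can be sketched in a few lines of Python. This is only a toy model of the reporting behaviour as explained in this post, not Bowtie's actual code; the function name `report`, the hit tuples, and the positions are made up for illustration, and hits are sorted by seed mismatches purely to make the output deterministic.

```python
def report(hits, k, nostrata=False):
    """Toy model of stratum filtering (hypothetical helper, not Bowtie code).

    hits: list of (position, seed_mismatches) tuples for one read.
    Returns at most k hits.
    """
    if not hits:
        return []
    if not nostrata:
        # Default behaviour: keep only the best stratum, i.e. the hits
        # with the fewest mismatches in the seed region.
        best = min(mm for _, mm in hits)
        hits = [h for h in hits if h[1] == best]
    # Sort by seed mismatches so the example output is deterministic.
    return sorted(hits, key=lambda h: h[1])[:k]


# The example from the post: one 1-mismatch hit and two 2-mismatch hits, -k 3.
hits = [("chr1:100", 1), ("chr2:500", 2), ("chr3:900", 2)]
default_report = report(hits, k=3)
nostrata_report = report(hits, k=3, nostrata=True)
print(default_report)
print(nostrata_report)
```

With the default setting only the single 1-mismatch hit survives; with `nostrata=True` all three hits are returned.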

I am only interested in fairly unique mappings; the rest can go to another file and I can look at them later. The --unfa flag moves unmapped seqs to a file, but when I use -m 3 I will discard seqs that map more than 3 times, right? Do those go to that same file, or are they lost forever? The idea is that I want to do a fast preliminary analysis, and I can remap those multimaps overnight or during a weekend when the server is not in use.

(Let me answer your question with respect to the just released 0.9.8.1 version of Bowtie, since version 0.9.8 had issues with --unfa and --unfq.)

As of 0.9.8.1 Bowtie supports --maxfa/--maxfq options that dump reads that exceed the -m limit to a separate file. If --maxfa/--maxfq is not specified but --unfa/--unfq is, then these reads are dumped to the same file as the reads that don't align at all.
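To make the routing rule above concrete, here is a small Python sketch of where a read ends up depending on which output options are given. It only models the behaviour described in this post; `route_read` and its return labels are hypothetical names, and what happens to reads when neither kind of file is requested is labelled "dropped" here as an assumption.

```python
def route_read(n_alignments, m_limit, have_max_file, have_un_file):
    """Toy routing model (hypothetical helper, not part of Bowtie).

    n_alignments:  number of places the read aligns
    m_limit:       value given to -m, or None if -m is not used
    have_max_file: True if --maxfa/--maxfq was specified
    have_un_file:  True if --unfa/--unfq was specified
    """
    if n_alignments == 0:
        # Reads that don't align at all go to the --unfa/--unfq file.
        return "un" if have_un_file else "dropped"
    if m_limit is not None and n_alignments > m_limit:
        # Reads exceeding -m go to --maxfa/--maxfq if given,
        # otherwise they fall back to the --unfa/--unfq file.
        if have_max_file:
            return "max"
        if have_un_file:
            return "un"
        return "dropped"  # assumption: not written anywhere
    return "aligned"


print(route_read(5, 3, have_max_file=True, have_un_file=True))   # max
print(route_read(5, 3, have_max_file=False, have_un_file=True))  # un
print(route_read(0, 3, have_max_file=True, have_un_file=True))   # un
print(route_read(2, 3, have_max_file=True, have_un_file=True))   # aligned
```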

If I use the -k 3 flag, I want to report 3 maps; will it take the first 3 it encounters? And if I use the --best flag, will it find all the possible maps and only report the best 3?

./bowtie -k 3 -m 10 --best --unfa MSC_bowtie_unal_fasta human_genome ../files/file.111.fastq MSC_bowtie

is the command I want to use. I hope it will find max 10 maps per sequence and report the best 3 (combining -k 3 and --best). Will this work? Just experimenting... In a later stage I will map everything (even x100 repeats) and output it to a db, so it doesn't really matter if it doesn't work; I'm just trying to understand the program completely.

Based on what you say, yes, that command will do what you intend. -k 3 --best should guarantee that you get up to three alignments of the "best" kind (best in terms of # of mismatches in the seed) and -m 10 will ensure that no alignments are reported for a read that aligns to more than 10 places. If you don't care whether the alignments come from the same stratum, then you should also use --nostrata.
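As a rough illustration of how -k, -m, and --best combine, here is a simplified Python model of the behaviour described above. It is a sketch under stated assumptions, not Bowtie's implementation: `report_alignments` is a hypothetical name, each alignment is reduced to its seed mismatch count, and the -m check is applied to the total number of places the read aligns.

```python
def report_alignments(seed_mismatches, k=3, m=10, nostrata=False):
    """Simplified model of -k/-m/--best reporting (hypothetical helper).

    seed_mismatches: one seed-mismatch count per place the read aligns.
    Returns the mismatch counts of the alignments that would be reported.
    """
    # -m: a read that aligns to more than m places reports nothing.
    if len(seed_mismatches) > m:
        return []
    # --best: consider alignments from fewest to most seed mismatches.
    ranked = sorted(seed_mismatches)
    # Without --nostrata, only the best stratum is eligible.
    if ranked and not nostrata:
        ranked = [mm for mm in ranked if mm == ranked[0]]
    # -k: report at most k alignments.
    return ranked[:k]


print(report_alignments([0, 0, 1, 2, 2]))                 # [0, 0]
print(report_alignments([0, 0, 1, 2, 2], nostrata=True))  # [0, 0, 1]
print(report_alignments([1] * 12))                        # []
```

So a read with five hits gets at most three reported, all from the best stratum unless `nostrata` is set, while a read with twelve hits is suppressed entirely.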

Hope that helps.

Thanks,
Ben

**xwu** · 01-12-2009, 09:33 AM

I should have read Ben's post earlier. I spent a lot of time finding out that the sequences were all reversed, which caused weird results. I will try out 0.9.8.1 now.

**doxologist** · 01-12-2009, 09:57 AM

Does Bowtie work with colorspace data?

**Ben Langmead** · 01-12-2009, 01:39 PM

Originally posted by doxologist:

Does Bowtie work with colorspace data?

Not yet - that's on the TODO list but we're going to tackle paired-end alignment and gapped alignments first.

Thanks,
Ben

**doxologist** · 01-12-2009, 01:52 PM

thanks. looking forward to it.

**zee** · 01-13-2009, 04:34 AM

Hi Ben/Florian

I was using bowtie and I wanted to compare the mapping qualities when converted to .map format. Would these mapping quality scores be comparable to those for MAQ results?

**florian** · 01-13-2009, 10:09 AM

hi zee

sorry if i gave you the wrong impression, but i'm actually not a developer of bowtie. i cannot claim any of this fame -- unfortunately :-D.
as to the question, i'd better leave the answer to ben, as i'm not sure about it either.

cheers,

f

**Cole Trapnell** · 01-13-2009, 01:25 PM

Originally posted by zee:

I was using bowtie and I wanted to compare the mapping qualities when converted to .map format. Would these mapping quality scores be comparable to that for MAQ results?

Bowtie's mapping qualities are only a rough approximation of Maq's. Maq computes mapping quality like this:

Q = min {q_2 - q_1 - 4.343log(n_2), 4 + (3-k')(q_bar - 14) - 4.343log(p_1(3-k,28)) }

Where:

q_1 is the quality-weighted Hamming distance of the best hit, q_2 is the quality-weighted Hamming distance of the second best hit, q_bar is the average quality value on the 5' end of the read, and "p1(k,28) is the probability that a perfect hit and a k-mismatch hit coexists given a 28bp sequence which can be estimated during alignment" (from the Maq paper).
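For readers who prefer code, the formula above can be written as a short Python function. This is just a transcription of the expression quoted in this post, not Maq's source code; the function and argument names are made up, the input values in the example are invented, and the log is assumed to be the natural logarithm (which is what makes the constant 4.343, roughly 10/ln 10, a conversion to the Phred scale).

```python
import math


def maq_mapping_quality(q1, q2, n2, k_prime, q_bar, p1):
    """Transcription of the Maq formula quoted above (hypothetical helper).

    q1:      quality-weighted Hamming distance of the best hit
    q2:      quality-weighted Hamming distance of the second best hit
    n2:      the n_2 term from the formula (must be > 0)
    k_prime: the k' term from the formula
    q_bar:   average quality value on the 5' end of the read
    p1:      the p_1(...) probability term, passed in precomputed
    """
    # Assumption: "log" is the natural log, so 4.343 * ln(x) = 10 * log10(x).
    first = q2 - q1 - 4.343 * math.log(n2)
    second = 4 + (3 - k_prime) * (q_bar - 14) - 4.343 * math.log(p1)
    return min(first, second)


# Invented inputs, for illustration only.
q = maq_mapping_quality(q1=0, q2=30, n2=1, k_prime=0, q_bar=30, p1=0.5)
print(round(q, 3))
```

Note that the first term needs a second best hit to exist at all, which is exactly the information Bowtie does not collect by default.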

Bowtie, as discussed previously in this thread, doesn't guarantee that it will find the best hit, and by default, won't even continue searching for the second best one. So Bowtie can't really compute mapping quality this way. Instead, our Maq converter (which was derived from Heng Li's ELAND converter) calculates mapping qualities as follows:

Q = (3 - k) * 25 - log(# of other equally good occurrences found by Bowtie).

Where k is the number of mismatches in the seed region of the alignment. This is not as nice as Maq's method. However, it works without forcing Bowtie to be used in one of its slower modes (--all, for example). In our tests so far, Maq's assembler handles qualities computed this way pretty well and produces good SNP calls.
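A minimal Python sketch of that converter formula, for illustration only. It is not the converter's actual code: `bowtie_maq_quality` is a made-up name, the log base is assumed to be natural, and because log(0) is undefined the sketch assumes the log term is simply dropped when there are no other equally good occurrences.

```python
import math


def bowtie_maq_quality(k, n_other):
    """Sketch of Q = (3 - k) * 25 - log(n_other) (hypothetical helper).

    k:       number of mismatches in the seed region of the alignment
    n_other: number of other equally good occurrences found
    """
    q = (3 - k) * 25
    # Assumption: skip the log term when there are no other occurrences,
    # since log(0) is undefined; natural log is also an assumption.
    if n_other > 0:
        q -= math.log(n_other)
    return q


for k in range(3):
    print(k, bowtie_maq_quality(k, 0), round(bowtie_maq_quality(k, 5), 2))
```

Under these assumptions the seed mismatch count dominates the score (25 points per mismatch), while the number of equally good occurrences only shifts it slightly.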

**zee** · 01-13-2009, 09:30 PM

Cole,

Thanks for that clarification. I found that some work I was doing with bowtie seemed to produce a lot more results than I expected, even when I used the '--best' option for reporting hits. When I looked at how it stacked up against some other aligners, I felt that there was some kind of overestimation when using the mapping quality score to evaluate good-quality SNPs and correct alignments.
I will just need to be cautious about how I interpret bowtie's results with respect to metrics derived from other aligners.

**foram** · 01-20-2009, 04:50 PM

Does anyone have any recommendations for scoring params when mapping long (76bp Illumina) reads?

Also, my reads are PE -- any chance this will be supported soon?

**Ben Langmead** · 01-21-2009, 06:47 AM

Hi Foram,

I would try upping both the seed length (-l) and the error tolerance (-e). Others may have better suggestions, though. If you find parameters you're happy with, please do post them back here since that will help others.

I'm working on paired-end support currently. Expect it in a few weeks or so.

Thanks,
Ben
