Bowtie, an ultrafast, memory-efficient, open source short read aligner

inesdesantiago replied

06-30-2009, 01:57 AM
Hello,

Originally posted by kcook

Hi all,

I'm using Bowtie to map some RNA-seq data, and I wanted to clarify my understanding of a couple points.

The behaviour of -m 1 with default (0.10.0) parameters will only report results for which there is only one alignment anywhere within the 2-mismatch limit, right? So if there is an alignment with one mismatch and one with two, nothing will be reported. And if --strata is on, then the one-mismatch alignment will be reported (as long as there is only a single alignment with one mismatch). Is that all correct?

Also, the rounding of quality values to between 10 and 30 means that there is no combination of two mismatches that give a total quality score of 70, so in effect the quality scores only affect the order of the results returned (which doesn't apply when -m 1 is on anyway). Have I got that right?

Thanks a lot, and I apologize if any of this is explained in the manual or otherwise obvious.

Kate

kcook, you enlightened me! Now I understand -m 1 better!
kcook replied

06-29-2009, 09:15 AM
Great! Thanks for the quick reply.
Ben Langmead replied

06-29-2009, 09:13 AM
Hi Kate,

Originally posted by kcook

Hi all,
The behaviour of -m 1 with default (0.10.0) parameters will only report results for which there is only one alignment anywhere within the 2-mismatch limit, right? So if there is an alignment with one mismatch and one with two, nothing will be reported. And if --strata is on, then the one-mismatch alignment will be reported (as long as there is only a single alignment with one mismatch). Is that all correct?

Yes - that's all correct.
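
As a rough illustration of the behaviour confirmed above, here is a minimal Python sketch of how -m 1 treats a read whose alignments fall in different mismatch strata. This is not Bowtie's actual code: the function name `report_with_m` and the idea of passing in a list of per-alignment mismatch counts are hypothetical, chosen only to make the rule concrete.

```python
def report_with_m(mismatch_counts, m=1, strata=False):
    """Return the mismatch counts of the alignments that would be reported.

    mismatch_counts holds one entry per valid alignment of a single read.
    This is a hypothetical input for illustration; Bowtie itself searches
    an index rather than receiving such a list.
    """
    if not mismatch_counts:
        return []
    if strata:
        # With --strata, only the best stratum (fewest mismatches) counts.
        best = min(mismatch_counts)
        candidates = [c for c in mismatch_counts if c == best]
    else:
        # Without --strata, every valid alignment counts toward the limit.
        candidates = list(mismatch_counts)
    # -m suppresses all alignments for the read if more than m exist.
    return candidates if len(candidates) <= m else []


print(report_with_m([1, 2]))                   # []
print(report_with_m([1, 2], strata=True))      # [1]
print(report_with_m([1, 1, 2], strata=True))   # []
```

The first call is Kate's example (one alignment with one mismatch, one with two): nothing is reported. With `strata=True` the one-mismatch alignment is reported, unless a second one-mismatch alignment exists.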

Also, the rounding of quality values to between 10 and 30 means that there is no combination of two mismatches that give a total quality score of 70, so in effect the quality scores only affect the order of the results returned (which doesn't apply when -m 1 is on anyway). Have I got that right?

The quality ceiling only applies in the -n ("Maq-like") alignment mode. So your statement is still correct for -v 2, but it's also the case that in -v 3 mode, alignments with combined mismatch qualities exceeding 70 are valid.
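
The arithmetic behind this can be checked with a small Python sketch. It assumes, following Kate's description, that rounded quality values can only be 10, 20 or 30, and it uses 70 as the total discussed in this thread; `ROUNDED_QUALS`, `CEILING` and `possible_totals` are names made up for the illustration, not Bowtie parameters.

```python
from itertools import combinations_with_replacement

ROUNDED_QUALS = (10, 20, 30)  # assumed set of rounded quality values
CEILING = 70                  # total quality discussed in the thread


def possible_totals(n_mismatches):
    """All distinct sums of rounded qualities over n mismatched positions."""
    combos = combinations_with_replacement(ROUNDED_QUALS, n_mismatches)
    return sorted({sum(c) for c in combos})


print(possible_totals(2))                                 # [20, 30, 40, 50, 60]
print([t for t in possible_totals(3) if t > CEILING])     # [80, 90]
```

Under these assumptions two mismatches can total at most 60, so a ceiling of 70 never excludes a two-mismatch alignment. Three mismatches can total 80 or 90, which is the case where a ceiling would matter in -n mode but not in -v 3 mode.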

I hope that helps,
Ben
kcook replied

06-29-2009, 09:04 AM
Hi all,

I'm using Bowtie to map some RNA-seq data, and I wanted to clarify my understanding of a couple points.

The behaviour of -m 1 with default (0.10.0) parameters will only report results for which there is only one alignment anywhere within the 2-mismatch limit, right? So if there is an alignment with one mismatch and one with two, nothing will be reported. And if --strata is on, then the one-mismatch alignment will be reported (as long as there is only a single alignment with one mismatch). Is that all correct?

Also, the rounding of quality values to between 10 and 30 means that there is no combination of two mismatches that give a total quality score of 70, so in effect the quality scores only affect the order of the results returned (which doesn't apply when -m 1 is on anyway). Have I got that right?

Thanks a lot, and I apologize if any of this is explained in the manual or otherwise obvious.

Kate
ewingad replied

06-23-2009, 11:34 AM
Originally posted by Ben Langmead

Absolutely, as long as you're using -k 2 in an unstratified reporting mode (the default in 0.10.0). Obviously, stratified -k 2 is not a good proxy for unstratified -m 1.

I would be surprised if unstratified -k 2 performed all that differently from unstratified -m 1, since what's going on under the hood is essentially the same. Do you have an example where it is? If so, I should take a look.

Ben

Actually now that I benchmark it, -m 1 is slightly faster than -k 2 using 0.10.0.

-Adam
Ben Langmead replied

06-23-2009, 10:39 AM
Originally posted by ewingad

Would it also be valid to use the -k 2 option and throw out reads for which two alignments are reported? This is slower than alignment against a masked genome but faster than -m 1.

Absolutely, as long as you're using -k 2 in an unstratified reporting mode (the default in 0.10.0). Obviously, stratified -k 2 is not a good proxy for unstratified -m 1.

I would be surprised if unstratified -k 2 performed all that differently from unstratified -m 1, since what's going on under the hood is essentially the same. Do you have an example where it is? If so, I should take a look.
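
For anyone who wants to try the -k 2 approach, the post-filtering step could look something like the Python sketch below. It is only a sketch under stated assumptions: it assumes tab-separated alignment lines with the read name in the first column, the helper name `unique_alignments` is hypothetical, and the example records are invented for illustration.

```python
from collections import Counter


def unique_alignments(lines):
    """Keep only alignment lines whose read name appears exactly once.

    Assumes tab-separated lines with the read name in the first column;
    adjust the split if your output format differs.
    """
    lines = [ln for ln in lines if ln.strip()]  # ignore blank lines
    counts = Counter(ln.split("\t", 1)[0] for ln in lines)
    return [ln for ln in lines if counts[ln.split("\t", 1)[0]] == 1]


# Invented toy records: read2 has two alignments and is thrown out.
example = [
    "read1\t+\tchr1\t100",
    "read2\t+\tchr1\t200",
    "read2\t-\tchr2\t300",
    "read3\t-\tchr3\t400",
]
for line in unique_alignments(example):
    print(line)
```

This holds all lines in memory, which is fine for a quick check but may need a streaming or sort-based variant for very large outputs.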

Ben
ewingad replied

06-23-2009, 10:32 AM
Originally posted by Ben Langmead

Note that a way to make this alignment scenario (-m 1 without --best --strata) far more efficient is to use a repeat-masked reference index and omit the -m 1 option.

Ben

Would it also be valid to use the -k 2 option and throw out reads for which two alignments are reported? This is slower than alignment against a masked genome but faster than -m 1.
Ben Langmead replied

06-17-2009, 06:22 AM
Originally posted by chuck

I wanted to report that bowtie never 'finishes', i.e. it never returns the command-line prompt, and 'top' reports it as still active even though it has not written anything to file in a long time.

Hi Chuck,

Please post the exact Bowtie version and arguments you're using. Also, please let me know if you see this problem when you use the latest version of Bowtie (0.10.0).

Thanks,
Ben
Ben Langmead replied

06-17-2009, 05:27 AM
Originally posted by inesdesantiago

When I set “-m 1”, bowtie now suppresses all alignments for any read with more than one alignment (whereas previously it didn’t suppress any alignments). I believe that, by setting a limit with -m, bowtie has to process more information and thus takes more time.

Note that a way to make this alignment scenario (-m 1 without --best --strata) far more efficient is to use a repeat-masked reference index and omit the -m 1 option.

Ben
chuck replied

06-16-2009, 11:33 PM
Hi Ben,

I wanted to report that bowtie never 'finishes', i.e. it never returns the command-line prompt, and 'top' reports it as still active even though it has not written anything to file in a long time.

I first saw this on one machine and thought something was wrong with my install, but now I have seen it on two machines. Are you familiar with this? I suppose it could still be something about my install.

I am running it in Ubuntu 8.04 on an IBM Intellistation (64-bit machine).

Chuck
inesdesantiago replied

06-16-2009, 05:45 PM
Faster search for Bowtie

Hi lh3, you are right. The speed depends on the information it has to report.

For instance, I ran bowtie with the parameter -m set to 1 and it took 3 hours, whereas previously it took 15 minutes. I think it is very impressive that bowtie can do the alignment in 15 minutes. When I set “-m 1”, bowtie now suppresses all alignments for any read with more than one alignment (whereas previously it didn’t suppress any alignments). I believe that, by setting a limit with -m, bowtie has to process more information and thus takes more time.

Originally posted by lh3

The speed of Eland/Maq will remain the same if we do not ask them to report the counts because they check them anyway, but the speed of Bowtie/SOAP2/BWA will be reduced a lot.

Perhaps MAQ should have some more optional parameters. For example, we could choose to have all the reports we have now in MAQ, or we could choose to make it faster with fewer counts, and so on. This would be great.

Last edited by inesdesantiago; 06-16-2009, 05:50 PM.
ieuanclay replied

06-16-2009, 08:03 AM
I'll have a look, and no, I don't mind sharing, as long as you promise not to laugh!

Ieuan
Ben Langmead replied

06-16-2009, 05:09 AM
Hi Ieuan,

If you don't mind sharing it, sure, please send it along. Note that the --mm option in the 0.10.0 release of Bowtie might be helpful to you if (a) you're running many concurrent bowtie processes that search against the same large index, and (b) memory is tight.

Thanks,
Ben
Ben Langmead replied

06-16-2009, 05:03 AM
Originally posted by ieuanclay

Also, I just noticed a typo in the manual (0.9.9.3): in the --fr/--rf/--ff docs, you refer to --ff as --ll. Maybe this isn't a typo and I am just being really dumb...

You're absolutely right. Sorry about that. I just fixed it on the web version of the manual and in the Bowtie repository. The fix to the MANUAL file in the Bowtie download will be reflected in the next release (after 0.10.0).

Thanks,
Ben