Bowtie, an ultrafast, memory-efficient, open source short read aligner

ieuanclay replied

06-16-2009, 05:02 AM
Great, thanks. I am writing an old skool Perl GUI wrapper to handle all the input, fork off several alignment runs (separate from the internal forking that Bowtie already does, i.e. if I want to align 4 files, it will run them as parallel children, keeping any output separate), and save/load the parameters I used. It is really only for my own use, but I can send it to you if you are interested. Please tell me to bugger off if I am stepping on toes!

Ieuan
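For readers who want to try the same idea, here is a minimal Python sketch of the "one child process per read file, separate output per child" pattern described above. It is not Ieuan's Perl wrapper. The function names, the `.map` output suffix, and the `extra_args` parameter are assumptions made for illustration, and the command shape (`bowtie [options] <index> <reads>`) is taken from the invocation quoted elsewhere in this thread, with alignments captured from standard output.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def plan_jobs(bowtie, index, read_files, out_dir, extra_args=()):
    """Return one (command, output_path) pair per read file."""
    jobs = []
    for reads in read_files:
        # Hypothetical naming scheme: <out_dir>/<read file stem>.map
        out_path = Path(out_dir) / (Path(reads).stem + ".map")
        cmd = [bowtie, *extra_args, index, str(reads)]
        jobs.append((cmd, out_path))
    return jobs


def run_job(job):
    """Run one alignment as a child process, writing its stdout to its own file."""
    cmd, out_path = job
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w") as out:
        return subprocess.run(cmd, stdout=out).returncode


def run_all(jobs, max_children=4):
    """Run up to max_children alignments at once; return exit codes in job order."""
    with ThreadPoolExecutor(max_workers=max_children) as pool:
        return list(pool.map(run_job, jobs))
```

Threads are enough here because each worker only waits on its child process. Note that two read files with the same stem would map to the same output path under this naming scheme.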
Ben Langmead replied

06-16-2009, 04:56 AM
Hi Ieuan,

Originally posted by ieuanclay

Quick question - if you supply paired end arguments (--ff / -I / etc...) , but only supply a <singles> file (not <mates1/2>), will the PE args just be ignored?

Yes, they will.

Originally posted by ieuanclay

Similarly, if I specify --maxbts 400 (for example) as well as --best, will the --best overrule the --maxbts?

--best and --maxbts are compatible, so, no, --best does not overrule it. -y and --maxbts are mutually exclusive. If both are specified, -y will prevail.

Thanks,
Ben
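A wrapper script could encode the two rules from Ben's answer so that saved parameter sets reflect what is actually in effect. The sketch below is only an illustration of those rules, not Bowtie's own option handling. The function name, the dictionary representation of options, and the list of paired-end flags (limited to the ones named in this thread) are assumptions.

```python
# Paired-end flags mentioned in this thread; the real list may be longer.
PAIRED_ONLY = ("--fr", "--rf", "--ff", "-I")


def effective_options(opts, paired):
    """Return (effective_opts, notes) after applying the rules Ben describes.

    opts maps a flag to its value (True for flags that take no value).
    paired is True when mate files are supplied, False for a singles file.
    """
    effective = dict(opts)
    notes = []
    if not paired:
        for flag in PAIRED_ONLY:
            if flag in effective:
                del effective[flag]
                notes.append(f"{flag} ignored: no mate files given")
    # -y and --maxbts are mutually exclusive, and -y prevails.
    if "-y" in effective and "--maxbts" in effective:
        del effective["--maxbts"]
        notes.append("--maxbts ignored: -y takes precedence")
    # --best and --maxbts are compatible, so nothing is dropped for that pair.
    return effective, notes
```

For example, `effective_options({"--ff": True, "--best": True, "--maxbts": 400}, paired=False)` keeps `--best` and `--maxbts` and reports that `--ff` is ignored.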
ieuanclay replied

06-15-2009, 10:30 AM
Also, I just noticed a typo in the manual (0.9.9.3): in the --fr/--rf/--ff docs, --ff is written as --ll. Maybe this isn't a typo and I am just being really dumb...

Ieuan
ieuanclay replied

06-15-2009, 10:18 AM
Hi Ben,

Quick question - if you supply paired end arguments (--ff / -I / etc...) , but only supply a <singles> file (not <mates1/2>), will the PE args just be ignored?

Similarly, if I specify --maxbts 400 (for example) as well as --best, will the --best overrule the --maxbts?

Cheers,

Ieuan

Last edited by ieuanclay; 06-15-2009, 10:39 AM.
lh3 replied

06-15-2009, 06:26 AM
In fact, Bowtie, as well as the other two BWT-based aligners, gives less information than Eland and Maq: it omits information on suboptimal hits (e.g. the count of 1-mismatch and 2-mismatch hits). This is one of the key factors that make Bowtie faster. The speed of Eland/Maq will remain the same if we do not ask them to report the counts, because they check them anyway, but the speed of Bowtie/SOAP2/BWA would be reduced a lot if we did ask for them. They would probably be slower than Eland if we asked them to always do this counting for 32bp reads. Fortunately, the count of 2-mismatch hits is not frequently used, and using this information or not does not affect SNP accuracy much (though it will affect it a little). In short, BWT-based aligners trade some minor information (and a little bit of accuracy as well) for a great speedup, given high-quality reads. All in all, it is worth trying BWT-based aligners.
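To make "the count of 1-mismatch and 2-mismatch hits" concrete, here is a toy brute-force counter in Python. It is only an illustration of the quantity being discussed, not how Eland, Maq, or any BWT-based aligner computes it. It scans the forward strand only, and the function name is made up for this example.

```python
def count_hits_by_mismatches(reference, read, max_mismatches=2):
    """Return a list where entry k is the number of positions in reference
    at which read aligns (ungapped, forward strand) with exactly k mismatches."""
    counts = [0] * (max_mismatches + 1)
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(window, read))
        if mismatches <= max_mismatches:
            counts[mismatches] += 1
    return counts
```

For the reference `ACGTACGAACGT` and the read `ACGT`, this returns `[2, 1, 0]`: two exact hits, one 1-mismatch hit, and no 2-mismatch hits. An aligner that stops at the best hit reports the exact match without ever learning the other two numbers.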
Ben Langmead replied

06-15-2009, 05:28 AM
Hi Ines,

No, less memory != fast search. Not necessarily. (If the difference in footprint allows the problem to fit into a faster memory, then yes, less memory == fast search.) In this case, the BWT technique offers a combination of small memory footprint (far smaller than suffix arrays, suffix trees, and smaller than hash tables when the tables are built over the reference genome), and good performance. People often ask why BWT is faster than hash tables in certain situations, and it's hard to answer because so much depends on exactly what hash-based tool you're comparing against and what the reads and alignment policy look like. I suspect it chiefly comes down to minimizing cache misses and minimizing wasted work.

Thanks,
Ben
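For anyone curious what searching a BWT index looks like, below is a toy Python sketch of exact-match counting by backward search over a Burrows-Wheeler transform. It is a textbook illustration under simplifying assumptions, not Bowtie's implementation: it builds the suffix array by naive sorting and stores a full occurrence table, whereas a practical index would sample these structures to keep the footprint small. All names are made up for this example.

```python
from collections import Counter


def build_index(text):
    """Build a toy BWT index over text (which must not contain '$')."""
    text += "$"  # sentinel that sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    # first[c]: number of characters in text that sort before c
    char_counts = Counter(text)
    first, total = {}, 0
    for c in sorted(char_counts):
        first[c] = total
        total += char_counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in char_counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return {"first": first, "occ": occ, "size": len(bwt)}


def count_exact(pattern, index):
    """Count exact occurrences of pattern, consuming it right to left."""
    lo, hi = 0, index["size"]
    for c in reversed(pattern):
        if c not in index["first"]:
            return 0
        lo = index["first"][c] + index["occ"][c][lo]
        hi = index["first"][c] + index["occ"][c][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

Each pattern character costs two table lookups regardless of the reference length, and the loop stops as soon as the range becomes empty, which is one way to picture the "minimizing wasted work" point above.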
inesdesantiago replied

06-13-2009, 07:24 AM
Bowtie BWT indexing

Thanks Ben!
I see that the BWT-based indexing of the reference genome is a great advantage. It allows Bowtie to do its searches with very small memory footprint. But does it mean that, because it uses less memory to index the reference genome, it will be faster? Is less memory == Fast Search?
Ines

Last edited by inesdesantiago; 06-13-2009, 07:26 AM. Reason: typo
Ben Langmead replied

06-12-2009, 07:27 PM
Hi Ines,

The Bowtie paper has details about the algorithm. You can find more visual discussions in the slides linked to from the Bowtie website (see Other Documentation section in the right-hand sidebar).

Thanks,
Ben
inesdesantiago replied

06-12-2009, 04:44 PM
Why is Bowtie Fast?

I am very impressed with Bowtie!
It is mega-ultra-fast, and runs on my [windows] laptop!

Does anyone know why it is so fast compared with Eland and MAQ, which do exactly the same thing?
These informatics 'tricks' are just what we need to handle such an amount of data.
I would like to apply the principles of bowtie to my own scripts, but have no idea what makes it so fast!

Any comments?
Thanks
Ines de Santiago

Last edited by inesdesantiago; 06-12-2009, 04:46 PM. Reason: typo
Ben Langmead replied

06-10-2009, 01:05 PM
Originally posted by polsum

Hey Ben, another question. When I try to execute "/bowtie-0.9.9.3/bowtie e_coli reads/e_coli_1000.fq" on my Mac, I get a response like this: "Warning: Could not open file "reads/e_coli_1000.fq" for reading". What could be the reason for this? I downloaded "bowtie-0.9.9.3-bin-macos-10.5-i386.zip" and my Mac runs OS X 10.5.6 on Intel.

thanks in advance.

Hi polsum,

Does the "reads/e_coli_1000.fq" file exist, relative to your current working directory when you issue that command?

Ben
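A quick way to answer that question is to resolve the relative path against the directory the command is run from and check whether a file is really there. The helper below is a small Python sketch for that check. The function name is made up, and it says nothing about how Bowtie itself opens files.

```python
import os


def explain_read_path(path, cwd=None):
    """Return (resolved_path, exists) for path, interpreted relative to cwd.

    cwd defaults to the current working directory, which is what a
    relative path on the command line is resolved against.
    """
    cwd = cwd or os.getcwd()
    resolved = os.path.normpath(os.path.join(cwd, path))
    return resolved, os.path.isfile(resolved)
```

Running `explain_read_path("reads/e_coli_1000.fq")` from the same directory as the failing command shows the full path being looked up. If `exists` is `False`, either change into the directory that contains `reads/` or pass the full path to the reads file.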
Ben Langmead replied

06-10-2009, 01:04 PM
Originally posted by -daf-

Sorry for the inconvenience; I have succeeded using the Linux ftp command.

Hi daf,

I've heard that complaint from others as well. I think that the unzip programs on some platforms (e.g. Mac) cannot necessarily handle extracting > 2 GB archives. I went ahead and split each of the large archives into two parts. See the Bowtie page for changes.

Thanks,
Ben
-daf- replied

06-10-2009, 04:03 AM
Originally posted by -daf-

Hello, thanks for Bowtie.
I have a problem downloading the Bowtie index for the human genome from ftp://ftp.cbcb.umd.edu/pub/data/bowt...s_asm.ebwt.zip. I have no problem with smaller indexes such as g_gallus.ebwt.zip.
Is it possible to split the file for downloading?

Sorry for the inconvenience; I have succeeded using the Linux ftp command.
chuck replied

06-10-2009, 01:04 AM
PET as SET

Originally posted by Ben Langmead

Given that, would you rather Bowtie rejected your 1-bp reads in paired-end mode (as it currently does in unpaired mode), or would you rather Bowtie accepted (but skipped) your 1-bp reads in unpaired mode? My feeling is that Bowtie should at least print a warning by default in both cases, since 1-bp reads are usually a sign that something went wrong upstream of the aligner. If there's a good reason why 1-bp reads should be tolerated, then maybe Bowtie should also provide a command-line option that suppresses the warning in cases where the user would like to tolerate it.

Ben

Ben, thanks for the reply. I agree with you - no, there is no compelling reason that 1 bp reads should be accepted. They do not add anything to the alignment of these short reads but it would be useful if they were just skipped and a warning was printed. Currently, the alignment fails completely.

Oh, one more thing I forgot to mention: when I converted the PET files to a 'raw' format, I actually replaced every "." in the original fa file with "N". This might also be the reason it worked, if Bowtie counts an N as a base (just an unknown one) but treats "." as a missing position.

Thanks again!

Chuck
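Until the aligner skips such reads itself, the two workarounds mentioned here (replacing "." with "N", and dropping 1-bp reads with a warning) can be applied in a small preprocessing step. The Python sketch below assumes 'raw' input with one sequence per line. The function name and the `min_length` parameter are made up for this example.

```python
import sys


def clean_raw_reads(lines, min_length=2, warn=sys.stderr):
    """Return cleaned sequences from raw reads (one sequence per line).

    Every '.' is replaced with 'N'; reads shorter than min_length are
    skipped and a warning naming the input line is written to warn.
    """
    kept = []
    for number, line in enumerate(lines, start=1):
        seq = line.strip().replace(".", "N")
        if len(seq) < min_length:
            print(f"Warning: skipping read {number} (length {len(seq)})", file=warn)
            continue
        kept.append(seq)
    return kept
```

This is only safe as written for unpaired input. For paired-end files, dropping a read from one file without dropping its mate from the other would put the two files out of step, so both mates would need to be removed together.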
polsum replied

06-09-2009, 11:35 AM
Hey Ben, another question. When I try to execute "/bowtie-0.9.9.3/bowtie e_coli reads/e_coli_1000.fq" on my Mac, I get a response like this: "Warning: Could not open file "reads/e_coli_1000.fq" for reading". What could be the reason for this? I downloaded "bowtie-0.9.9.3-bin-macos-10.5-i386.zip" and my Mac runs OS X 10.5.6 on Intel.

thanks in advance.
polsum replied

06-09-2009, 10:32 AM
Originally posted by Ben Langmead

For now, the way to do that is via options like -k/-a/--nostrata/-m. You can count the number of alignments from the output bowtie generates.

Bowtie aligns the entire read with a certain number of mismatches.

Bowtie's job is to find legal alignments subject to the constraints imposed by the alignment and reporting policies specified by the user (see manual for info about -k/-m/-a/--nostrata, etc). Any additional filtering you might want to perform will have to be done externally, say, in a script.

No - you'll have to do vector trimming ahead of time.

Hope that helps,
Ben

Thanks a lot for the replies.
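As a sketch of the external counting Ben suggests, the Python snippet below tallies alignments per read from Bowtie's output. It assumes the output is tab-delimited with the read name in the first field and one alignment per line. Check this against the manual for the version and output options in use. The function name is made up for this example.

```python
from collections import Counter


def alignments_per_read(lines):
    """Count output lines per read name (first tab-separated field)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        read_name = line.split("\t", 1)[0]
        counts[read_name] += 1
    return counts
```

With an output file produced under a reporting option such as -a or -k, `alignments_per_read(open("hits.map"))` (where `hits.map` is a placeholder file name) gives the number of reported alignments for each read, which can then be filtered in the same script.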