Hello all,
If you work with large genomes and large sets of short reads, please
take a look at Bowtie (http://bowtie-bio.sf.net), a new open-source
short read aligner written by Cole Trapnell and me at the
University of Maryland. Bowtie is an ultrafast, memory-efficient short
read aligner. It aligns short reads to the human genome at a rate of 25
million reads per hour on a typical workstation with 2 gigabytes of
memory. Bowtie indexes the genome with a Burrows-Wheeler index to keep
its memory footprint small: about 1.3 GB for the human genome. It
supports alignment policies equivalent to Maq and SOAP, but at much
greater speeds.
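
For readers who have not met a Burrows-Wheeler index before, here is a toy Python sketch of the general idea: build the transform of a reference string, then count exact occurrences of a read by backward search. This is only an illustration of the technique named above, not Bowtie's actual code. The function names (bwt_index, count_exact) are made up for this example, and a real index of a genome would be built and stored in a far more compact way than sorting full rotations in memory.

```python
def bwt_index(text):
    """Build a toy Burrows-Wheeler index for `text`.

    Returns (bwt, c_table), where c_table[ch] is the number of characters
    in the text that sort strictly before ch. Hypothetical helper, not
    part of Bowtie.
    """
    text += "$"  # sentinel; sorts before A, C, G, T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    bwt = "".join(rot[-1] for rot in rotations)

    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1

    c_table = {}
    total = 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    return bwt, c_table


def count_exact(pattern, bwt, c_table):
    """Count exact occurrences of `pattern` using backward search."""
    lo, hi = 0, len(bwt)  # half-open range of matching sorted rotations
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        # Naive occurrence counting; real indexes precompute these ranks.
        lo = c_table[ch] + bwt[:lo].count(ch)
        hi = c_table[ch] + bwt[:hi].count(ch)
        if lo >= hi:
            return 0
    return hi - lo


bwt, c_table = bwt_index("ACGTACGTTACG")
print(count_exact("ACG", bwt, c_table))
print(count_exact("GGG", bwt, c_table))
```

The search touches only the transformed string and a small table, which is why this family of indexes can stay small relative to the reference.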
As a denizen of these forums, you probably appreciate that there are
now many, many short read aligners to choose from. Our goal with
Bowtie was to exploit an algorithmic "sweet spot" to bring ultrafast
read alignment to typical desktop computers. These days, a typical
desktop has 2 or 4 gigabytes of RAM and multiple (2 or 4) processor
cores. I recently used Bowtie on my own 4-core, 2 GB desktop to align
14.3x coverage worth of Illumina/Solexa reads from the 1000-Genomes
project to the human genome in a single overnight run (14 hours). This is
significantly faster than both Eland and ZOOM, and makes it much easier
and faster to extract biological evidence from these huge datasets.
Bowtie is actively being developed and maintained, so if you are
interested, please check our site regularly. Here is a brief feature
list:
- Extremely fast!
- Specify any number of parallel search threads with -p (uses pthreads) to exploit multiple processor cores
- Small index: for human, memory footprint is ~1.3GB (with -z option), size on disk is ~2.2GB
- Pre-built indexes available from website: http://bowtie-bio.sf.net
  - Human, chimp, dog, mouse, rat, chicken, A. thaliana, fruit fly, etc.
- Input formats: FASTA, FASTQ, FASTQ w/ Solexa quals, raw, command-line
- Includes tool to convert Bowtie output to a Maq .map file so that you can use Bowtie's output with, e.g., 'maq assemble' and 'maq cns2cnp'
- Use -n option to activate a Maq-like policy
  - N (set with -n) mismatches allowed in first L (set with -l) bases
  - Sum of quality values at mismatched positions may not exceed E (set with -e)
- Use -v option to activate a SOAP-like policy
  - V (set with -v) mismatches allowed in the whole alignment
  - Quality values are ignored
- Flexible reporting:
  - Use -k to report K alignments
  - Use -a to report all alignments
  - Use --best to guarantee that the alignment(s) reported are "best" in terms of # of mismatches
  - These come at a cost to speed! See manual for details.
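
To make the two alignment policies in the list above concrete, here is a small Python sketch that checks whether a single ungapped read-to-reference placement would be acceptable under each rule as stated. It is an illustration only, not Bowtie's code: the function names are hypothetical, the read, qualities, and parameter values are made up, and any details the list does not mention (for example, how quality values are scaled) are left out.

```python
def mismatch_positions(read, ref):
    """Return the 0-based positions where read and reference window differ."""
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    return [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]


def maq_like_ok(read, quals, ref, n, l, e):
    """Maq-like (-n) rule as described in the list above:
    at most n mismatches in the first l bases, and the sum of quality
    values at all mismatched positions is at most e."""
    mms = mismatch_positions(read, ref)
    seed_mms = sum(1 for i in mms if i < l)
    qual_sum = sum(quals[i] for i in mms)
    return seed_mms <= n and qual_sum <= e


def soap_like_ok(read, ref, v):
    """SOAP-like (-v) rule: at most v mismatches anywhere; qualities ignored."""
    return len(mismatch_positions(read, ref)) <= v


# Made-up example: mismatches at positions 3 and 9, every base has quality 30.
read = "ACGTACGTAC"
ref = "ACGAACGTAT"
quals = [30] * len(read)

print(maq_like_ok(read, quals, ref, n=1, l=5, e=70))  # one seed mismatch, sum 60
print(maq_like_ok(read, quals, ref, n=1, l=5, e=50))  # sum 60 exceeds 50
print(soap_like_ok(read, ref, v=1))                   # two mismatches in total
```

Note how the same placement can pass one policy and fail the other: the Maq-like rule only limits the mismatch count in the first L bases but weighs every mismatch by quality, while the SOAP-like rule counts mismatches over the whole read and ignores quality.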
As mentioned in the "Software packages for next gen sequence analysis"
thread, Bowtie does not yet support paired-end alignment or indels.
Both features are very much on our to-do list, though, so please keep
an eye out for new versions over the coming months.
Thanks very much!
Ben Langmead