Bowtie, an ultrafast, memory-efficient, open source short read aligner

-daf- replied

06-09-2009, 03:33 AM
Hello, thanks for Bowtie.
I'm having a problem downloading the Bowtie index for the human genome from ftp://ftp.cbcb.umd.edu/pub/data/bowt...s_asm.ebwt.zip. I have no problem with smaller indexes such as g_gallus.ebwt.zip.
Is it possible to split the file for downloading?
Ben Langmead replied

06-08-2009, 06:48 PM
Hi Chuck,

OK - so you do have 1-bp reads. That explains the error in unpaired mode. Given that, would you rather Bowtie rejected your 1-bp reads in paired-end mode (as it currently does in unpaired mode), or would you rather Bowtie accepted (but skipped) your 1-bp reads in unpaired mode? My feeling is that Bowtie should at least print a warning by default in both cases, since 1-bp reads are usually a sign that something went wrong upstream of the aligner. If there's a good reason why 1-bp reads should be tolerated, then maybe Bowtie should also provide a command-line option that suppresses the warning in cases where the user would like to tolerate it.
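
Until the aligner's behavior is settled, one workaround is to screen out very short reads before alignment. The following is only a sketch, not part of Bowtie: the function names are made up for illustration, and the default cutoff of 2 is an assumption based on the "less than 2 base pairs" error reported in this thread.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_by_length(records: Iterable[Record], min_len: int = 2) -> Tuple[List[Record], List[Record]]:
    """Separate reads into (kept, rejected) using a minimum length.

    min_len=2 is an assumed cutoff, not a documented Bowtie constant.
    """
    kept, rejected = [], []
    for name, seq in records:
        (kept if len(seq) >= min_len else rejected).append((name, seq))
    return kept, rejected
```

For paired-end files the same decision would have to be applied to both mates of a pair, otherwise the two files fall out of sync.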

Ben
chuck replied

06-06-2009, 02:24 PM
PET as SET

Hi Ben,

I've tried this for a number of different files and the result is always the same.

Yes, there are reads that only have a single base but in PET mode, it skips them. There is a long list of errors as it rejects short reads but it does the alignment job.

In singles mode, it seems to hit the first error and quit.

Perhaps that is the difference? How it deals with the error?

What's the best way to send them to you? I guess I could just take the first few thousand reads of each pair along with a reference? That should do it and avoid sending massive data files.
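
A small test set like that could be cut with something along these lines. This is a rough sketch: head_fasta is a hypothetical helper, and it assumes both mate files list their reads in the same order, so that taking the same number of records from each keeps the pairs matched.

```python
from typing import Iterable, Iterator


def head_fasta(lines: Iterable[str], n_records: int) -> Iterator[str]:
    """Yield the lines belonging to the first n_records FASTA records."""
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n_records:
                break
        # Lines before the first header are skipped.
        if seen > 0:
            yield line
```

For example, writing `head_fasta(open("lane1_1.fa"), 5000)` and `head_fasta(open("lane1_2.fa"), 5000)` to two new files would give a matched subset of both mates.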

Chuck
Ben Langmead replied

06-06-2009, 04:24 AM
Hi Chuck,

When running in unpaired mode, Bowtie doesn't try to detect whether a file is part of a pair or not. It simply treats it as a plain-old unpaired fasta file. Have you checked to see whether any of the mates really are 1-bp in that file? Are there any other peculiarities in how that file is formatted?

If neither of those is the issue, could you let me borrow that file so I can try to diagnose it myself?

Thanks,
Ben
chuck replied

06-06-2009, 02:53 AM
more on using PET as SET files in bowtie

Hi - I just stripped all of the >tags off the reads and used one of the PET pairs as a -r raw file and it works fine...
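
For reference, the tag stripping can be sketched roughly as below. fasta_to_raw is just an illustrative name, and the sketch assumes the raw (-r) input is one sequence per line.

```python
from typing import Iterable, Iterator


def fasta_to_raw(lines: Iterable[str]) -> Iterator[str]:
    """Drop FASTA header lines and yield one sequence string per record.

    Records with an empty sequence are skipped entirely.
    """
    seq = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if seq:
                yield "".join(seq)
            seq = []
        elif line:
            seq.append(line)
    if seq:
        yield "".join(seq)
```

Note that the read names are lost this way, so the output can no longer be traced back to the original tags by name.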

so, I guess that bowtie is detecting that the data is supposed to be PET from the >tag info?
chuck replied

06-06-2009, 02:08 AM
Using PET files as SET files in bowtie

Hello - thanks for bowtie - I like it and the output is handy for me to analyse.

I have a bit of odd behavior to report that I can't understand or figure out. I have lots of little contigs (100-1000 bp) that I am aligning against and I have both SET and PET files.

When I align the SET against the short contigs, everything works great. <example command follows>

./bowtie -f shortcontigs_index lane1.fa lane1vreference.map

When I align both files for the PET data, everything works great but obviously my results are strongly biased towards those pairs which are very close together and many of the alignments are rejected because one of the pairs is sticking out into 'space'...

./bowtie -f shortcontigs_index -1 lane1_1.fa -2 lane1_2.fa lane1vreference.map

When I try to use one of the PET files as a singles file, bowtie runs for just a second, usually reporting that one of my reads is less than 2 base pairs long and then quits.

./bowtie -f shortcontigs_index lane1_1.fa lane1vreference.map

Does bowtie somehow detect that the original file is a PET file and will not let me run it by itself?
Ben Langmead replied

06-04-2009, 10:31 AM
Originally posted by polsum:

1. Is there a way to know how many times a particular sequence is mapped to the reference genome?

For now, the way to do that is via options like -k/-a/--nostrata/-m. You can count the number of alignments from the output bowtie generates.
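
As a rough illustration of the counting step, the sketch below tallies how many output lines each read has. It assumes the output is tab-separated with one alignment per line and the read name in the first column; count_alignments is a made-up helper, not something that ships with Bowtie.

```python
from collections import Counter
from typing import Iterable


def count_alignments(lines: Iterable[str]) -> Counter:
    """Count reported alignments per read name (first tab-separated field)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        read_name = line.split("\t", 1)[0]
        counts[read_name] += 1
    return counts
```

The counts only reflect what was actually reported, so they are capped by whatever -k/-m settings were used for the run.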

Originally posted by polsum:

2. How do I specify the minimum length of matching? For example, I want only >20 nt mappings of input sequences to the reference genome. Is there a way to specify that number?

Bowtie aligns the entire read, allowing up to a certain number of mismatches; it does not report partial matches of a read.

Originally posted by polsum:

3. How do I control the quality of mapping? For example, how do I eliminate a match of a sequence such as TAAAAAAAAAAAAAAAAAAAAGC to the reference genome, since it is not a specific match?

Bowtie's job is to find legal alignments subject to the constraints imposed by the alignment and reporting policies specified by the user (see manual for info about -k/-m/-a/--nostrata, etc). Any additional filtering you might want to perform will have to be done externally, say, in a script.
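
As one example of what such an external filter might look like, the sketch below flags reads dominated by a single base. The measure and the 0.8 threshold are arbitrary assumptions chosen for illustration, not anything Bowtie defines, and the function names are hypothetical.

```python
from collections import Counter


def max_base_fraction(seq: str) -> float:
    """Return the fraction of the read taken up by its most common base."""
    if not seq:
        return 0.0
    counts = Counter(seq.upper())
    return max(counts.values()) / len(seq)


def is_low_complexity(seq: str, threshold: float = 0.8) -> bool:
    """Flag reads where one base makes up at least `threshold` of the read.

    The 0.8 default is an assumed cutoff; tune it for your data.
    """
    return max_base_fraction(seq) >= threshold
```

A read like TAAAAAAAAAAAAAAAAAAAAGC would be flagged by this check, while a mixed read such as ACGTACGT would pass. The filter can be applied either to the reads before alignment or to the sequence column of the output afterwards.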

Originally posted by polsum:

4. Finally, is there a way to trim Solexa adapters from the input sequences?

No - you'll have to do vector trimming ahead of time.
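
A very simple pre-trimming step could look like the sketch below. It only handles exact matches (no sequencing errors in the adapter), and both trim_adapter and the min_overlap default are assumptions made for illustration; the adapter sequence itself has to come from your library prep documentation.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter from a read using exact matching only.

    First look for the full adapter anywhere in the read; if it is absent,
    look for a prefix of the adapter (at least min_overlap bases long)
    at the very end of the read.
    """
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    max_overlap = min(len(adapter) - 1, len(read))
    for overlap in range(max_overlap, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[: len(read) - overlap]
    return read
```

Reads that become very short after trimming should probably be dropped before alignment.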

Hope that helps,
Ben
polsum replied

06-04-2009, 10:20 AM
Hi, a few questions about Bowtie output:

1. Is there a way to know how many times a particular sequence is mapped to the reference genome?

2. How do I specify the minimum length of matching? For example, I want only >20 nt mappings of input sequences to the reference genome. Is there a way to specify that number?

3. How do I control the quality of mapping? For example, how do I eliminate a match of a sequence such as TAAAAAAAAAAAAAAAAAAAAGC to the reference genome, since it is not a specific match?

4. Finally, is there a way to trim Solexa adapters from the input sequences?

I am a newbie in this field, so please pardon me if the questions seem stupid.

Many thanks in advance.
Ben Langmead replied

06-04-2009, 06:29 AM
No, we have not written such a converter. If you write one and think that others may benefit from it (and don't mind sharing it), perhaps we can include it in a future release.
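
For anyone attempting one, a converter might be sketched along these lines. Everything here is an assumption: convert_hit is a hypothetical helper, the lookup table must be built by the user from the genome assembly's contig placement information, and the start coordinate in the usage note is a made-up placeholder.

```python
from typing import Dict, Tuple


def convert_hit(ref_name: str, offset: int,
                contig_table: Dict[str, Tuple[str, int]]) -> Tuple[str, int]:
    """Map (NCBI-style reference name, contig offset) to (chromosome, offset).

    contig_table maps a contig accession to (chromosome name, start of the
    contig on that chromosome). The table is user-supplied.
    """
    fields = ref_name.split("|")
    # In names like gi|...|ref|ACCESSION|..., the accession is the 4th field.
    accession = fields[3] if len(fields) >= 4 else ref_name
    chrom, contig_start = contig_table[accession]
    return chrom, contig_start + offset
```

With a table such as `{"NT_039169.7": ("chr1", 3000000)}` (placeholder numbers), a hit at offset 100 on gi|149233633|ref|NT_039169.7|Mm1_39209_37 would be reported as chr1 at 3000100.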

Thanks,
Ben
Arno replied

06-04-2009, 06:23 AM
I'm new to next-gen sequencing and have started playing around with different alignment tools. I have used Bowtie with the pre-built M. musculus index and it works very well. I have a quick question about the output file, particularly the chromosome location. Because the index was built from the NCBI genome database, the location is given as a gi accession (gi|149233633|ref|NT_039169.7|Mm1_39209_37) instead of a chromosome number (chr1...). Do you have any tool or option to map between the two, or do I have to write my own?

Thanks

Arnaud.
Ben Langmead replied

06-02-2009, 07:21 AM
Re: insertions and deletions: no support yet. It's on the TODO list. We'll probably tackle it this summer.

Thanks,
Ben
Richard Finney replied

06-02-2009, 07:18 AM
I need some splaining (prolly cuz I don't understand all the inside baseball terms).

Does bowtie detect short insertions? 1bp? 2bp? what limits are there?

Does bowtie detect short deletions? 1bp deletion, 2bp? etc.?

thanks.
iaaa99 replied

05-28-2009, 09:00 AM
I want to ask how I can get the same alignment file from Bowtie that I would get with the equivalent ELAND options. I tried -v 2 -l 32, which I take to mean a maximum of 2 mismatches in the first 32 bases (the seed), as in ELAND's default parameters, and I am still getting about 20% more aligned reads from Bowtie.
ShaunMahony replied

05-19-2009, 07:52 AM
Hi Ben,
Here's one, but I can send you a whole file if you like:

>Test:chr5:15656372:15656404
CTGAGCAAGGGGACCCCAATGGAAAAGTTAGG

This is aligned uniquely (and correctly) by most aligners, but is not aligned by Bowtie with the above arguments. I just noticed that when I remove the "-m 2" option, this read is aligned uniquely. This is counter-intuitive.

What arguments do you recommend if I just want to report the unique alignments? I have been using -m 2.
Ben Langmead replied

05-19-2009, 07:35 AM
Hi Shaun,

No, Bowtie does not filter on the basis of quality by default. Can you pick an example 32-mer that you think should align but that doesn't? There are a few possibilities for why it's happening.

Thanks,
Ben