**Ben Langmead** · 05-07-2009, 07:04 AM

Hi dara,

Yes, you have to build separate index files and query them separately. You'll have to synthesize the per-index results into an overall set of results, e.g., with some scripts. Bowtie doesn't currently know how to query multiple indexes as part of a single alignment run.

Thanks,
Ben
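One way to synthesize per-index results with a script is sketched below. It is a minimal illustration, not part of Bowtie: it assumes Bowtie's default tab-delimited output with the read name in the first column and comma-separated mismatch descriptors in the eighth column, and the function names `count_mismatches` and `merge_best` are hypothetical. For each read it keeps the alignments with the fewest mismatches across all indexes and retains ties, since a read that is unique within each index may not be unique overall.

```python
def count_mismatches(fields):
    """Count mismatch descriptors in the (assumed) eighth column."""
    if len(fields) < 8 or not fields[7].strip():
        return 0
    return len(fields[7].split(","))


def merge_best(lines_per_index):
    """Merge alignment lines from several per-index runs.

    lines_per_index: iterable of iterables of tab-delimited alignment lines,
    one inner iterable per index that was queried.
    Returns a dict mapping read name -> list of field lists, holding the
    alignments with the fewest mismatches (all ties are kept).
    """
    best = {}  # read name -> (mismatch count, [alignments])
    for lines in lines_per_index:
        for line in lines:
            line = line.rstrip("\n")
            if not line:
                continue
            fields = line.split("\t")
            name = fields[0]
            mm = count_mismatches(fields)
            if name not in best or mm < best[name][0]:
                best[name] = (mm, [fields])
            elif mm == best[name][0]:
                best[name][1].append(fields)
    return {name: hits for name, (_, hits) in best.items()}
```

In practice each inner iterable would be an open output file from one alignment run against one index.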

**Ben Langmead** · 05-07-2009, 07:06 AM

Now that paired-end is substantially done, we'll be embarking on gapped alignment soon. I'll probably start on that in June. Hopefully by the end of the summer you'll see at least initial gapped-alignment support. That's a guess though.

Thanks,
Ben

Originally posted by dara:

Also another question for you:

Any updates on plans for bowtie supporting gapped alignment?

thanks

**dara** · 05-07-2009, 07:27 AM

Hello Ben,

Thank you for your quick response. However, I'm a little puzzled because I was looking at the script that comes along with genome index on the Bowtie website (make_h_sapiens_asm.sh) and it seems to build just one index by providing all the chunks to the bowtie-build executable at once. Here's the line I'm talking about:

INPUTS=hs_ref_chr1.fa,hs_ref_chr2.fa,hs_ref_chr3.fa,hs_ref_chr4.fa,hs_ref_chr5.fa,hs_ref_chr6.fa,hs_ref_chr7.fa,hs_ref_chr8.fa,hs_ref_chr9.fa,hs_ref_chr10.fa,hs_ref_chr11.fa,hs_ref_chr12.fa,hs_ref_chr13.fa,hs_ref_chr14.fa,hs_ref_chr15.fa,hs_ref_chr16.fa,hs_ref_chr17.fa,hs_ref_chr18.fa,hs_ref_chr19.fa,hs_ref_chr20.fa,hs_ref_chr21.fa,hs_ref_chr22.fa,hs_ref_chrMT.fa,hs_ref_chrX.fa,hs_ref_chrY.fa

${BOWTIE_BUILD_EXE} ${INPUTS} h_sapiens_asm

I was trying the same thing- providing individual chromosome splits to the indexer and it complained.

Thanks again

**Ben Langmead** · 05-07-2009, 07:35 AM

Hi dara,

It complained that the total sequence length of all the reference strings was too big to fit in a single index, right? I didn't mean to imply that you can't feed multiple fasta files to bowtie-build; you certainly can. But if the total length of all the sequence you're supplying is too big, you'll have to break the input up into chunks somehow and build separate indexes for each chunk. You might try feeding the fasta files in smaller bundles, or you might redistribute sequences throughout the fasta files, or both. If you've got chromosomes, you probably just want to try bundling together as many chromosome fasta files as you can get away with in a single invocation of bowtie-build.

Does that make sense?

Thanks,
Ben
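The bundling idea can be sketched as a simple first-fit-decreasing grouping. This is only an illustration: `bundle_fastas` is a hypothetical helper, the per-file lengths must be computed by the user, and `max_total` is a placeholder for whatever total length a single index can hold in your bowtie-build version (that limit is not stated in this thread).

```python
def bundle_fastas(lengths, max_total):
    """Group FASTA files so each bundle's total length stays <= max_total.

    lengths: dict mapping FASTA file name -> total sequence length in bases.
    max_total: assumed per-index length limit (placeholder value, see above).
    Returns a list of comma-separated file lists, one per index to build.
    """
    bundles = []  # each entry is [running total, [file names]]
    # Place the largest files first, each into the first bundle with room.
    for name, length in sorted(lengths.items(), key=lambda kv: kv[1], reverse=True):
        if length > max_total:
            raise ValueError(f"{name} alone exceeds the limit; split it further")
        for bundle in bundles:
            if bundle[0] + length <= max_total:
                bundle[0] += length
                bundle[1].append(name)
                break
        else:
            bundles.append([length, [name]])
    return [",".join(names) for _, names in bundles]
```

Each returned string has the same comma-separated form as the `INPUTS` variable in the script quoted earlier, so it could be passed to one bowtie-build invocation per bundle.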

**dara** · 05-07-2009, 07:38 AM

yes that makes sense. Thank you

**ShaunMahony** · 05-19-2009, 07:28 AM

This has probably been answered already, so apologies in advance.

Does anyone know if Bowtie by default filters the input on the basis of quality? I'm getting a strange result. When I perfectly sample random 32mers from the mouse genome, and then align them back to the same genome, most aligners align ~83% uniquely. However, Bowtie is only aligning ~77%.

Where are the missing reads going? It can't be mismatch qualities, since there are no mismatches in the sampled 'reads'. These are the options I'm using:

./bowtie -q --solexa-quals -m 2 --best -p 2

**Ben Langmead** · 05-19-2009, 07:35 AM

Hi Shaun,

No, Bowtie does not filter on the basis of quality by default. Can you pick an example 32-mer that you think should align but that doesn't? There are a few possibilities for why it's happening.

Thanks,
Ben

**ShaunMahony** · 05-19-2009, 07:52 AM

Hi Ben,
Here's one, but I can send you a whole file if you like:

>Test:chr5:15656372:15656404
CTGAGCAAGGGGACCCCAATGGAAAAGTTAGG

This is aligned uniquely (and correctly) by most aligners, but is not aligned by Bowtie with the above arguments. I just noticed that when I remove the "-m 2" option, this read is aligned uniquely. This is counter-intuitive.

What arguments do you recommend if I just want to report the unique alignments? I have been using -m 2.

**iaaa99** · 05-28-2009, 09:00 AM

How can I obtain the same alignments from Bowtie as from ELAND, i.e., which Bowtie options are equivalent to ELAND's? I tried -v 2 -l 32, which I take to mean a maximum of 2 mismatches in the first 32 bases (the seed), the default parameters in ELAND, but I am still getting about 20% more aligned reads from Bowtie.

**Richard Finney** · 06-02-2009, 07:18 AM

I need some splaining (prolly cuz I don't understand all the inside baseball terms).

Does bowtie detect short insertions? 1bp? 2bp? what limits are there?

Does bowtie detect short deletions? 1bp deletion, 2bp? etc.?

thanks.

**Ben Langmead** · 06-02-2009, 07:21 AM

Re: insertions and deletions: no support yet. It's on the TODO list. We'll probably tackle it this summer.

Thanks,
Ben

**Arno** · 06-04-2009, 06:23 AM

I'm new to next-gen sequencing and have started playing around with different alignment tools. I have used Bowtie with the pre-built M. musculus index and it works very well. I have a quick question about the output file, particularly the chromosome location. Because the index was built from the NCBI genome database, the output gives a gi accession (gi|149233633|ref|NT_039169.7|Mm1_39209_37) for the chromosome location instead of the chromosome name (chr1...). Do you have any tool or option to map between them, or do I have to write my own tool?

Thanks

Arnaud.

**Ben Langmead** · 06-04-2009, 06:29 AM

No, we have not written such a converter. If you write one and think that others may benefit from it (and don't mind sharing it), perhaps we can include it in a future release.

Thanks,
Ben
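A converter of this kind might look like the sketch below. It is not an existing Bowtie tool. It assumes Bowtie's default tab-delimited output (reference name in the third column, 0-based offset in the fourth) and a user-supplied `placements` table mapping each contig accession to a chromosome name and the contig's start offset on that chromosome. That table is hypothetical and would have to come from the assembly's contig placement information. The sketch also assumes every contig is placed on the forward strand.

```python
def convert_line(line, placements):
    """Rewrite one alignment line from contig to chromosome coordinates.

    placements: dict mapping accession (e.g. "NT_039169.7") to a tuple
    (chromosome name, 0-based start of the contig on that chromosome).
    Lines whose accession is not in the table are returned unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    ref = fields[2]
    parts = ref.split("|")
    # NCBI-style names look like gi|<number>|ref|<accession>|<label>
    accession = parts[3] if len(parts) > 3 else ref
    if accession in placements:
        chrom, start = placements[accession]
        fields[2] = chrom
        fields[3] = str(int(fields[3]) + start)
    return "\t".join(fields)
```

For example, with a made-up placement of `{"NT_039169.7": ("chr1", 1000)}`, an alignment at offset 25 on that contig would be reported at offset 1025 on chr1.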

**polsum** · 06-04-2009, 10:20 AM

Hi, In bowtie output,

1. Is there a way to know how many times a particular sequence is mapped to the reference genome?

2. How do I specify the minimum length of matching? For example, I want only >20 nt mapping of input sequences to the reference genome. Is there a way to specify that number?

3. How do I control the quality of mapping? For example, how do I eliminate a match of a sequence such as TAAAAAAAAAAAAAAAAAAAAGC to the reference genome, since it is not a specific match?

4. Finally, is there a way to trim Solexa adapters from the input sequences?

I am a newbie in this field, so please pardon me if the questions seem stupid.

Many thanks in advance.

**Ben Langmead** · 06-04-2009, 10:31 AM

Originally posted by polsum:

1. Is there a way to know how many times a particular sequence is mapped to the reference genome?

For now, the way to do that is via options like -k/-a/--nostrata/-m. You can count the number of alignments from the output bowtie generates.
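Counting alignments from the output can be done with a few lines of script. The sketch below assumes Bowtie's default tab-delimited output with the read name in the first column. `count_alignments` is a hypothetical helper, and the counts are only meaningful if Bowtie was run with reporting options (such as -a or a large -k) that let it report more than one alignment per read.

```python
from collections import Counter


def count_alignments(lines):
    """Return a Counter mapping read name -> number of reported alignments."""
    return Counter(
        line.split("\t", 1)[0]  # read name is assumed to be the first column
        for line in lines
        if line.strip()
    )
```

Reads whose count is exactly 1 are the ones reported with a single alignment under the options used for that run.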

Originally posted by polsum:

2. How do I specify the minimum length of matching? For example, I want only >20 nt mapping of input sequences to the reference genome. Is there a way to specify that number?

Bowtie aligns the entire read with a certain number of mismatches.

Originally posted by polsum:

3. How do I control the quality of mapping? For example, how do I eliminate a match of a sequence such as TAAAAAAAAAAAAAAAAAAAAGC to the reference genome, since it is not a specific match?

Bowtie's job is to find legal alignments subject to the constraints imposed by the alignment and reporting policies specified by the user (see manual for info about -k/-m/-a/--nostrata, etc). Any additional filtering you might want to perform will have to be done externally, say, in a script.
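As one example of such external filtering, the sketch below drops alignments whose read is dominated by a single base, like the poly-A example in the question. It is an illustration only: the 0.8 threshold and the function names are arbitrary assumptions, and the read sequence is assumed to be in the fifth column of Bowtie's default tab-delimited output.

```python
from collections import Counter


def is_low_complexity(seq, max_fraction=0.8):
    """True if one base makes up more than max_fraction of the sequence."""
    seq = seq.upper()
    if not seq:
        return True
    most_common_count = Counter(seq).most_common(1)[0][1]
    return most_common_count / len(seq) > max_fraction


def keep_specific(lines, max_fraction=0.8):
    """Yield only alignment lines whose read is not low-complexity."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Column 5 (index 4) is assumed to hold the read sequence.
        if len(fields) > 4 and not is_low_complexity(fields[4], max_fraction):
            yield line
```

A single-base fraction is a crude measure; it will not catch dinucleotide or other short repeats, which would need a different test.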

Originally posted by polsum:

4. Finally, is there a way to trim Solexa adapters from the input sequences?

No - you'll have to do vector trimming ahead of time.
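Trimming ahead of time could be done with a small preprocessing script. The sketch below is a deliberately simple, exact-match 3' adapter trimmer, not a substitute for a dedicated trimming tool: it tolerates no sequencing errors, `trim_adapter` and `min_overlap` are hypothetical names, and the adapter sequence must be supplied by the user for their own library prep.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter from a read using exact matching only.

    If the full adapter occurs in the read, everything from its first
    occurrence onward is removed. Otherwise, if the read ends with a prefix
    of the adapter at least min_overlap bases long, that prefix is removed.
    """
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Look for a partial adapter at the 3' end, longest prefix first.
    for k in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read
```

When the input is FASTQ, the quality string has to be cut to the same length as the trimmed read before the file is passed to Bowtie.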

Hope that helps,
Ben
