**Wind** · 08-03-2009, 02:17 AM

Bowtie is a nice tool for short read alignment, I think. However, I found a problem with paired-end mapping. I produced 75 bp reads by simulating Illumina high-throughput sequencing and aligned them to the reference sequence, but only a few alignments (fewer than 10) are reported. Since 1,300,000 alignments are reported with non-paired-end mapping, I think the pairs are probably being mapped wrongly.
My command is "bowtie -p 8 -a -y -X 650 human -1 reads_1.fa -2 reads_2.fa output.map".

Can anybody tell me what is the problem?

**Ben Langmead** · 08-03-2009, 05:12 AM

Originally posted by Wind:

Bowtie is a nice tool for short read alignment, I think. However, I found a problem with paired-end mapping. I produced 75 bp reads by simulating Illumina high-throughput sequencing and aligned them to the reference sequence, but only a few alignments (fewer than 10) are reported. Since 1,300,000 alignments are reported with non-paired-end mapping, I think the pairs are probably being mapped wrongly.
My command is "bowtie -p 8 -a -y -X 650 human -1 reads_1.fa -2 reads_2.fa output.map".

This is probably due to the -I/--minins, -X/--maxins, and/or --fr/--rf/--ff options being set incorrectly. Please double-check the manual's description of those options and verify that your invocation matches the way you've simulated your reads. Also, make sure the simulated read files are formatted correctly, with all mates lining up properly.

Thanks,
Ben
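A minimal sketch of the "mates lining up" check mentioned above, offered as an illustration and not as part of Bowtie. It assumes FASTA input and assumes the simulator names mates with a trailing /1 and /2; the helper names `fasta_names`, `mate_base`, and `first_mismatched_pair` are hypothetical.

```python
import re
from itertools import zip_longest


def fasta_names(path):
    """Yield the first whitespace-delimited token of every FASTA header line."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                yield fields[0] if fields else ""


def mate_base(name):
    """Strip a trailing /1 or /2 suffix (an assumed naming convention)."""
    return re.sub(r"/[12]$", "", name)


def first_mismatched_pair(path_1, path_2):
    """Return (index, name_1, name_2) for the first pair that does not line up, else None."""
    pairs = zip_longest(fasta_names(path_1), fasta_names(path_2))
    for index, (name_1, name_2) in enumerate(pairs):
        # A None on either side means one file ran out of reads before the other.
        if name_1 is None or name_2 is None or mate_base(name_1) != mate_base(name_2):
            return index, name_1, name_2
    return None
```

Calling `first_mismatched_pair("reads_1.fa", "reads_2.fa")` returns `None` when every record in the first file has its mate at the same position in the second. It says nothing about orientation or insert size, which still have to match the -I, -X, and --fr/--rf/--ff settings.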

**Wind** · 08-03-2009, 05:26 PM

Thanks

Hi Ben,

Thanks for your advice. There were many Ns in the simulated data, which may have interfered with paired-end mapping. I'll try other data sets. Thanks.

**tianell** · 08-04-2009, 11:32 PM

Ben, help me..

Hi Ben,
I have a question for you about the alignment result message.
When I align short reads to a reference using Bowtie, can I get a result message for reads that do not align?

I could not find an option that produces such a message.

I want reads to be reported even when they do not align to the reference, so that I can use that information (not aligned!).

I will wait for your answer, Ben. Thank you so much.

**Ben Langmead** · 08-05-2009, 07:15 AM

Hi tianell,

Originally posted by tianell:

When I align short reads to a reference using Bowtie, can I get a result message for reads that do not align?

I could not find an option that produces such a message.

I want reads to be reported even when they do not align to the reference, so that I can use that information (not aligned!).

Sorry, no, there is no option to print such a message. I'll add this as a feature request. In the meantime, it's quite easy to deduce that number either by using the --un/--max options (and then counting), or by subtracting the reported number from the number of input reads.

Thanks,
Ben
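A rough sketch of the subtraction approach described above. It assumes unpaired FASTA input, unique read names, and Bowtie's default tab-separated output with the read name in the first column; the function names are hypothetical. Distinct names are counted instead of output lines, because options such as -a or -k can report several alignments for one read.

```python
def count_fasta_reads(path):
    """Count FASTA records by counting header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def aligned_read_names(map_path):
    """Collect distinct read names from the first tab-separated column."""
    names = set()
    with open(map_path) as handle:
        for line in handle:
            if line.strip():
                names.add(line.rstrip("\n").split("\t")[0])
    return names


def count_unaligned(reads_path, map_path):
    """Unaligned reads = input reads minus reads with at least one reported alignment."""
    return count_fasta_reads(reads_path) - len(aligned_read_names(map_path))
```

If reads are suppressed by -m, they are neither in the output nor truly unaligned, so this difference then covers both groups.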

**joa_ds** · 08-05-2009, 07:17 AM

Isn't there a feature to export unmapped reads to a file?

I always run bowtie and export unmapped and repeats using

--unfq unaligned.fastq --maxfa duplicates.fastq

Taking a look at the size of both files compared to your original file gives you an approximate idea of the % unaligned/repeats.

**bioinfosm** · 08-06-2009, 11:32 AM

I wanted to discuss a use-case:
A collection of 172 million reads, ranging from 36 to 76 bases long, was mapped to a reference with bowtie.

$ ./bowtie --best --un leftover -p 4 -t reference reads mapped
$ grep -c '^@' leftover
154828705
$ wc -l mapped
16269083 mapped

The total of leftover and mapped is less than what we started with. Are the remaining reads mapping to multiple locations, and thus omitted in both these files?

**Ben Langmead** · 08-06-2009, 11:52 AM

Hi bioinfosm,

Originally posted by bioinfosm:

The total of leftover and mapped is less than what we started with. Are the remaining reads mapping to multiple locations, and thus omitted in both these files?

That shouldn't be the case. When only --un is used (as opposed to both --un and --max), both the unaligned reads and the reads with a number of alignments exceeding the -m limit will go into the --un file. But you're not using the -m option, so no reads should be suppressed due to multiple alignments.

How are you counting the number of reads in your input set? Note that grep -c '^@' isn't necessarily correct because quality strings can also start with @.

Thanks,
Ben
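To illustrate the point about grep -c '^@', here is a small sketch that counts FASTQ records by line count instead. It assumes strict four-line FASTQ records (no wrapped sequence or quality lines) and ignores blank lines; both function names are hypothetical.

```python
def count_fastq_records(path):
    """Count records in a FASTQ file that uses exactly four lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated or wrapped")
    return n_lines // 4


def count_at_lines(path):
    """Mimic grep -c '^@': count every line that starts with '@'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith("@"))
```

Whenever a quality string begins with @, `count_at_lines` returns more than `count_fastq_records`, so a grep-based count of the --un file can overstate the number of leftover reads.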

**bioinfosm** · 08-07-2009, 11:33 AM

Thanks, Ben. The light bulb just went on!

**davisc** · 08-19-2009, 08:54 AM

Question about RepeatMasked hg18 index

I'm doing RNA-Seq on human samples. In many instances I am mapping with the -m1 -v2 --best criteria against the preassembled hg18.asm index available on the download site. How does Bowtie handle Ns in the indices? I am wondering whether it is possible to cut down the mapping time by building and mapping against a repeat-masked version of the genome.

**Ben Langmead** · 08-19-2009, 09:05 AM

Originally posted by davisc:

How does Bowtie handle Ns in the indices? I am wondering whether it is possible to cut down the mapping time by building and mapping against a repeat-masked version of the genome.

When Bowtie indexes the reference, it elides non-A/C/G/T characters. So if you index a reference with stretches of Ns, Bowtie will never report an alignment spanning any of the stretches.

And yes, mapping against the repeatmasked version of the genome (and omitting -m 1) ought to be noticeably faster.

Ben
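The practical effect of eliding non-A/C/G/T characters can be sketched as follows. This is only an illustration of the consequence described above, not Bowtie's indexing code; the function names are hypothetical, and the sketch treats lowercase (soft-masked) bases as ordinary bases, so it models hard masking with Ns only.

```python
import re


def acgt_segments(reference):
    """Return half-open (start, end) intervals of uninterrupted A/C/G/T runs."""
    return [(m.start(), m.end()) for m in re.finditer(r"[ACGT]+", reference.upper())]


def segments_that_fit(reference, read_length):
    """Keep only the runs long enough to contain a full ungapped read."""
    return [(s, e) for s, e in acgt_segments(reference) if e - s >= read_length]


# A toy reference with a masked stretch in the middle.
toy_reference = "ACGTACGTNNNNACG"
print(acgt_segments(toy_reference))         # [(0, 8), (12, 15)]
print(segments_that_fit(toy_reference, 5))  # [(0, 8)]
```

A read can only be placed inside one of these intervals, which is why no reported alignment spans a stretch of Ns.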

**ewilbanks** · 09-08-2009, 02:28 PM

Indexing human genome?

Hi!

I'm building an index of the human genome locally and was wondering how long this usually takes. It's been running for about 3 hours, so I'm just wondering what to expect. I'm on a dual-core Mac with 4 GB of RAM.

Thanks!
Lizzy

**Ben Langmead** · 09-09-2009, 03:41 AM

Hi Lizzy,

I'd expect, oh, about 7-8 hours or so. Did it finish?

Thanks,
Ben

**Layla** · 09-09-2009, 07:44 AM

I'm a newbie to Bowtie, and tired of counting down the hours using MAQ.

Currently building an index using Bowtie. What is the difference between
h_sapiens_asm.ebwt.zip and
h_sapiens.ebwt.zip

Thanks

L

**Ben Langmead** · 09-09-2009, 07:48 AM

Hi Layla,

h_sapiens indexes the NCBI human reference contigs and h_sapiens_asm indexes the NCBI human reference assembly. Take a look at the scripts/make_h_sapiens.sh and scripts/make_h_sapiens_asm.sh files distributed with Bowtie to see exactly what fasta files were indexed and how.

People often prefer the assembly because the coordinates output by bowtie are more immediately useful (e.g., they correspond to the hg18 coordinates in the Genome Browser).

Thanks,
Ben
