
**chuck** · 07-06-2009, 07:55 PM

Ben,

I tried bowtie remade with extraflags but it just did the same thing. Would there be a log file somewhere or something in the map file? I can't seem to find any additional output.

Chuck

**seq_GA** · 07-08-2009, 01:35 AM

How do I build an index for the human genome? Do we need to add individual chromosomes one by one under the same index name? I'm pretty confused about this step.
After building the index, I start using the bowtie aligner like ./bowtie .. with parameters, right?
Please clarify how to build the different chromosomes of hg18.
Thanks.

**Ben Langmead** · 07-08-2009, 05:22 AM

Originally posted by chuck

I tried bowtie remade with extraflags but it just did the same thing. Would there be a log file somewhere or something in the map file? I can't seem to find any additional output.

Chuck - I turned this into a sourceforge issue so that we can keep all relevant info in one place and not clutter the forum too much:

Bowtie / Bugs / #37 Hanging bug

https://sourceforge.net/tracker/?func=detail&aid=2818544&group_id=236897&atid=1101606

I'll keep looking at this. Thanks for the details.

Ben

**Ben Langmead** · 07-08-2009, 05:25 AM

Originally posted by seq_GA

How do I build an index for the human genome? Do we need to add individual chromosomes one by one under the same index name? I'm pretty confused about this step.

You can specify a comma-separated list of FASTA files as the input to bowtie-build. Example scripts that do this automatically (including the download step) are included in the 'scripts' subdirectory of the Bowtie package. E.g. scripts/make_h_sapiens_asm.sh

Alternatively, you can download a pre-built index from the Bowtie website.
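
As a minimal sketch of the comma-separated input described above, the Python snippet below assembles a bowtie-build command line for several FASTA files under one index name. The file names (`chr1.fa` and so on), the index base name `hg18`, and the helper `bowtie_build_command` are assumptions made for illustration; the snippet only builds and prints the command, it does not run it.

```python
import shlex


def bowtie_build_command(fasta_files, index_base, bowtie_build="bowtie-build"):
    """Return the argument list for indexing several FASTA files as one index."""
    if not fasta_files:
        raise ValueError("need at least one FASTA file")
    # All reference files go into ONE comma-separated argument,
    # followed by the base name shared by the index files that get written.
    reference_arg = ",".join(fasta_files)
    return [bowtie_build, reference_arg, index_base]


# Hypothetical layout: one FASTA file per hg18 chromosome in the current directory.
chromosomes = [f"chr{c}.fa" for c in list(range(1, 23)) + ["X", "Y", "M"]]
cmd = bowtie_build_command(chromosomes, "hg18")

# Print the command in a form that could be pasted into a shell.
print(shlex.join(cmd))
```

The point of the sketch is that the chromosomes are not added one at a time: they are all passed in a single invocation, and the resulting index is then referred to by its base name when running the aligner.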

Originally posted by seq_GA

After building the index, I start using the bowtie aligner like ./bowtie .. with parameters, right?

Yes, that's right.

Ben

**bioinfosm** · 07-08-2009, 12:42 PM

Shaun or Ben,

Did you guys get around this?

Originally posted by ShaunMahony

Hi Ben,
Here's one, but I can send you a whole file if you like:

>Test:chr5:15656372:15656404
CTGAGCAAGGGGACCCCAATGGAAAAGTTAGG

This is aligned uniquely (and correctly) by most aligners, but is not aligned by Bowtie with the above arguments. I just noticed that when I remove the "-m 2" option, this read is aligned uniquely. This is counter-intuitive.

What arguments do you recommend if I just want to report the unique alignments? I have been using -m 2.

**Ben Langmead** · 07-08-2009, 01:08 PM

Hi,

Originally posted by bioinfosm

Shaun or Ben,

Did you guys get around this?

Shaun also wrote an email at the time, which I responded to. I should have copied it here but didn't. Here are the salient bits, updated to be relevant to the changes made in 0.10.0:

What arguments do you recommend if I just want to report the unique alignments? I have been using -m 2.

Why -m 2 instead of -m 1?

I don't know myself why I've been using -m 2 instead of -m 1. I must have assumed at some stage that -m counted greater than or equal to.

What definition of "unique" are you after? Is it (a) there are no other legal alignments period, or (b) there are no other legal alignments with the same number of mismatches as the best match? If (b), use --strata --best -m 1, rather than just -m 1.

Is -k X guaranteed to report the lowest mismatch alignments first?

Answer: yes, -k X --best will report the "best" alignments first.

Ben

**bioinfosm** · 07-08-2009, 01:18 PM

thanks Ben ..

**seq_GA** · 07-08-2009, 10:33 PM

Thanks Ben.

**seq_GA** · 07-09-2009, 05:35 PM

Hi Ben,

I see different output from the following examples. Please let me know whether I am interpreting it correctly.

Code:

./bowtie -a --best -v 2 ../Genome/hg18/hg18 --concise -c gtctggcggcggcctggcggagcg
1+:<21,21852845,0>
Reported 1 alignments to 1 output stream(s)
[]$ ./bowtie -a --best -v 2 ../Genome/hg18/hg18 -c gtctggcggcggcctggcggagcg -p 5
0  +  chr22 21852845    GTCTGGCGGCGGCCTGGCGGAGCG        IIIIIIIIIIIIIIIIIIIIIIII 0
Reported 1 alignments to 1 output stream(s)


[]$ ./bowtie -a --best -v 2 ../Genome/hg18/hg18 --concise -c gaccaacttgttcagcgccttgat -p 5
1+:<5,132749285,0>
Reported 1 alignments to 1 output stream(s)
[]$ ./bowtie -a --best -v 2 ../Genome/hg18/hg18 -c gaccaacttgttcagcgccttgat -p 5
0  +  chr9  132749285   GACCAACTTGTTCAGCGCCTTGAT        IIIIIIIIIIIIIIIIIIIIIIII 0
Reported 1 alignments to 1 output stream(s)

In both of the above examples, I tried --concise as well as the complete output format. For the same sequence, even though both report the same coordinates, the ref_idx seems to differ between the two outputs.

Please let me know.

**seq_GA** · 07-09-2009, 07:24 PM

Originally posted by Ben Langmead

Hi,

Shaun also wrote an email at the time, which I responded to. I should have copied it here but didn't. Here are the salient bits, updated to be relevant to the changes made in 0.10.0:

Answer: yes, -k X --best will report the "best" alignments first.

Ben

But is it enough for me to use only -m 1 to extract uniquely aligned hits while allowing 2 mismatches with -v 2?
Thanks

**Ben Langmead** · 07-09-2009, 08:07 PM

Originally posted by seq_GA

In both of the above examples, I tried --concise as well as the complete output format. For the same sequence, even though both report the same coordinates, the ref_idx seems to differ between the two outputs.

Please let me know.

Hi seq_GA,

--concise reports the reference according to its internal index, not its name. I.e., the '5' you're seeing is because internally, Bowtie identifies that chromosome as '5' (probably because when you built your index, it was the 6th sequence to be indexed; it's 0-based). If you ask for verbose (default) output and supply the --refidx option with your second input, you should also see '5' in the ref_id column.
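
To make the 0-based index concrete, here is a small Python sketch that parses a --concise line of the form `read_idx{+|-}:<ref_idx,ref_off,mms>` and looks up the reference name from the order in which sequences were indexed. The `build_order` list is made up purely so that chr9 lands at index 5, as in the output above; the poster's real build order is not known, and `parse_concise` is a hypothetical helper, not part of Bowtie.

```python
import re

# read_idx, strand, then <ref_idx,ref_off,mismatches>
CONCISE_RE = re.compile(r"^(\d+)([+-]):<(\d+),(\d+),(\d+)>$")


def parse_concise(line, ref_names):
    """Parse one --concise alignment line and resolve ref_idx to a name."""
    match = CONCISE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a --concise alignment line: {line!r}")
    read_idx, strand, ref_idx, offset, mismatches = match.groups()
    ref_idx = int(ref_idx)
    return {
        "read_idx": int(read_idx),
        "strand": strand,
        "ref_idx": ref_idx,
        # 0-based: index 5 is the 6th sequence that was indexed
        "ref_name": ref_names[ref_idx],
        "offset": int(offset),
        "mismatches": int(mismatches),
    }


# Hypothetical build order, chosen only so that chr9 sits at 0-based index 5.
build_order = ["chr1", "chr2", "chr3", "chr4", "chr5", "chr9", "chr6"]

hit = parse_concise("1+:<5,132749285,0>", build_order)
print(hit["ref_name"], hit["offset"], hit["mismatches"])
```

With the real list of sequence names in the order they were given to bowtie-build, the same lookup would reconcile the '21' and '5' in the concise output with 'chr22' and 'chr9' in the verbose output.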

Hope that makes sense,
Ben

**Ben Langmead** · 07-09-2009, 08:12 PM

Originally posted by seq_GA

But is it enough for me to use only -m 1 to extract uniquely aligned hits while allowing 2 mismatches with -v 2?
Thanks

If you supply '-v 2 -m 1', Bowtie will report an alignment only for reads having 1 legal alignment, regardless of stratum. I.e., if a read has a 1-mismatch alignment and a 2-mismatch alignment, no alignments will be reported for that read. And if a read has just a 2-mismatch alignment, then that alignment will be reported. This is in contrast to stratified mode ('--best --strata'), where the best alignment would be reported in both cases.
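
The difference can be illustrated with a toy Python model of the -m 1 filter. This is only a sketch of the behaviour described above, not Bowtie's implementation: `reported_alignments` is a hypothetical function, and each read is represented simply by the mismatch counts of its valid alignments under the chosen policy (e.g. -v 2).

```python
def reported_alignments(mismatch_counts, m=1, best_strata=False):
    """Toy model: which alignments survive -m for one read.

    mismatch_counts: mismatches of every valid alignment of the read.
    best_strata:     True models '--best --strata', where only the
                     alignments in the best (fewest-mismatch) stratum count.
    """
    if not mismatch_counts:
        return []  # read did not align at all
    if best_strata:
        best = min(mismatch_counts)
        candidates = [mm for mm in mismatch_counts if mm == best]
    else:
        candidates = sorted(mismatch_counts)
    # -m: suppress ALL alignments for a read that has more than m of them
    if len(candidates) > m:
        return []
    return candidates


# Unstratified ('-v 2 -m 1'):
print(reported_alignments([1, 2]))  # [] : two valid alignments, read suppressed
print(reported_alignments([2]))     # [2]: the single 2-mismatch alignment

# Stratified ('-v 2 --best --strata -m 1'):
print(reported_alignments([1, 2], best_strata=True))     # [1]: best stratum has one hit
print(reported_alignments([2], best_strata=True))        # [2]
print(reported_alignments([1, 1, 2], best_strata=True))  # [] : tie in the best stratum
```

The model is only meant for m = 1; for larger values, what is actually printed also depends on the -k and -a reporting options.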

Ben

**seq_GA** · 07-09-2009, 10:41 PM

Originally posted by Ben Langmead

If you supply '-v 2 -m 1', Bowtie will report an alignment only for reads having 1 legal alignment, regardless of stratum. I.e., if a read has a 1-mismatch alignment and a 2-mismatch alignment, no alignments will be reported for that read. And if a read has just a 2-mismatch alignment, then that alignment will be reported. This is in contrast to stratified mode ('--best --strata'), where the best alignment would be reported in both cases.

Ben

Hi Ben,
Thanks for the clarification. It is still a bit confusing. If I specify '-v 2 -m 1', will only one alignment with 2 mismatches (the condition) be reported?

I want to find only uniquely aligned reads with at most 2 mismatches in the seed. My read length is 36 bp. How do I set the parameters?

Regards

**Ben Langmead** · 07-10-2009, 05:24 AM

Originally posted by seq_GA

Hi Ben,
If I specify '-v 2 -m 1', will only one alignment with 2 mismatches (the condition) be reported?

If you supply -m 1, Bowtie will suppress alignments for reads with more than 1 valid alignment.

Originally posted by seq_GA

I want to find only uniquely aligned reads with at most 2 mismatches in the seed. My read length is 36 bp. How do I set the parameters?

You must pick a definition of "unique." If "unique" = there are no other alignments with the same number of mismatches, then use '--best --strata -m 1' (along with your alignment policy, e.g. '-v 2'). If "unique" = there are no other valid alignments period, then use '-m 1'. The former is stratified, the latter is unstratified.
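
The two choices can be written down as a small Python helper that assembles the flags for either definition. This is a sketch under assumptions: `unique_flags` is a hypothetical helper, and the index name `hg18` and reads file `reads.fq` are placeholders; the alignment policy defaults to '-v 2' only because that is the example used above.

```python
import shlex


def unique_flags(definition, policy=("-v", "2")):
    """Return bowtie flags for one of the two meanings of 'unique'."""
    if definition == "stratified":
        # unique = no other alignment with the same number of mismatches
        return list(policy) + ["--best", "--strata", "-m", "1"]
    if definition == "unstratified":
        # unique = no other valid alignment at all
        return list(policy) + ["-m", "1"]
    raise ValueError("definition must be 'stratified' or 'unstratified'")


# Placeholder index and reads file, for illustration only.
for definition in ("stratified", "unstratified"):
    cmd = ["bowtie"] + unique_flags(definition) + ["hg18", "reads.fq"]
    print(shlex.join(cmd))
```

Passing a different `policy` tuple swaps in another alignment policy without touching the uniqueness flags.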

Ben

**apostrophe** · 07-13-2009, 07:50 AM

Sorry if this has been answered before, but does Bowtie support FASTA nucleic acid codes that code for two bases, such as Y = T or C for the genome? Thanks in advance.
