How does Bowtie handle ambiguous bases in the reference genome?
Does anybody have experience preparing a Bowtie search index where certain bases have been replaced with ambiguity codes such as "Y", which stands for "C" or "T"? If so, will these locations be called matches or mismatches if the Solexa read being aligned has either a "C" or a "T" at that position?
Thanks
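To make the question concrete, here is a minimal Python sketch of what "Y matches C or T" would mean if ambiguity codes were honoured during comparison. It only illustrates the standard IUPAC code table; it is not a description of what Bowtie does with such positions, and the function names are made up for this example.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def is_compatible(ref_base: str, read_base: str) -> bool:
    """Return True if read_base is one of the bases the reference code covers."""
    allowed = set(IUPAC.get(ref_base.upper(), ""))
    return read_base.upper() in allowed


def count_mismatches(ref: str, read: str) -> int:
    """Count positions where the read base is not covered by the reference code."""
    if len(ref) != len(read):
        raise ValueError("ref and read must have the same length")
    return sum(not is_compatible(r, q) for r, q in zip(ref, read))
```

Under this interpretation a read carrying "C" or "T" over a reference "Y" would count as a match, and any other base as a mismatch. Whether the aligner applies that interpretation is exactly what the question asks.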
-
My worry is that I will lose alignments which have perfect alignments if I also have other near matches. I guess I can make -n smaller (1 say). But I suppose the caveat is that if there is an exact match and a near match, you cannot say which is the correct one - sequencing error etc... - and so it is conservative to reject the read as having multiple alignments?
Another alternative is to do multiple Bowtie runs with decreasingly stringent alignment policies (e.g. -n 0, then -n 1, etc). The input to each run might be the --unfq reads from the run before.
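A rough Python sketch of that tiered approach follows. It only builds the command lines and does not run them. The argument order (index, reads, output) follows the invocation shown elsewhere in this thread, the -n and --unfq flags are the ones named above, and the output file names and the function name are hypothetical. Option spellings can differ between Bowtie versions, so check them against `bowtie --help` before use.

```python
from typing import List


def tiered_bowtie_commands(index: str, reads: str, max_n: int = 2,
                           bowtie: str = "./bowtie") -> List[List[str]]:
    """Build one bowtie command per tier, from -n 0 up to -n max_n.

    Each tier takes as input the reads left unaligned by the previous tier.
    """
    commands = []
    current_reads = reads
    for n in range(max_n + 1):
        unaligned = f"unaligned_n{n}.fastq"  # hypothetical file name
        output = f"aligned_n{n}.map"         # hypothetical file name
        commands.append([bowtie, "-n", str(n), "--unfq", unaligned,
                         index, current_reads, output])
        # The next, less stringent tier only sees what this tier missed.
        current_reads = unaligned
    return commands


if __name__ == "__main__":
    for cmd in tiered_bowtie_commands("h_sapiens", "reads.fastq"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the flags have been confirmed for the installed version.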
I guess what I really want to know is: how stringent are the default alignment settings? Can I make them more stringent without losing a lot of 'true' (but imperfect) alignments?
Hope that helps,
Ben
-
Hi Ben,
Thanks for your help so far - I am relatively new to mapping (but not so new that I am not impressed by bowtie!), so please excuse any dopey questions...
My worry is that I will lose alignments which have perfect alignments if I also have other near matches. I guess I can make -n smaller (1 say). But I suppose the caveat is that if there is an exact match and a near match, you cannot say which is the correct one - sequencing error etc... - and so it is conservative to reject the read as having multiple alignments?
For what I work on, it is really important to be very sure about where the reads map... so maybe it would be good to keep -n at 1 and be more confident about the reads? I don't want to have to refer back to alignment confidences in analyses later on; I would rather say that beyond a certain confidence threshold I am happy with them all. If I am going to reduce -n, should I also reduce -l to 20 or 25?
I guess what I really want to know is: how stringent are the default alignment settings? Can I make them more stringent without losing a lot of 'true' (but imperfect) alignments?
Thanks again,
Ieuan
-
So if a read has multiple valid alignments, one of which is better than the others (fewer mismatches, though the others are still valid), and I specify -k 1 -m 1, will the best alignment be given, or will it be pumped into --maxfa?
If I am worried about this sort of situation, should I specify --best?
If this poses a problem, I'd be interested to hear more about what you're looking for...
Thanks,
Ben
-
Sorry to keep on about this, I just want to get it clear.
By default:
-k is 1, so only one alignment (the best according to -n/-v/-l/-e) is reported.
-m is unlimited
So if a read has multiple valid alignments, one of which is better than the others (fewer mismatches, though the others are still valid), and I specify -k 1 -m 1, will the best alignment be given, or will it be pumped into --maxfa?
If I am worried about this sort of situation, should I specify --best?
-
Yes - if you specify -k > 1 or -a, Bowtie will output the appropriate number of hits per read for reads with >=1 hit. If -m <int> is also specified, Bowtie will output no alignments for reads with > <int> alignments and, if --maxfa/--maxfq is specified, will dump those reads (the reads, not the alignments) to the specified file. For reads with <= <int> alignments, Bowtie behaves the same as if -m were not specified.
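The rule described above can be written out as a small Python model. This is only a sketch of the behaviour as stated in this post, not code taken from Bowtie; the function name is made up, and `k=None` is used here to stand for -a.

```python
from typing import Optional, Tuple


def reported_alignments(valid_hits: int, k: Optional[int] = 1,
                        m: Optional[int] = None) -> Tuple[int, bool]:
    """Model the -k / -a / -m reporting rule for a single read.

    valid_hits: number of valid alignments found for the read.
    k: value of -k, or None to model -a (report everything).
    m: value of -m, or None when -m is not given.

    Returns (alignments reported, read goes to the --maxfa/--maxfq file
    if one was specified).
    """
    if valid_hits < 0:
        raise ValueError("valid_hits must be non-negative")
    if k is not None and k < 1:
        raise ValueError("k must be at least 1")
    # More than m alignments: nothing is reported and the read itself is dumped.
    if m is not None and valid_hits > m:
        return 0, True
    # Otherwise -m has no effect and -k / -a decide how many hits are reported.
    if k is None:
        return valid_hits, False
    return min(valid_hits, k), False
```

Under this model, `reported_alignments(5, k=1, m=1)` gives `(0, True)`: a read with several valid alignments is suppressed by -m 1 even though -k 1 asks for only one of them.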
I hope that helps.
-
Thanks!
Have you checked out the -m and --maxfa/--maxfq options?
-
Hi Ben, Just to say bowtie is great work. Far outstrips any pipeline we have used previously!
One question though - is there a way to output reads with multiple hits to a separate file? We work on repetitive regions and with a little massaging, this data may still be useful to us.
-
Is Bowtie suitable for miRNA detection
I am just playing around with bowtie along with other software (maq,novoalign) and was wondering whether bowtie is suitable for use with an miRNA detection experiment. In a previous post Ben states that:
Originally posted by Ben Langmead:
First, let me reemphasize that I think of Bowtie's target application as mammalian resequencing - that's how I characterize it in the manual and that's what we spend our time trying to optimize it for.
I would think that you want to know all the alignments for each read above a certain quality threshold. At the moment I am thinking of using "--best -k 100", as if there are more than 100 hits then it is probably not a "real" alignment.
Any thoughts?
-
That sounds like an issue with how the read file is formatted. Can you share that file with me, e.g. via email (langmead at umd dot edu)? I can take a quick look.
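In the meantime, a quick format check along these lines might narrow it down. This is a minimal Python sketch that assumes plain four-line FASTQ records (header, sequence, "+" line, qualities) with no line wrapping; the function name is made up for this example.

```python
from typing import Iterable, List, Tuple


def find_bad_records(lines: Iterable[str],
                     min_len: int = 2) -> List[Tuple[int, str, str]]:
    """Return (record number, header, problem) for each suspicious record."""
    stripped = [line.rstrip("\r\n") for line in lines]
    # Ignore blank lines at the very end of the file.
    while stripped and stripped[-1] == "":
        stripped.pop()

    problems = []
    for start in range(0, len(stripped), 4):
        record = stripped[start:start + 4]
        number = start // 4 + 1
        if len(record) < 4:
            problems.append((number, record[0], "incomplete record"))
            break
        header, seq, plus, qual = record
        if not header.startswith("@"):
            problems.append((number, header, "header does not start with '@'"))
        elif not plus.startswith("+"):
            problems.append((number, header, "third line does not start with '+'"))
        elif len(seq) < min_len:
            problems.append((number, header, f"read shorter than {min_len} bases"))
        elif len(seq) != len(qual):
            problems.append((number, header, "sequence and quality lengths differ"))
    return problems
```

It can be pointed at the read file with something like `with open("../GDB1.fastq") as fh: print(find_bad_records(fh))`. If the first few records are already flagged, the file is probably not in the layout the aligner expects.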
-
bowtie error
I am just starting out with bowtie and I am getting the following error:
Code:
$ ./bowtie -p 4 -t h_sapiens ../GDB1.fastq GDB1.map
Time loading forward index: 00:00:01
Time loading mirror index: 00:00:02
Error: Read (Error: Read (Error: Read (Error: Read (HHHHWWWWIIII----EEEEAAAASSSS222266669999BBBB::::5555::::1464::::711344088362:84:1::1818164431573) is less than 2 characters long7) is less than 2 characters long) is less than 2 characters long) is less than 2 characters long
-
Originally posted by danielsbrewer:
Does anyone know whether bowtie supports aligning multiple read lengths?
I am doing small RNA Solexa sequencing and so after the adapter has been removed I end up with variable length reads. With MAQ it appears that you have to run it multiple times for the different lengths, is bowtie the same?
Basically, ELAND doesn't allow for different lengths and Bowtie does.
-
Originally posted by danielsbrewer:
Does anyone know whether bowtie supports aligning multiple read lengths?
I am doing small RNA Solexa sequencing and so after the adapter has been removed I end up with variable length reads. With MAQ it appears that you have to run it multiple times for the different lengths, is bowtie the same?
-Adam