We are working with sequences that have different lengths (32, 31, 30, etc.) in one run. Does anyone know how to run Eland so that it allows for different lengths, or do I have to run it multiple times, once for each specified length?
-
We used to do that - run multiple times. Back when we ran eland, it required a separately compiled version for each sequence length, so at that point there was no way to handle mixed lengths in a single run. That has probably changed in the meantime.
The more you know, the more you know you don't know. - Aristotle
-
You're right - we don't do that anymore. We used to do it because the tail end of eland reads was more error prone, so we'd sequentially strip each fragment down by 2 bases at a time, and then take the longest that would align. Now, we use aligners that take the quality into account, and can do more mismatches, so it's really unnecessary.
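The old trim-and-retry approach described above can be sketched roughly as follows. This is only an illustration, not the pipeline we actually ran: `aligns` is a hypothetical callback standing in for an aligner run, and the `min_len` cutoff is an assumption added so the loop stops somewhere sensible.

```python
from typing import Callable, Iterator, Optional


def trimmed_variants(read: str, step: int = 2, min_len: int = 20) -> Iterator[str]:
    """Yield the full read, then copies with `step` more bases cut from the 3' end."""
    length = len(read)
    while length >= min_len:
        yield read[:length]
        length -= step


def longest_aligning(
    read: str,
    aligns: Callable[[str], bool],  # hypothetical: True if the aligner places this sequence
    step: int = 2,
    min_len: int = 20,
) -> Optional[str]:
    """Return the longest trimmed variant that aligns, or None if none does."""
    for candidate in trimmed_variants(read, step, min_len):
        if aligns(candidate):
            return candidate
    return None
```

In practice each candidate length meant another aligner run, which is why quality-aware aligners made this unnecessary.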
From my own experience with bowtie, I found it to be unreliable - it missed a lot of alignments - and Eland just wasn't flexible enough, so we've mainly moved to maq (with a few other aligners filling particular roles that they do well). I understand that the author of maq is now moving to a new aligner (bwa), which uses the same Burrows-Wheeler algorithm as bowtie, but should be a better implementation. I haven't tried it yet, but you might want to check that out.
As for sequences of different lengths, I think maq should be able to handle those without any fancy tricks. However, if you want to do Eland, my best guess would be to write a parser that separates the reads (and the supporting files, if you use them) into independent files, which each have their own read-length, and then align each one individually. Personally, that sounds like a rather painful way to do it...
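For what it's worth, a minimal sketch of such a parser might look like the following. It assumes the reads are in unwrapped 4-line FASTQ, which may not match your Eland input format, and the function names and the output file naming scheme are made up for illustration. Supporting files would need the same treatment.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, separator, qualities


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records (assumes sequences are not line-wrapped)."""
    stripped = (line.rstrip("\r\n") for line in lines)
    for header in stripped:
        if not header:
            continue  # skip blank lines between records
        record = (header, next(stripped, None), next(stripped, None), next(stripped, None))
        if None in record:
            raise ValueError("truncated FASTQ record: " + header)
        yield record


def group_by_length(records: Iterable[FastqRecord]) -> Dict[int, List[FastqRecord]]:
    """Bucket records by the length of their sequence line."""
    groups: Dict[int, List[FastqRecord]] = defaultdict(list)
    for record in records:
        groups[len(record[1])].append(record)
    return dict(groups)


def write_groups(groups: Dict[int, List[FastqRecord]], prefix: str) -> Dict[int, str]:
    """Write one file per read length; the naming scheme is hypothetical."""
    paths = {}
    for length, records in sorted(groups.items()):
        path = f"{prefix}.len{length}.fastq"
        with open(path, "w") as handle:
            for record in records:
                handle.write("\n".join(record) + "\n")
        paths[length] = path
    return paths
```

Each file written by `write_groups` then holds reads of a single length and can be aligned in its own run. Note that this holds all reads in memory, so a large run would need a streaming version.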
Good luck.
-
Originally posted by doxologist:
in terms of bowtie missing A LOT of alignments... what do you mean by that?
Anecdotally, I've heard the same thing from other people who've used it as well. I don't want to suggest it's a bad application - it's a giant leap forward in many respects, but I suspect the implementation needs some fine tuning.
-
This is what Ben responded:
The circumstances under which Bowtie might miss alignments that are "valid" according to its alignment policy are outlined in the manual (see last paragraph of section "Maq-like Policy"). These misses only occur in -n 2 and -n 3 modes, and they can be avoided by increasing the --maxbts parameter (at the cost of some speed). Unless your read data is very low quality, the fraction of reads missed due to the backtracking limit in -n 2 mode is generally very small (<1%).
Note that when you run 'maq' with -n 2 option (the default), it will find some alignments that actually have 3 mismatches in the seed. Bowtie will *not* report alignments with 3 mismatches in the seed unless -n 3 is specified. It's likely that this is the source of the difference that the anecdotal reports are referring to.
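As a rough illustration of the options mentioned above, here is a small sketch that assembles a bowtie command line with `-n` and `--maxbts`. The index name, file names, and the backtrack value of 800 are placeholders chosen for the example, not recommendations, and the positional argument order (index, reads, output) is my assumption about the usual bowtie invocation, so check it against the manual.

```python
from typing import List


def bowtie_command(
    index: str,               # hypothetical index basename
    reads: str,               # hypothetical reads file
    output: str,              # hypothetical output file
    seed_mismatches: int = 2,
    max_backtracks: int = 800,  # illustrative value only
) -> List[str]:
    """Build (but do not run) a bowtie argument list."""
    if seed_mismatches not in (0, 1, 2, 3):
        raise ValueError("-n must be between 0 and 3")
    cmd = ["bowtie", "-n", str(seed_mismatches)]
    # The backtracking limit only matters in -n 2 and -n 3 modes.
    if seed_mismatches >= 2:
        cmd += ["--maxbts", str(max_backtracks)]
    cmd += [index, reads, output]
    return cmd


print(" ".join(bowtie_command("my_index", "reads.fq", "hits.map")))
```

Raising the backtrack limit trades speed for fewer missed alignments, so it is worth timing a small subset of reads before committing to a value.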
-
Thanks - that doesn't seem to describe what I saw, or what was reported to me, but, as I said, it's quite possible that I did something wrong when I used it.
Edit: I should also add that I have been doing a lot of Paired End Tag sequencing lately, which is not yet supported by Bowtie - so there were other reasons for staying with Maq. I don't want to make it sound like this was the only reason we didn't stick with Bowtie.
-
I agree... I think the added functionality and versatility are the main reasons that people are sticking with MAQ. I remember reading Heng's comment somewhere that he wanted to emphasize that. I'm excited about how the SAM format will hopefully make programs better able to talk to each other and more functional.
-
I'm excited about the SAM format as well, although the prospect of implementing it in Java is quite terrifying - particularly given the claim in the SAMTools manual that Java doesn't support multipart gzip files...
I'm sure it'll be a challenge!