  • Hi Ines,

    No, less memory != fast search. Not necessarily. (If the difference in footprint allows the problem to fit into a faster memory, then yes, less memory == fast search.) In this case, the BWT technique offers a combination of small memory footprint (far smaller than suffix arrays, suffix trees, and smaller than hash tables when the tables are built over the reference genome), and good performance. People often ask why BWT is faster than hash tables in certain situations, and it's hard to answer because so much depends on exactly what hash-based tool you're comparing against and what the reads and alignment policy look like. I suspect it chiefly comes down to minimizing cache misses and minimizing wasted work.

    Thanks,
    Ben
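For readers who have not seen the technique Ben describes, here is a toy sketch of BWT "backward search" in Python. It is not Bowtie's implementation: the function names are made up for illustration, the transform is built naively by sorting rotations, and the occurrence table is stored uncompressed (real aligners sample and compress it). The point it shows is that counting exact matches takes one step per pattern character, independent of the text length.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform built naively from sorted rotations (toy sizes only)."""
    text += "$"  # sentinel; sorts before A, C, G, T in ASCII
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def build_index(bwt_str):
    """Return (C, occ): C[c] = number of characters smaller than c,
    occ[c][i] = number of occurrences of c in bwt_str[:i]."""
    counts = Counter(bwt_str)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    occ = {c: [0] * (len(bwt_str) + 1) for c in counts}
    for i, ch in enumerate(bwt_str):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ


def count_matches(text, pattern):
    """Count exact occurrences of pattern in text using backward search."""
    b = bwt(text)
    C, occ = build_index(b)
    lo, hi = 0, len(b)  # current range of rows in the sorted rotation matrix
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

For example, `count_matches("GATTACATTAC", "TTAC")` walks the pattern right to left in four steps and narrows the row range to the two suffixes that start with TTAC.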

    Comment


    • In fact, Bowtie, like the other two BWT-based aligners, gives less information than Eland and Maq: it does not report information on suboptimal hits (e.g. the counts of 1-mismatch and 2-mismatch hits). This is one of the key factors that make Bowtie faster. The speed of Eland/Maq will remain the same if we do not ask them to report the counts, because they check them anyway, but the speed of Bowtie/SOAP2/BWA will be reduced a lot if they have to count. They would probably be slower than Eland if we asked them to always do this counting for 32bp reads. Fortunately, the count of 2-mismatch hits is not frequently used, and using this information or not does not affect SNP accuracy much (though it does affect it a little). In short, BWT-based aligners trade some minor information (and a little accuracy as well) for a large gain in speed, given high-quality reads. All in all, BWT-based aligners are worth trying.

      Comment


      • Hi Ben,

        Quick question - if you supply paired end arguments (--ff / -I / etc...) , but only supply a <singles> file (not <mates1/2>), will the PE args just be ignored?

        Similarly, if I specify --maxbts 400 (for example) as well as --best, will the --best overrule the --maxbts?

        Cheers,

        Ieuan
        Last edited by ieuanclay; 06-15-2009, 10:39 AM.

        Comment


        • also, i just noticed a typo in the manual (0.9.9.3) : for the --fr/rf/ff docs, you call --ff --ll. Maybe this isn't a typo and i am just being really dumb...

          Ieuan

          Comment


          • Hi Ieuan,

            Originally posted by ieuanclay View Post
            Quick question - if you supply paired end arguments (--ff / -I / etc...) , but only supply a <singles> file (not <mates1/2>), will the PE args just be ignored?
            Yes, they will.

            Originally posted by ieuanclay View Post
            Similarly, if I specify --maxbts 400 (for example) as well as --best, will the --best overrule the --maxbts?
            --best and --maxbts are compatible, so, no, --best does not overrule it. -y and --maxbts are mutually exclusive. If both are specified, -y will prevail.

            Thanks,
            Ben
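The two answers above could be folded into a small pre-flight check in a wrapper script. This is only a sketch under assumptions: the function name and the `has_mates` flag are hypothetical, the list of paired-end options is limited to the ones named in this thread, and the behaviour encoded is simply what Ben states here for this Bowtie version.

```python
# Paired-end options mentioned in this thread; the real list may be longer.
PAIRED_ONLY = ("--ff", "--fr", "--rf", "-I")


def check_bowtie_args(args, has_mates):
    """Return human-readable warnings for the option combinations discussed above.

    args      -- list of command-line tokens intended for bowtie
    has_mates -- True if <mates1>/<mates2> files are supplied (hypothetical flag)
    """
    warnings = []
    if not has_mates:
        ignored = [a for a in args if a in PAIRED_ONLY]
        if ignored:
            warnings.append(
                "paired-end options ignored for unpaired input: " + " ".join(ignored)
            )
    if "-y" in args and "--maxbts" in args:
        warnings.append("-y and --maxbts are mutually exclusive; -y will prevail")
    # --best together with --maxbts is fine, so no warning for that pair.
    return warnings
```

Calling it with `["--ff", "-I", "250", "--best", "--maxbts", "400"]` and `has_mates=False` yields a single warning about the ignored paired-end options.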

            Comment


            • Great, thanks. I am writing an old skool Perl GUI wrapper to handle all the input, fork off several alignment runs, and save/load the parameters I used. (The forking is separate from the internal forking that bowtie already does, i.e. if I want to align 4 files, it will run them as parallel children, keeping any output separate.) It is really only for my own use, but I can send it to you if you are interested. Please tell me to bugger off if I am stepping on toes!

              Ieuan
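The wrapper itself is not shown in the thread, and it is written in Perl. As a rough illustration of the "parallel children, separate output" idea only, here is a minimal Python sketch. The function names, the index name, and the `.map` output suffix are hypothetical, and the positional order (index, reads file, hits file) is an assumption about the bowtie command line that should be checked against the manual for your version.

```python
import subprocess
from pathlib import Path


def build_jobs(index, read_files, out_dir, extra_args=()):
    """Build one bowtie command per reads file, each with its own output file.

    Assumes the command shape: bowtie [options] <index> <reads> <hits>.
    """
    jobs = []
    for reads in read_files:
        out = Path(out_dir) / (Path(reads).stem + ".map")  # hypothetical naming scheme
        cmd = ["bowtie", *extra_args, index, str(reads), str(out)]
        jobs.append((cmd, out))
    return jobs


def run_jobs(jobs):
    """Start every job as a child process, then wait for all of them.

    Returns the list of exit codes in job order.
    """
    procs = [subprocess.Popen(cmd) for cmd, _ in jobs]
    return [p.wait() for p in procs]
```

Keeping command construction separate from execution makes it easy to save the exact parameters of a run, which is the other thing the wrapper is meant to do.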

              Comment


              • Originally posted by ieuanclay View Post
                also, i just noticed a typo in the manual (0.9.9.3) : for the --fr/rf/ff docs, you call --ff --ll. Maybe this isn't a typo and i am just being really dumb...
                You're absolutely right. Sorry about that. I just fixed it on the web version of the manual and in the Bowtie repository. The fix to the MANUAL file in the Bowtie download will be reflected in the next release (after 0.10.0).

                Thanks,
                Ben

                Comment


                • Hi Ieuan,

                  If you don't mind sharing it, sure, please send it along. Note that the --mm option in the 0.10.0 release of Bowtie might be helpful to you if you're (a) running many concurrent bowtie processes that are searching against the same large index, and (b) memory is tight.

                  Thanks,
                  Ben

                  Comment


                  • I'll have a look, and no i don't mind sharing, as long as you promise not to laugh!

                    Ieuan

                    Comment


                    • Faster search for Bowtie

                      Hi lh3, you are right. The speed is dependent on the information it has to report.

                      For instance, I ran bowtie with the -m parameter set to 1 and it took 3 hours, while previously it took 15 min. I think it is very impressive that bowtie can do the alignment in 15 minutes. When I set "-m 1", bowtie suppresses all alignments for any read with more than 1 alignment (while previously it didn't suppress any alignments). I believe that, with a limit set for -m, bowtie has to process more information and thus takes more time.

                      Originally posted by lh3 View Post
                      The speed of Eland/Maq will remain the same if we do not ask them to report the counts because they check them anyway, but the speed of Bowtie/SOAP2/BWA will be reduced a lot..
                      Perhaps MAQ should have some more optional parameters. For example, we could choose to have all the reports we have now in MAQ, or we could choose to make it faster with fewer counts, and so on. This would be great.
                      Last edited by inesdesantiago; 06-16-2009, 05:50 PM.

                      Comment


                        • Hi Ben,

                          Wanted to report that bowtie does not ever 'finish', i.e. return the command line prompt and in 'top' it reports as still active, even though it has not written anything to file in a long time.

                          I saw this first on one machine but thought it was just something wrong with my install but now I have seen it on two machines. Are you familiar with this? I suppose it could still be something about my install.

                          I am running it in Ubuntu 8.04 on an IBM Intellistation (64-bit machine).

                          Chuck

                          Comment


                          • Originally posted by inesdesantiago View Post
                            When I set "-m 1", bowtie suppresses all alignments for any read with more than 1 alignment (while previously it didn't suppress any alignments). I believe that, with a limit set for -m, bowtie has to process more information and thus takes more time.
                            Note that a way to make this alignment scenario (-m 1 without --best --strata) far more efficient is to use a repeat-masked reference index and omit the -m 1 option.

                            Ben

                            Comment


                            • Originally posted by chuck View Post
                              Wanted to report that bowtie does not ever 'finish', i.e. return the command line prompt and in 'top' it reports as still active, even though it has not written anything to file in a long time.
                              Hi Chuck,

                              Please post the exact Bowtie version and arguments you're using. Also, please let me know if you see this problem when you use the latest version of Bowtie (0.10.0).

                              Thanks,
                              Ben

                              Comment


                              • Originally posted by Ben Langmead View Post
                                Note that a way to make this alignment scenario (-m 1 without --best --strata) far more efficient is to use a repeat-masked reference index and omit the -m 1 option.

                                Ben
                                Would it also be valid to use the -k 2 option and throw out reads for which two alignments are reported? This is slower than alignment against a masked genome but faster than -m 1.
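The post-filtering step proposed here could look roughly like the following Python sketch. It assumes the alignments were written in a tab-separated format with the read name in the first column and one line per reported alignment; the function name and the sample records in the usage note are made up for illustration. Whether this gives the same set of reads as -m 1 is exactly the question being asked, so treat it as a sketch of the idea and not as a confirmed equivalent.

```python
from collections import Counter


def unique_alignments(lines):
    """Keep only records whose read name (first tab-separated field) appears once.

    With -k 2, a read that aligns to two or more places is reported twice,
    so any read name seen more than once is dropped entirely.
    """
    records = [ln.rstrip("\n").split("\t") for ln in lines if ln.strip()]
    hits = Counter(rec[0] for rec in records)
    return [rec for rec in records if hits[rec[0]] == 1]
```

Given three made-up lines where `read2` is reported on both `chr1` and `chr2` and `read1` is reported once, only the `read1` record is returned. Note that the whole file is held in memory; for large runs a two-pass version that counts names first would be more appropriate.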

                                Comment
