  • How to speed up BFAST run time?

    Hi guys,

    BFAST was warmly recommended to me, so I decided to try it on my SOLiD human sequencing data.
    I got to the 'bfast match' stage, but it takes ages to run. Going from a single thread to multiple did not affect the running time significantly (what threading library is used here?).

    I'll be grateful if someone can share their experience on how to speed this process up.

    Thanks,
    Amit

  • #2
    With shorter reads, you would speed up your alignments by creating more than one index. With reads longer than 100 bp, you could get away with one index on a human-sized genome. The authors use 10 indexes for the human genome in their example.

    How many did you create?


    • #3
      I used a single index.

      How do you create multiple indexes?
      And what do you do with them? Do you then run multiple alignments in parallel?

      Amit


      • #4
        Take a look at the supplementary material that goes with the original BFAST publication. It explains all of the parameters in detail. It is quite complex, and it requires some exploration to find parameters that suit your genome of interest. Luckily for you, the authors provide 10 binary keys that you can use to build 10 indexes for the human genome. If you follow the methods from the publication, you should get decent results in minimal time (they optimized their protocol against a human genome).


        • #5
          P.S. Multiple indexes are run separately, but in total they will require more memory.


          • #6
            Thanks for the help, but I don't completely understand.
            How can multiple binary keys speed up the search?


            • #7
              The publication explains this.
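
A toy illustration of the general idea may help here. It assumes that a "binary key" is a mask of 1s and 0s that selects which read positions go into the lookup key. A read with a mismatch can miss under one mask yet still give an exact hash lookup under another, so candidate locations come from table lookups rather than from comparing the read against every position. This is only a sketch with made-up names (`masked_key`, `build_index`, `candidate_positions`) and tiny made-up masks; it is not BFAST's implementation or its real keys.

```python
from collections import defaultdict

def masked_key(seq, mask):
    """Keep only the bases at positions where the mask has a '1'."""
    return "".join(base for base, bit in zip(seq, mask) if bit == "1")

def build_index(reference, mask):
    """Map each masked key to the reference positions where it occurs."""
    index = defaultdict(list)
    width = len(mask)
    for pos in range(len(reference) - width + 1):
        index[masked_key(reference[pos:pos + width], mask)].append(pos)
    return index

def candidate_positions(read, indexes):
    """Union of candidate read start positions over all (mask, index) pairs."""
    hits = set()
    for mask, index in indexes:
        width = len(mask)
        for offset in range(len(read) - width + 1):
            key = masked_key(read[offset:offset + width], mask)
            for pos in index.get(key, ()):
                hits.add(pos - offset)
    return hits

# Hypothetical toy data: the read is reference[8:16] with one mismatch (A -> C).
reference = "ACGTACCTGAGTCATGGCTAAGCT"
read = "GCGTCATG"

mask_a, mask_b = "11110000", "00001111"
index_a = build_index(reference, mask_a)
index_b = build_index(reference, mask_b)

one_index = candidate_positions(read, [(mask_a, index_a)])
two_indexes = candidate_positions(read, [(mask_a, index_a), (mask_b, index_b)])
print(one_index, two_indexes)
```

In this toy case the mismatch falls under the first mask, so that index alone finds nothing, while the second mask ignores the mismatched position and recovers the true location. Each extra index is a separate table, which is why more indexes cost more memory.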


              • #8
                Do you work with the human genome?
                If you do, can you share the commands you use when you run it and how long it takes?

                Thanks


                • #9
                  I don't work on the human genome. You can get all the info you need from the paper to duplicate their run.
                  From what I read, for the 10 indexes of the human genome they used:
                  a key size of 22
                  a hash width of 14
                  K cals (k) of 8
                  M = 1280


                  • #10
                    Sorry, I read the supplements but I still don't understand.

                    Do you use BFAST regularly?
                    How did you index your reference?


                    • #11
                      No, I don't. Email the BFAST mailing list for help.
