  • guillaum
    Junior Member
    • Apr 2010
    • 3

    Bfast index creation

    Hi,

    I am trying Bfast for the first time and am having trouble creating the index for the whole human genome.
    I want to create the 10 indexes as shown in the Bfast paper, but each one takes a very long time to build (10 to 20 hours). Memory consumption goes up to 33 GB, which causes difficulties and may explain the long running time, since the system may be swapping heavily. (I have a 64 GB system, but I am not the only user.)

    What is the typical running time for index creation on the human genome?

    I understand that if memory is the issue, I could try the "-d" parameter to split the index into parts, which leads to my second question:
    Does index splitting have any performance impact on the next step of the algorithm, finding candidate alignment locations, and to what extent?
    (I suppose it has an impact; otherwise index splitting would be done by default, wouldn't it?)


    Thanks!
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    For the human genome, index creation can easily run on an 8-core machine in 5-6 hours (remember to use multi-threading). Also, I regularly build such indexes on 32GB RAM machines. Could you give the command you are using to create the indexes?

    Index splitting (beyond "-d 1") has a significant performance impact, as it requires expensive merging of each of the split indexes. For "-d 1", where the indexes are split into four pieces, the performance decrease (of the "match" step) is not too bad. If you have 24 GB or more of RAM, you should not need to split the indexes.
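
The rule of thumb above can be turned into a quick check on a shared machine. This is only a sketch: the 24 GB threshold comes from this post, while the helper names `mem_available_gib` and `suggest_depth` are made up for illustration, and the parser assumes Linux `/proc/meminfo`-style text with a `MemAvailable:` line.

```python
def mem_available_gib(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text in GiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this field in kB (KiB)
            return kib / 1024 ** 2
    return None


def suggest_depth(available_gib, no_split_gib=24.0):
    """Return 0 (no split) or 1 (split with "-d 1"), following the thread's rule of thumb.

    The 24 GB default is the figure quoted in the post, not a value taken from BFAST itself.
    """
    return 0 if available_gib >= no_split_gib else 1


# Hypothetical sample standing in for the contents of /proc/meminfo.
sample = "MemTotal:       67108864 kB\nMemAvailable:   20971520 kB\n"
available = mem_available_gib(sample)
print(f"{available:.1f} GiB available, suggested depth: {suggest_depth(available)}")
```

On a real Linux system the sample string could be replaced with the text read from `/proc/meminfo`; other users' jobs can change the available figure at any time, so the result is only a snapshot.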


    • guillaum
      Junior Member
      • Apr 2010
      • 3

      #3
      Thanks for this information. So I should run it with "-d 1" if I have less than 24 GB available.

      The command I used was

      ./bin/bfast index -f all.fasta -A 0 -m 1111011010001000110101100101100110100111 -t -w 14 -i 9
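
For readers checking a command like this one, a small sanity check of the "-m" mask can catch copy-and-paste errors before a multi-hour run. The sketch below only counts characters; it assumes the mask is a string of 0s and 1s in which 1s mark the positions used by the index, and `mask_stats` is a hypothetical helper, not part of BFAST.

```python
def mask_stats(mask):
    """Return (mask length, number of 1s) for a 0/1 mask string."""
    if not mask or set(mask) - {"0", "1"}:
        raise ValueError("mask must be a non-empty string of 0s and 1s")
    return len(mask), mask.count("1")


# The mask from the command above.
length, ones = mask_stats("1111011010001000110101100101100110100111")
print(f"mask length: {length}, positions set to 1: {ones}")
```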


      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        With "-d 1" it runs comfortably in 8GB of RAM. How big is all.fasta (3.2x10^9)?
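
One way to answer the size question is to count the sequence characters in the FASTA file. This is a minimal sketch that assumes a plain-text FASTA where header lines start with ">"; the function name `count_bases` is made up here, and the count includes N characters.

```python
def count_bases(lines):
    """Count sequence characters in FASTA lines, skipping headers and blank lines."""
    total = 0
    for line in lines:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


# Tiny in-memory example; for a real file, pass the open file handle instead:
#     with open("all.fasta") as fh:
#         print(count_bases(fh))
example = [">chr1\n", "ACGT\n", "NNAC\n", ">chr2\n", "GG\n"]
print(count_bases(example))
```

Reading line by line keeps memory use small even for a whole-genome file, since no sequence is held in memory.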
