  • jmartin
    Member
    • Dec 2009
    • 78

    BFAST indexing memory requirements

    I'm trying to get BFAST working as an aligner to detect human contamination in a bacterial metagenomic sample (all reads will be 100-mer Illumina reads). I am using the Ensembl build 36 human genome plus some additional novel regions from 2 other human genomes sequenced at the BGI. The total db size is ~3.0Gb, but it consists of 24 chromosomes that are VERY large, plus several thousand small sequences. So it's kind of a 'lopsided' database.

    I successfully ran 'bfast fasta2brg' on the file. For the 'bfast index' step I am using the '-d 1' parameter to reduce the memory footprint. From other threads I'd gotten the idea that '-d 1' would probably keep my memory footprint down to ~8Gb, but all my blade jobs keep dying when I request only 8Gb of memory. How much memory should I expect the job to require?


    On another matter, I'm using the masks listed in the bfast manual for 'illumina reads > 40bp'. Are those good enough for aligning Illumina 100-mers, or would I be better off defining new masks? My goal is to pick out human reads from among bacterial sequences, so I believe I can be fairly relaxed in my search criteria without fear of falsely identifying bacterial reads as human.
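To put numbers on how 'lopsided' a reference like this is, a minimal sketch along these lines could tabulate sequence lengths from the FASTA file. The function names and the size cutoff are made up for illustration and are not part of BFAST; the sketch also assumes each FASTA header has a unique first word.

```python
import io


def sequence_lengths(handle):
    """Return {name: length} for every record in a FASTA text stream.

    Assumes the first word of each header is unique; records sharing
    a name would be summed together.
    """
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            # Fall back to a placeholder name for an empty header line.
            name = fields[0] if fields else "unnamed_%d" % len(lengths)
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(lengths, cutoff):
    """Split sequences into 'large' (>= cutoff bases) and 'small' groups."""
    large = [n for n in lengths.values() if n >= cutoff]
    small = [n for n in lengths.values() if n < cutoff]
    return {
        "large_count": len(large),
        "large_bases": sum(large),
        "small_count": len(small),
        "small_bases": sum(small),
    }


# Tiny in-memory example; for a real file, pass open("your_reference.fa").
demo = io.StringIO(">chr1 example\nACGT\nACGT\n>novel_1\nAC\n")
demo_lengths = sequence_lengths(demo)
demo_summary = summarize(demo_lengths, cutoff=5)
```

On a real reference, a cutoff such as 1,000,000 bases (an arbitrary choice) would separate the chromosomes from the small novel sequences.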
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    I would not expect more than 8GB to be required when creating split indexes ("-d 1"). Nevertheless, in your case it looks like more is needed. Make sure you use the multi-threaded parameter as well. Can you test with more memory?

    As for the 100bp data, those masks work great for it.

    • jmartin
      Member
      • Dec 2009
      • 78

      #3
      I was able to index successfully using 24Gb of memory per blade job. At some point I may throttle the memory down to find the minimum I can get by with for my db, which may grow somewhat.
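Rather than lowering the request by trial and error, one option is to record the peak memory of a run that succeeds. The sketch below is a generic Unix-only wrapper, not anything provided by BFAST; the function name is hypothetical, and the indexing command itself is whatever you already run (it is passed in as an argument list, not spelled out here).

```python
import resource
import subprocess
import sys


def peak_child_maxrss(command):
    """Run `command` (an argv list), wait for it, and return
    (exit_code, ru_maxrss).

    ru_maxrss is the largest peak resident set size among the child
    processes this Python process has waited for so far. The unit is
    platform dependent: kilobytes on Linux, bytes on macOS.
    """
    result = subprocess.run(command)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return result.returncode, usage.ru_maxrss


# Harmless stand-in command: a child Python process that allocates ~50 MB.
exit_code, maxrss = peak_child_maxrss(
    [sys.executable, "-c", "x = bytearray(50000000)"]
)
```

Treat the number as a rough guide only: a cluster scheduler may enforce its limit on a different measure than resident memory, so some headroom above the reported peak is probably sensible.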
