  • Satya
    Junior Member
    • Jul 2014
    • 3

    gmap_build error

    Hi guys,
    I am configuring GSNAP on my university's cluster, but I keep hitting an error at one step that I can't solve. I have installed the software and am building the mm9 genome. I followed the documentation, and gmap_build works fine until it prints the following on my console:

    Building suffix array
    SACA_K called with n = 2725765482, K = 5, level 0


    After this step the process crashes with the error message:

    /home/satyajit/GSNAP/bin/gmapindex -d mm9 -F /home/satyajit/GSNAP/gmap-2014-07-04/gmapdb/mm9 -D /home/satyajit/GSNAP/gmap-2014-07-04/gmapdb/mm9 -S failed with return code 131 at /home/satyajit/GSNAP/bin/gmap_build line 360.

    I have run this build several times, on different machines as well, and every time it crashes during this phase. The most memory I have used is 64 GB of RAM with 16 cores on the cluster. Is this step the most memory-intensive? Does it need even more memory than that? Or am I doing something fundamentally wrong? I am at a loss about how to tackle this issue, and any help would be greatly appreciated.
    I plan to use GSNAP for SNP-tolerant alignment of my datasets.
    The command I used for gmap_build is:

    gmap_build -d mm9 -g -k 15 chr1.fa.gz chr1_random.fa.gz chr2.fa.gz chr3_random.fa.gz chr3.fa.gz chr4_random.fa.gz chr4.fa.gz chr5_random.fa.gz chr5.fa.gz chr6.fa.gz chr7_random.fa.gz chr7.fa.gz chr8_random.fa.gz chr8.fa.gz chr9_random.fa.gz chr9.fa.gz chr10.fa.gz chr11.fa.gz chr12.fa.gz chr13_random.fa.gz chr13.fa.gz chr14.fa.gz chr15.fa.gz chr16_random.fa.gz chr16.fa.gz chr17_random.fa.gz chr17.fa.gz chr18.fa.gz chr19.fa.gz chrX_random.fa.gz chrX.fa.gz chrY_random.fa.gz chrY.fa.gz chrM.fa.gz chrUn_random.fa.gz
    Last edited by Satya; 07-15-2014, 11:42 AM.
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    It appears that the build step requires sequence files to be uncompressed (https://github.com/julian-gehring/GMAP-GSNAP, look for section 4c). Have you tried using uncompressed sequence files?


    • Satya
      Junior Member
      • Jul 2014
      • 3

      #3
      Isn't that the requirement for gmap_setup, though? I thought gmap_build would accept gzipped files with the -g option. It didn't work with uncompressed fastq files either; I just tried it to be sure.
      Last edited by Satya; 07-15-2014, 11:53 AM.


      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        You are right, there is a "-g" option mentioned for gmap_build.

        Out of curiosity, can you try the build with a single uncompressed chromosome fasta file to see if it goes through?


        • Satya
          Junior Member
          • Jul 2014
          • 3

          #5
          Excellent suggestion! It worked when I used just a single uncompressed fasta file. Does this mean that I simply need to allocate more memory for the entire process?


          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            If you were passing that job along to a scheduler with a specific memory allocation then it would not hurt to increase that request.
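Before raising a scheduler request, it can help to confirm how much memory the node or job actually sees. This is a minimal sketch, assuming a Linux node that exposes /proc/meminfo. The helper name `mem_total_gib` is made up for illustration, and how memory is requested depends on the scheduler, which this thread does not name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print MemTotal in whole GiB from a meminfo-format file.
# $1 is optional and defaults to /proc/meminfo (values there are in kB).
mem_total_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Report physical memory when /proc/meminfo is readable (Linux only).
if [ -r /proc/meminfo ]; then
    echo "Physical memory: $(mem_total_gib) GiB"
fi

# A per-process limit lower than physical memory can also stop a large build.
echo "Virtual memory limit (ulimit -v, in KiB): $(ulimit -v)"
```

Run it inside the job itself, not on the login node, since the two can have very different limits.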

            My hunch is that one of the chromosome files (the *random* and *Un* files come to mind as culprits) may be causing the original error. You may have already tried this, but I would add a couple more chromosomes and see if that works; after that, building with everything except the random/Un files would be the next logical step.
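The staged test suggested here can be sketched in shell. This assumes the UCSC-style file names from the original command (chr1.fa.gz, chr1_random.fa.gz, chrUn_random.fa.gz); the function name `core_chromosomes` and the database name `mm9_core` are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the given file names, one per line,
# leaving out the *_random and chrUn files.
core_chromosomes() {
    local f
    for f in "$@"; do
        case "$f" in
            *_random.fa*|chrUn*) ;;        # skip random/unplaced sequences
            *) printf '%s\n' "$f" ;;       # keep assembled chromosomes
        esac
    done
}

# Show which files would go into the reduced build.
core_chromosomes chr1.fa.gz chr1_random.fa.gz chrM.fa.gz chrUn_random.fa.gz

# Possible use, with the flags from the original post (not run here;
# relies on file names without spaces):
#   gmap_build -d mm9_core -g -k 15 $(core_chromosomes chr*.fa.gz)
```

If the reduced build succeeds and the full one still fails, that points at the random/Un files or at total genome size rather than at the installation.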


            • adeslat
              Junior Member
              • Mar 2011
              • 7

              #7
              Dear all,

              I resolved this by running gmap_build on a larger machine. I also got this error and chased down many paths; in the end it was as simple as needing more memory.

              In my case, I was building hg19 to work with the Pacific Biosciences ToFU command-line pipeline (https://github.com/PacificBiosciences/cDNA_primer/wiki). I installed the latest gmap on an Ubuntu instance started with MIT's StarCluster software (http://star.mit.edu/cluster/about.html). I first had to resolve the Perl version: StarCluster AMI instances are notoriously out of date and the default Perl is too old, so I used the smrtanalysis version instead.

              So success involved first setting two environment variables:

              export PERL5LIB=/mnt/smrtanalysis/current/miscdeps/basesys/usr/lib64/perl5:/mnt/smrtanalysis/current/miscdeps/basesys/usr/lib64/perl5/5.8.8
              export PATH=/usr/local/bin:/usr/bin:/bin:$PATH



              Setting the path correctly got me to the point where I hit the same error reported above:

              Building suffix array
              SACA_K called with n = 3137161265, K = 5, level 0
              Killed
              /usr/local/bin/gmapindex -d hg19 -F "/mnt/hg19/hg19" -D "/mnt/hg19/hg19" -S failed with return code 35072 at /mnt/smrtanalysis/current/analysis/bin/gmap_build line 376.
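The return code itself carries a hint. The sketch below assumes that gmap_build, a Perl script, prints the raw wait status of the child process, which the value 35072 suggests since it is larger than 255. Under that assumption the low 7 bits hold a signal number, bit 7 flags a core dump, and the next byte holds the exit code. The function name `decode_status` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode a raw POSIX wait status.
decode_status() {
    local status=$1
    local sig=$(( status & 127 ))           # signal that ended the process, if any
    local core=$(( (status & 128) >> 7 ))   # 1 if a core dump was flagged
    local code=$(( status >> 8 ))           # exit code of the process
    if [ "$sig" -ne 0 ]; then
        echo "killed by signal $sig (core dumped: $core)"
    elif [ "$code" -gt 128 ]; then
        # Shells report 128 + N when a command they ran died from signal N.
        echo "exit code $code (128 + signal $(( code - 128 )))"
    else
        echo "exit code $code"
    fi
}

decode_status 35072   # the hg19 failure above
decode_status 131     # the mm9 failure from the first post
```

For 35072 this gives exit code 137, that is 128 + 9. Signal 9 is SIGKILL on Linux, which matches the "Killed" line in the log and is what the kernel sends when it reclaims memory from a process, so it fits the out-of-memory explanation without proving it. On many Linux systems `dmesg | grep -i 'killed process'` shows whether the kernel's out-of-memory handler was responsible.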


              However, GenoMax gave me the hint I needed. Rather than assuming something else was wrong, it was clearly worth trying a bigger box. Success came from running the software on a larger Ubuntu instance, an r3.8xlarge (240GB) machine, which I instantiated and added to my configuration. I logged into the new node and executed the command:

              gmap_build -s none -k 15 -d hg19 -D /mnt/hg19 /mnt/hg19/hg19.fa

              It completed successfully.
              Last edited by adeslat; 02-15-2016, 06:07 AM.
