
  • adeslat
    replied
    Dear all,

    I resolved this by running gmap_build on a larger machine. I also got this error and chased down many paths; in the end, it was as simple as needing more memory.

    In my case, I was building hg19 to work with the Pacific Biosciences ToFU command-line pipeline (https://github.com/PacificBiosciences/cDNA_primer/wiki). I installed the latest gmap on an Ubuntu instance started with MIT's StarCluster software (http://star.mit.edu/cluster/about.html). I first had to resolve the Perl version: StarCluster AMI instances are notoriously out of date and the default Perl is too old, so I used the version bundled with smrtanalysis.

    So success involved first setting two environment variables:

    export PERL5LIB=/mnt/smrtanalysis/current/miscdeps/basesys/usr/lib64/perl5:/mnt/smrtanalysis/current/miscdeps/basesys/usr/lib64/perl5/5.8.8
    export PATH=/usr/local/bin:/usr/bin:/bin:$PATH



    Setting the path correctly got me to the point where I hit the same error reported above:

    Building suffix array
    SACA_K called with n = 3137161265, K = 5, level 0
    Killed
    /usr/local/bin/gmapindex -d hg19 -F "/mnt/hg19/hg19" -D "/mnt/hg19/hg19" -S failed with return code 35072 at /mnt/smrtanalysis/current/analysis/bin/gmap_build line 376.
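
The return code in the log above can be unpacked with a little arithmetic. This is only a sketch: it assumes gmap_build (a Perl script) reports the raw wait status returned by Perl's system(), in which the exit code sits in the high byte, the low 7 bits hold a terminating signal, and bit 7 flags a core dump. The helper name decode_status is hypothetical.

```shell
#!/usr/bin/env bash
# Decode a raw wait status (assumed to come from Perl's system()).
decode_status() {
    local status=$1
    local exit_code=$(( status >> 8 ))      # high byte: exit code of the child
    local signal=$(( status & 127 ))        # low 7 bits: signal that killed the child
    local core=$(( (status & 128) >> 7 ))   # bit 7: core-dump flag
    local shell_signal=0
    # Shells report "child killed by signal N" as exit code 128 + N.
    if [ "$exit_code" -gt 128 ]; then
        shell_signal=$(( exit_code - 128 ))
    fi
    echo "exit=${exit_code} signal=${signal} core=${core} shell_signal=${shell_signal}"
}

decode_status 35072   # prints: exit=137 signal=0 core=0 shell_signal=9
decode_status 131     # prints: exit=0 signal=3 core=1 shell_signal=0
```

Under that assumption, 35072 is exit code 137, that is 128 + 9, which is how a shell reports a child killed by SIGKILL. That fits the "Killed" line and is typical of the kernel's out-of-memory killer, although the log alone does not prove it. The 131 from the original post would decode to signal 3 with the core-dump bit set.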


    However, GenoMax provided the hint I needed. Rather than assuming something else was wrong, it was clearly worth trying a bigger box. Success came from running the software on a larger Ubuntu instance, an r3.8xlarge (240GB) machine, which I instantiated and added to my configuration. I logged into the new node and executed the command:

    gmap_build -s none -k 15 -d hg19 -D /mnt/hg19 /mnt/hg19/hg19.fa

    It completed successfully.
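
Before launching a build, it may be worth checking how much RAM the node actually has. A minimal sketch, assuming a Linux node that exposes /proc/meminfo; mem_total_gib and REQUIRED_GIB are hypothetical names, and the 128 GiB default is only a placeholder, since the thread does not establish an exact requirement.

```shell
#!/usr/bin/env bash
# Read /proc/meminfo-style text on stdin and print total RAM in whole GiB.
# MemTotal is reported in kB, so divide by 1024 * 1024.
mem_total_gib() {
    awk '/^MemTotal:/ { print int($2 / 1048576) }'
}

# Placeholder threshold (an assumption, not a figure from the thread).
REQUIRED_GIB=${REQUIRED_GIB:-128}

if [ -r /proc/meminfo ]; then
    total=$(mem_total_gib < /proc/meminfo)
    if [ "${total:-0}" -lt "$REQUIRED_GIB" ]; then
        echo "warning: ${total:-0} GiB RAM is below the ${REQUIRED_GIB} GiB threshold"
    else
        echo "ok: ${total} GiB RAM available"
    fi
fi
```

If a build has already been killed, the kernel log (for example via dmesg) is the usual place to look for an out-of-memory message.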
    Last edited by adeslat; 02-15-2016, 06:07 AM.


  • GenoMax
    replied
    If you were passing that job to a scheduler with a specific memory allocation, then it would not hurt to increase that request.

    My hunch is that one of the chromosome files may be causing the original error (the *random* and *Un* files come to mind as culprits). You may have already tried this, but I would add a couple more chromosomes and see if that works. If it does, the next logical step would be to try everything except the random/Un files.
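
The "everything except random/Un" step could be sketched as a filename filter followed by a dry run. The helper name primary_chromosomes and the short file list are illustrative assumptions; the -d and -k values are copied from the original post.

```shell
#!/usr/bin/env bash
# Print only the primary chromosome files, dropping *_random* and chrUn* names.
primary_chromosomes() {
    printf '%s\n' "$@" | grep -Ev '_random|chrUn'
}

# Illustrative subset of uncompressed files (not the full mm9 list).
primary=$(primary_chromosomes chr1.fa chr1_random.fa chr2.fa chrM.fa chrUn_random.fa)

# Dry run: print the command instead of executing it.
# $primary is left unquoted on purpose so each filename becomes its own argument;
# this assumes the filenames contain no spaces.
echo gmap_build -d mm9 -k 15 $primary
```

Printing the command first makes it easy to confirm which files would be included before committing to a long build.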


  • Satya
    replied
    Excellent suggestion! It worked when I used just a single uncompressed fasta file. Does this mean that I simply need to allocate more memory for the entire process?
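
One way to get from per-chromosome .fa.gz files to a single uncompressed FASTA is to decompress them into one output file. This is a sketch with a hypothetical helper, combine_fasta, and the gmap_build call in the comment reuses the -d and -k values from the original post rather than a confirmed working command.

```shell
#!/usr/bin/env bash
# Decompress the gzipped FASTA files given as arguments and write them,
# concatenated in argument order, to a single uncompressed output file.
combine_fasta() {
    local out=$1
    shift
    gzip -dc "$@" > "$out"
}

# Example (not run here; filenames are illustrative):
#   combine_fasta mm9.fa chr1.fa.gz chr2.fa.gz chrM.fa.gz
#   gmap_build -d mm9 -k 15 mm9.fa
```

The original .gz files are left untouched, so the step can be repeated with a different subset of chromosomes.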


  • GenoMax
    replied
    You are right, there is a "-g" option mentioned for gmap_build.

    Out of curiosity, can you try the build with a single uncompressed chromosome fasta file to see if it goes through?


  • Satya
    replied
    Isn't that the requirement for gmap_setup, though? I thought gmap_build would accept gzipped files when the -g option is used. I just tried it with uncompressed fasta files in case, and it didn't work either.
    Last edited by Satya; 07-15-2014, 11:53 AM.


  • GenoMax
    replied
    It appears that the build step requires sequence files to be uncompressed (https://github.com/julian-gehring/GMAP-GSNAP, look for section 4c). Have you tried using uncompressed sequence files?


  • Satya
    started a topic gmap_build error

    gmap_build error

    Hi guys,
    I am in the process of configuring GSNAP on my university's cluster, but I keep encountering an error at one step that I can't seem to solve. I have installed the software on the cluster and am now building the mm9 genome. I have followed the documentation so far, and gmap_build works fine until it reaches the step where my console shows:

    Building suffix array
    SACA_K called with n = 2725765482, K = 5, level 0


    It is after this step that the process crashes and gives me an error message:

    /home/satyajit/GSNAP/bin/gmapindex -d mm9 -F /home/satyajit/GSNAP/gmap-2014-07-04/gmapdb/mm9 -D /home/satyajit/GSNAP/gmap-2014-07-04/gmapdb/mm9 -S failed with return code 131 at /home/satyajit/GSNAP/bin/gmap_build line 360.

    I have run this installation several times on different machines, and every time it crashes during this phase. The most I have used is 64GB of RAM and 16 cores on the cluster. Is this the most memory-intensive step? Does it require even more memory than I have given it? Or am I simply doing something fundamentally wrong? Frankly, I am at a loss about how to tackle this issue, and any help would be greatly appreciated.
    I plan on using GSNAP for SNP-tolerant alignment in my datasets.
    The command I used for gmap_build is:

    gmap_build -d mm9 -g -k 15 chr1.fa.gz chr1_random.fa.gz chr2.fa.gz chr3_random.fa.gz chr3.fa.gz chr4_random.fa.gz chr4.fa.gz chr5_random.fa.gz chr5.fa.gz chr6.fa.gz chr7_random.fa.gz chr7.fa.gz chr8_random.fa.gz chr8.fa.gz chr9_random.fa.gz chr9.fa.gz chr10.fa.gz chr11.fa.gz chr12.fa.gz chr13_random.fa.gz chr13.fa.gz chr14.fa.gz chr15.fa.gz chr16_random.fa.gz chr16.fa.gz chr17_random.fa.gz chr17.fa.gz chr18.fa.gz chr19.fa.gz chrX_random.fa.gz chrX.fa.gz chrY_random.fa.gz chrY.fa.gz chrM.fa.gz chrUn_random.fa.gz
    Last edited by Satya; 07-15-2014, 11:42 AM.
