Dear all,
I resolved this by running the gmap_build on a larger machine. I also got this error and chased down many paths, in the end it was as simple as needing more memory.
In my case, I was building hg19 to work with Pacific Biosciences ToFU command line pipeline. https://github.com/PacificBiosciences/cDNA_primer/wiki. I installed the latest gmap on an ubuntu instance, started through use of MIT's starcluster software http://star.mit.edu/cluster/about.html. Resolving the proper perl version (starcluster AMI instances are notoriously out of date, so the default perl version is too far gone, so I used the smrtanalysis version to get it correct.
So success involved first setting two environmental variables:
export PERL5LIB=/mnt/smrtanalysis/current/miscdeps/basesys/usr/lib64/perl5:/mnt/smrtanalysis/current/miscdeps/basesys/usr/lib64/perl5/5.8.8
export PATH=/usr/local/bin:/usr/bin:/bin:$PATH
After setting the path correctly got me to the point where I had the same error reported above:
Building suffix array
SACA_K called with n = 3137161265, K = 5, level 0
Killed
/usr/local/bin/gmapindex -d hg19 -F "/mnt/hg19/hg19" -D "/mnt/hg19/hg19" -S failed with return code 35072 at /mnt/\
smrtanalysis/current/analysis/bin/gmap_build line 376.
However, Genomax provided me the hint I needed. Rather than thinking I had anything else wrong, it was clearly worth trying a bigger box. Success came by running the software on a larger ubuntu instance - r3.8xlarge (240GB) machine. Which I instantiated and added to my configuration -- I logged into the new node and executed the command:
gmap_build -s none -k 15 -d hg19 -D /mnt/hg19 /mnt/hg19/hg19.fa
Successfully
Seqanswers Leaderboard Ad
Collapse
Announcement
Collapse
No announcement yet.
X
-
If you were passing that job along to a scheduler with a specific memory allocation then it would not hurt to increase that request.
My hunch is that perhaps one of the chromosome files (*random*/ *un* come to mind as a culprit) may be causing the original error. You may have already tried this but I would say add a couple more chromosomes and see if that works and after that point everything except the random/un would be the next logical step to try.
Leave a comment:
-
Excellent suggestion! It worked when I used just a single uncompressed fasta file. Does this mean this I need to simply allocate more memory for the entire process?
Leave a comment:
-
You are right there is a "-g" option mentioned for gmap_build.
Out of curiosity can you try the build with a single uncompressed chromosome fasta file to see if it goes through?
Leave a comment:
-
It appears that the build step requires sequence files to be uncompressed (https://github.com/julian-gehring/GMAP-GSNAP, look for section 4c). Have you tried using uncompressed sequence files?
Leave a comment:
-
gmap_build error
Hi guys,
I am in process of configuring GSNAP on the cluster of my university however I am repeatedly encountering an error in one step and I cant seem to solve it. I have installed the software on the cluster and am in the process of building the mm9 genome. I have followed the steps so far as per the documentation and gmap_build works fine until it reaches the step where it says on my console:
Building suffix array
SACA_K called with n = 2725765482, K = 5, level 0
It is after this step that the process crashes and gives me an error message:
/home/satyajit/GSNAP/bin/gmapindex -d mm9 -F /home/satyajit/GSNAP/gmap-2014-07-04/gmapdb/mm9 -D /home/satyajit/GSNAP/gmap-2014-07-04/gmapdb/mm9 -S failed with return code 131 at /home/satyajit/GSNAP/bin/gmap_build line 360.
I have tried to run this installation several times now and on different machines as well and every time it crashes during this particular phase of configuration. The maximum memory I have used to configure this is a 64GB RAM with 16 cores of processing power on the cluster. Is this step the most memory intensive? Does it require even more memory than the one I have used? Or am I simply doing something fundamentally wrong? I am quite frankly at a loss about how to go forward tackling this issue and any help you could provide me with would be greatly appreciated.
I plan on using GSNAP for SNP tolerant alignment in my datasets.
The command I used for gmap_build is:
gmap_build -d mm9 -g -k 15 chr1.fa.gz chr1_random.fa.gz chr2.fa.gz chr3_random.fa.gz chr3.fa.gz chr4_random.fa.gz chr4.fa.gz chr5_random.fa.gz chr5.fa.gz chr6.fa.gz chr7_random.fa.gz chr7.fa.gz chr8_random.fa.gz chr8.fa.gz chr9_random.fa.gz chr9.fa.gz chr10.fa.gz chr11.fa.gz chr12.fa.gz chr13_random.fa.gz chr13.fa.gz chr14.fa.gz chr15.fa.gz chr16_random.fa.gz chr16.fa.gz chr17_random.fa.gz chr17.fa.gz chr18.fa.gz chr19.fa.gz chrX_random.fa.gz chrX.fa.gz chrY_random.fa.gz chrY.fa.gz chrM.fa.gz chrUn_random.fa.gzLast edited by Satya; 07-15-2014, 11:42 AM.
Latest Articles
Collapse
-
by seqadmin
Innovations in next-generation sequencing technologies and techniques are driving more precise and comprehensive exploration of complex biological systems. Current advancements include improved accessibility for long-read sequencing and significant progress in single-cell and 3D genomics. This article explores some of the most impactful developments in the field over the past year.
Long-Read Sequencing
Long-read sequencing has...-
Channel: Articles
12-02-2024, 01:49 PM -
-
by seqadmin
The field of immunogenetics explores how genetic variations influence immune responses and susceptibility to disease. In a recent SEQanswers webinar, Oscar Rodriguez, Ph.D., Postdoctoral Researcher at the University of Louisville, and Ruben MartÃnez Barricarte, Ph.D., Assistant Professor of Medicine at Vanderbilt University, shared recent advancements in immunogenetics. This article discusses their research on genetic variation in antibody loci, antibody production processes,...-
Channel: Articles
11-06-2024, 07:24 PM -
ad_right_rmr
Collapse
News
Collapse
Topics | Statistics | Last Post | ||
---|---|---|---|---|
Started by seqadmin, 12-02-2024, 09:29 AM
|
0 responses
151 views
0 likes
|
Last Post
by seqadmin
12-02-2024, 09:29 AM
|
||
Started by seqadmin, 12-02-2024, 09:06 AM
|
0 responses
51 views
0 likes
|
Last Post
by seqadmin
12-02-2024, 09:06 AM
|
||
Started by seqadmin, 12-02-2024, 08:03 AM
|
0 responses
42 views
0 likes
|
Last Post
by seqadmin
12-02-2024, 08:03 AM
|
||
Started by seqadmin, 11-22-2024, 07:36 AM
|
0 responses
75 views
0 likes
|
Last Post
by seqadmin
11-22-2024, 07:36 AM
|
Leave a comment: