Hello Everyone,
If anyone has had a similar problem and would like to share some thoughts, any help is appreciated.
I am currently trying to assemble mammalian paired-end DNA-seq reads from Illumina using ABySS. The problem is that ABySS gets stuck during the assembly and never finishes (or at least it had not after seven days).
The data consist of two 80 GB FASTQ files that I fed to ABySS 1.5.2. After several days, the terminal still shows:
0: Reading `/data/reads_1.fq'...
1: Reading `/data/reads_2.fq'...
It always stays at this output when I give it multiple files, and I can see that no data is actually being read into RAM.
When I merge the files into a single one, the program reads data into RAM, gets as far as "Finding adjacent k-mer..." and stays there.
I can see that the processes are still running, but no files are written to disk. I am working on a single node with 32 processors and 720 GB RAM. The server uses SLURM and Open MPI version 1.8.5.
I have read that the MPI eager limit is sometimes too small, so I set it to a higher value using a given formula, but that didn't solve the problem.
The command I use:
Code:
sbatch --job-name=abyss --partition=hive --nodes=1 --ntasks-per-node=32 --wrap " abyss-pe k=51 n=10 name=test np=32 in='/data/reads_1.fq /data/reads_2.fq' "
EDIT
Thanks pmiguel for the remark.
With the "v=-vv" option, I got additional output. It seems that ABySS really does take that long to load the reads. In the beginning, it takes about 35 seconds to read 100,000 reads. That then increases to about 1 minute, and so on. I have the feeling it's the server that is so slow.
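As a rough sanity check on those numbers, here is a small Python sketch that estimates how long loading one file would take at the observed rates. The file size (80 GB) and the rates (35 s and 60 s per 100,000 reads) are from my log; the value of about 250 bytes per FASTQ record is only an assumption (roughly 100 bp reads plus header and quality lines), and the function name is just for illustration.

```python
def estimate_load_hours(file_bytes, bytes_per_record, seconds_per_100k_reads):
    """Rough time in hours to load one FASTQ file at a constant read rate."""
    n_reads = file_bytes / bytes_per_record               # estimated number of reads
    seconds = n_reads / 100_000 * seconds_per_100k_reads  # total loading time
    return seconds / 3600

FILE_BYTES = 80e9        # from the post: 80 GB per file
BYTES_PER_RECORD = 250   # ASSUMPTION: ~100 bp reads, 4 lines per record

for rate in (35, 60):    # seconds per 100,000 reads, as seen in the verbose log
    hours = estimate_load_hours(FILE_BYTES, BYTES_PER_RECORD, rate)
    print(f"{rate} s per 100k reads -> about {hours:.0f} h per file")
```

Under these assumptions, loading alone would take more than a day per file, and longer still if the rate keeps dropping, which would fit with the job looking stuck at the "Reading" lines.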