HGAP assembly failed task on SMRT Portal

rhall replied

05-26-2015, 09:50 AM
I would suggest using the coverage reported during resequencing for the most accurate measure of raw read coverage. The following script takes the coverage.bed file and outputs the median coverage for each contig: https://gist.github.com/rhallPB/7275c48d6f166e1410df
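
For anyone who wants a starting point without opening the link, here is a minimal sketch of the same idea. It is not the linked gist. It assumes coverage.bed is tab-separated, with the contig name in the first column and a numeric coverage value in the fifth column, so check that against your own file first. The function names and the `coverage_column` parameter are made up for illustration.

```python
import statistics
from collections import defaultdict


def median_coverage_per_contig(bed_lines, coverage_column=4):
    """Return {contig: median coverage} from BED-style lines.

    Assumption: tab-separated fields, contig name in column 0 and a numeric
    coverage value in `coverage_column` (0-based). Adjust for your file.
    """
    per_contig = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip BED header lines such as "track ..." or "browser ...".
        if fields[0].split(None, 1)[0] in ("track", "browser"):
            continue
        # Skip rows that are too short to contain the coverage column.
        if len(fields) <= coverage_column:
            continue
        per_contig[fields[0]].append(float(fields[coverage_column]))
    return {name: statistics.median(values) for name, values in per_contig.items()}


def contigs_at_or_above(medians, threshold):
    """Names of contigs whose median coverage is >= threshold, sorted by name."""
    return sorted(name for name, cov in medians.items() if cov >= threshold)
```

To run it on a real file, pass an open handle, for example `with open("coverage.bed") as handle: medians = median_coverage_per_contig(handle)`, then call `contigs_at_or_above(medians, threshold)` with a cutoff of your choosing to get the contig names to pull out of the assembly FASTA.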

Last edited by rhall; 05-26-2015, 09:53 AM.
cascoamarillo replied

05-26-2015, 07:10 AM
Hi guys,
Let me continue this thread with a question about the CA output. The 9-terminator/ folder contains a summary of the assembly with a Read Depth Histogram. Some contigs (consensus) have more read depth than others. I'd like to extract the subset of contigs with the maximum reported coverage. Is that possible? Thanks.
cascoamarillo replied

05-12-2015, 04:20 PM
Originally posted by GenoMax:

Is the job running for 4 days or is it waiting for 4 days to get on the cluster?

It's running. Apparently it has finished the overlapStoreBuild step and is now working on correct-frags.
GenoMax replied

05-12-2015, 10:11 AM
Originally posted by cascoamarillo:

Thank you all!
Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, it seems de novo assembly with PacBio is quite a long process. In parallel, I submitted a subset of filtered subreads (120,000) for assembly with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.

Is the job running for 4 days or is it waiting for 4 days to get on the cluster?
gconcepcion replied

05-12-2015, 08:30 AM
Originally posted by cascoamarillo:

Thank you all!
Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, it seems de novo assembly with PacBio is quite a long process. In parallel, I submitted a subset of filtered subreads for assembly with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.

Glad it worked out for you! rhall and I are both PacBio employees who visit this forum from time to time and are happy to lend a helping hand when time permits.

A lot of factors influence how long an assembly takes, e.g. cleanliness of the library prep, size of the genome, repetitiveness of the genome, ploidy, and quality and quantity of input data, and the list goes on...
cascoamarillo replied

05-12-2015, 07:59 AM
Thank you all!
Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, it seems de novo assembly with PacBio is quite a long process. In parallel, I submitted a subset of filtered subreads (120,000) for assembly with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.

Last edited by cascoamarillo; 05-12-2015, 08:30 AM.
gconcepcion replied

05-11-2015, 02:34 PM
Originally posted by cascoamarillo:

So the scheduler is killing the job, but we are not sure why the limit is being set. The job is submitted with a "hard" resource list that includes the parameter h_rt=43200, which means the job's hard wall-clock time limit is 43200 seconds (12 hours).
This resource limit isn't something imposed by the cluster. It's coming from whatever is submitting the job (SMRT Portal?).

12 hours is the per-task time limit in a default SMRT Analysis install.

You can change that by following rhall's advice in a previous post and modifying the SGE scripts to increase the hard time limit that's already preset here:
[smrtanalysis_install]/analysis/etc/cluster/SGE/interactive.tmpl*
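
Before editing anything, it can help to see which limits the templates currently carry. Below is a minimal sketch that pulls every `h_rt=` value out of a template's text and converts it to seconds. It assumes the limit appears literally as `h_rt=<value>`, written either as plain seconds or as hours:minutes:seconds (the two forms SGE accepts for time values). The function names are hypothetical, and the actual contents of the .tmpl files are not shown in this thread.

```python
import re


def h_rt_to_seconds(value):
    """Convert an h_rt value ("43200" or "12:00:00") to seconds."""
    parts = value.split(":")
    if len(parts) == 1:
        # Plain number: already in seconds.
        return int(parts[0])
    if len(parts) == 3:
        hours, minutes, seconds = (int(p) for p in parts)
        return hours * 3600 + minutes * 60 + seconds
    raise ValueError(f"unrecognised h_rt value: {value!r}")


def find_h_rt_limits(template_text):
    """Return every h_rt limit found in the text, in seconds."""
    return [h_rt_to_seconds(v) for v in re.findall(r"h_rt=([0-9:]+)", template_text)]
```

For the value reported in this thread, `h_rt_to_seconds("43200") / 3600` gives 12 hours. To scan a template, read it into a string and pass it to `find_h_rt_limits`; whatever replacement limit you choose still has to be allowed by your cluster's own queue settings.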
GenoMax replied

05-11-2015, 02:21 PM
I recall that SMRT Portal needs to be able to submit sub-jobs from the original job that gets launched. My hunch is that your SGE may not be set up to allow that. You can ask your admins to verify.
rhall replied

05-11-2015, 02:15 PM
Check the SGE template scripts in <SMRT Analysis>/analysis/etc/cluster/SGE/*.tmpl
cascoamarillo replied

05-11-2015, 02:11 PM
So the scheduler is killing the job, but we are not sure why the limit is being set. The job is submitted with a "hard" resource list that includes the parameter h_rt=43200, which means the job's hard wall-clock time limit is 43200 seconds (12 hours).
This resource limit isn't something imposed by the cluster. It's coming from whatever is submitting the job (SMRT Portal?).
cascoamarillo replied

05-11-2015, 10:15 AM
Running on a server (CentOS 6.5) with SGE: 32 CPUs and 1024 GB.

Thank you for pointing me in that direction. I'll ask my sys admin.
gconcepcion replied

05-11-2015, 10:04 AM
Assuming you're running on SGE, one of two things happened:
1) Your sys admin qdel'd your job (unlikely)
2) Your job hit a resource limit and SGE killed it automatically, because it exceeded either the time limit allowed for the job or the CPU/memory limits.

Talk with your sys admin and find out why the job may have been killed.

Alternatively, if you were running it locally, the job's memory consumption likely exceeded the system's available memory.
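
If you have access to SGE accounting, `qacct -j <job_id>` reports what the scheduler recorded for a finished job. A minimal sketch for reading that output and checking whether the job ran into a given wall-clock limit follows. It assumes the usual "key value" line layout with a `ru_wallclock` field; some Grid Engine versions append an `s` to that value, which the sketch strips. The helper names are hypothetical.

```python
import subprocess


def fetch_qacct(job_id):
    """Run `qacct -j <job_id>` and return its text output (needs SGE accounting)."""
    result = subprocess.run(
        ["qacct", "-j", str(job_id)], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_qacct(text):
    """Parse "key value" lines into a dict, keeping only the first record."""
    record = {}
    for line in text.splitlines():
        if line.startswith("="):
            # Separator line: stop once the first record has been read.
            if record:
                break
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            record[parts[0]] = parts[1].strip()
    return record


def hit_time_limit(record, h_rt_seconds):
    """True if the recorded wall-clock time reached the given limit in seconds."""
    raw = record.get("ru_wallclock")
    if raw is None:
        return False
    return float(raw.rstrip("s")) >= h_rt_seconds
```

With the job number from the log, `hit_time_limit(parse_qacct(fetch_qacct(19310)), 43200)` would tell you whether the job lasted at least 12 hours. The `failed`, `exit_status` and `maxvmem` fields in the same record are worth reading alongside it when talking to your sys admin.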
GenoMax replied

05-11-2015, 10:00 AM
What are the hardware specs of the server you are running this on? Are you using a cluster or a stand-alone server?

cascoamarillo replied

05-11-2015, 09:10 AM

Here they are:

Code:

Setting up ENV on cluster5-01.bpcservers.private for task hgapAlignForCorrection_001of006
#!/bin/bash
# Setting up SMRTpipe environment
echo "Setting up ENV on $(uname -n)" for task hgapAlignForCorrection_001of006

SEYMOUR_HOME=/smrtanalysis/install/smrtanalysis_2.3.0.140936
source $SEYMOUR_HOME/etc/setup.sh

# Create the local TMP dir if it doesn't exist
tmp_dir=$(readlink -m "/smrtanalysis/tmpdir")
if [ ! -e "$tmp_dir" ]; then
   stat=0
   mkdir -p $tmp_dir || stat=$?
   if [[ $stat -ne 0 ]]; then
       echo "SMRTpipe Unable to create TMP dir '/smrtanalysis/tmpdir' on $(uname -n)" 1>&2
       exit 1
   else
       echo "successfully created or found TMP dir '/smrtanalysis/tmpdir'"
   fi
elif [[ ! -d "$tmp_dir" ]]; then
   echo "SMRTpipe TMP /smrtanalysis/tmpdir must be a directory on $(uname -n)" 1>&2
   exit 1
fi

########### TASK metadata #############
# Task            : hgapAlignForCorrection_001of006
# Module          : P_PreAssemblerDagcon
# Module Version  : 2.1.124285
# TaskType        : None
# URL             : task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006
# createdAt       : 2015-04-28 17:47:34.515890
# createdAt (UTC) : 2015-04-28 21:47:34.515909
# ncmds           : 2
# LogPath         : /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log
# Script Path     : /smrtanalysis/userdata/jobs/016/016450/workflow/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.sh

# Input       : /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta
# Input       : /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta
# Output      : /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4
# Output      : /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4.fofn
#
########### END TASK metadata #############

cd /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon
# Writing to log file
cat /smrtanalysis/userdata/jobs/016/016450/workflow/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.sh >> /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log;



echo "Running task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 on $(uname -a)"

echo "Started on $(date -u)"
echo 'Validating existence of Input Files'
if [ -e /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta ]
then
echo 'Successfully found /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta'
else
echo 'WARNING: Unable to find necessary input file, or dir /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta.'
fi
if [ -e /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta ]
then
echo 'Successfully found /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta'
else
echo 'WARNING: Unable to find necessary input file, or dir /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta.'
fi
echo 'Successfully validated input files'

# Task hgapAlignForCorrection_001of006 commands:


# Completed writing Task hgapAlignForCorrection_001of006 commands


# Task 1
blasr /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta -out /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4 -m 4 -nproc 1 -bestn 10 -nCandidates 10 -noSplitSubreads -minReadLength 200 -maxScore -1000 -maxLCPLength 16 || exit $?
echo "Task 1 completed at $(date)"
# Task 2
echo /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4 > /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4.fofn || exit $?
echo "Task 2 completed at $(date)"



rcode=$?
echo "Finished on $(date -u)"
echo "Task hgapAlignForCorrection_001of006 with nproc 1 with exit code ${rcode}."
exit ${rcode}
Running task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 on Linux cluster5-01.bpcservers.private 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Started on Wed Apr 29 01:52:18 UTC 2015
Validating existence of Input Files
Successfully found /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta
Successfully found /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta
Successfully validated input files
[INFO] 2015-04-28T21:52:18 [blasr] started.
# Writing stdout and stderr from Popen:
Your job 19310 ("Phga016450") has been submitted
Job 19310 exited because of signal SIGKILL

SIGKILL??
Thanks


gconcepcion replied

05-11-2015, 09:01 AM

Originally posted by cascoamarillo:

Hi

I have 10 SMRT cells that I've been playing with, but de novo assembly with the RS_HGAP_Assembly.2 and .3 protocols on SMRT Portal shows an error message partway through the process. I'm using smrtanalysis_2.3.0.140936.run with smrtanalysis-patch_2.3.0.140936.p3.run.

Code:

[INFO] 2015-04-28 22:02:21,943 [smrtpipe.status refreshTargets 409] Workflow Completion Status 139/212 in ( ...... 65%) tasks completed.
[ERROR] 2015-04-29 09:52:35,198 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_006of006
[ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_004of006
[ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_005of006
[ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_003of006
[ERROR] 2015-04-29 09:52:35,200 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006
[ERROR] 2015-04-29 09:52:35,200 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_002of006
[INFO] 2015-04-29 09:52:35,212 [smrtpipe.status execute 627] Found 6 failed tasks.
[INFO] 2015-04-29 09:52:35,213 [smrtpipe.status execute 629] task hgapAlignForCorrection_004of006 FAILED
[INFO] 2015-04-29 09:52:35,213 [smrtpipe.status execute 629] task hgapAlignForCorrection_003of006 FAILED
[INFO] 2015-04-29 09:52:35,214 [smrtpipe.status execute 629] task hgapAlignForCorrection_006of006 FAILED
[INFO] 2015-04-29 09:52:35,214 [smrtpipe.status execute 629] task hgapAlignForCorrection_001of006 FAILED
[INFO] 2015-04-29 09:52:35,215 [smrtpipe.status execute 629] task hgapAlignForCorrection_002of006 FAILED
[INFO] 2015-04-29 09:52:35,215 [smrtpipe.status execute 629] task hgapAlignForCorrection_005of006 FAILED
[ERROR] 2015-04-29 09:53:06,302 [SMRTpipe.SmrtPipeMain run 608] SmrtExit task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_004of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_003of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_006of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_002of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_005of006 Failed

That's the part of the log file where the errors start showing up. I'm not sure where the problem is.
Could it be due to a genome size limitation? I'm working with a eukaryotic genome (ca. 200 Mb).
Thanks

At this point the failure is in the alignment-for-correction phase, which occurs before the genome size information is used to restrict the number of reads going into the assembly. This failure is unrelated to the genome size setting, though that will come into play later.

Can you post the contents of:

[JOB_DIR]/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log
