  • rhall
    replied
    I would suggest using the coverage reported during the resequencing for the most accurate measure of raw read coverage. The following script takes the coverage.bed file and outputs a median coverage for each contig https://gist.github.com/rhallPB/7275c48d6f166e1410df
    Last edited by rhall; 05-26-2015, 09:53 AM.
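
In case the gist is unavailable, here is a minimal sketch of the same idea. It is not the linked script. It assumes coverage.bed is tab-separated, with the contig name in column 1 and the per-window coverage in column 5, and that header lines start with `track`, `browser`, or `#`. Check those assumptions against your own file before trusting the numbers.

```python
import statistics
from collections import defaultdict


def median_coverage_per_contig(bed_lines, contig_col=0, cov_col=4):
    """Return {contig: median coverage} from BED-like lines.

    Column positions are assumptions (0-based); adjust them to your file.
    """
    per_contig = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        per_contig[fields[contig_col]].append(float(fields[cov_col]))
    return {name: statistics.median(vals) for name, vals in per_contig.items()}


# Toy input, made up for illustration only.
example = [
    "track name=meanCoverage",
    "contig_1\t0\t100\tmeanCov\t38.0\t+",
    "contig_1\t100\t200\tmeanCov\t40.0\t+",
    "contig_1\t200\t300\tmeanCov\t95.0\t+",
    "contig_2\t0\t100\tmeanCov\t10.0\t+",
    "contig_2\t100\t200\tmeanCov\t14.0\t+",
]
medians = median_coverage_per_contig(example)
for name, cov in sorted(medians.items()):
    print(name, cov)
```

With a real file you would pass an open file handle instead of the list, e.g. `median_coverage_per_contig(open("coverage.bed"))`. The median is used rather than the mean so that a few collapsed-repeat windows do not dominate a contig's value.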


  • cascoamarillo
    replied
    Hi guys,
    Let me continue this thread with a question about the CA output. In the 9-terminator/ folder there is a summary of the assembly with a Read Depth Histogram. Some contigs (consensus) have more read depth than others. I'd like to extract the subset of contigs with the maximum coverage reported. Is it possible to do that? Thanks.
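
One way to approach this, sketched under assumptions: given a per-contig coverage table, keep the contigs within some fraction of the highest value and pull their records out of the assembly FASTA. The function names, the 0.9 fraction, and the toy data are all hypothetical, and the contig names in the coverage table must match the FASTA headers for this to work.

```python
def top_coverage_contigs(coverage, fraction=0.9):
    """Names of contigs whose coverage is >= fraction * the maximum coverage."""
    if not coverage:
        return []
    cutoff = fraction * max(coverage.values())
    return sorted(name for name, cov in coverage.items() if cov >= cutoff)


def filter_fasta(fasta_lines, keep):
    """Return only the FASTA lines belonging to records named in `keep`.

    A record's name is taken to be the first whitespace-delimited word
    after '>' in its header line.
    """
    keep = set(keep)
    kept, writing = [], False
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            writing = bool(parts) and parts[0] in keep
        if writing:
            kept.append(line)
    return kept


# Toy data, made up for illustration only.
coverage = {"ctg1": 40.0, "ctg2": 12.0, "ctg3": 38.5}
fasta = [">ctg1 len=8", "ACGTACGT", ">ctg2 len=4", "GGCC", ">ctg3 len=4", "TTAA"]

best = top_coverage_contigs(coverage, fraction=0.9)
subset = filter_fasta(fasta, best)
print("\n".join(subset))
```

Setting `fraction=1.0` keeps only the contig(s) with the single highest value; a lower fraction is usually more useful because coverage is noisy.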


  • cascoamarillo
    replied
    Originally posted by GenoMax View Post
    Is the job running for 4 days or is it waiting for 4 days to get on the cluster?
    It's running. Apparently it has finished the overlapStoreBuild step and is now working on correct-frags.


  • GenoMax
    replied
    Originally posted by cascoamarillo View Post
    Thank you all!
    Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, it seems de novo assembly with PacBio is quite a long process. In parallel, I submitted a subset of filtered subreads (120,000) to be assembled with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.
    Is the job running for 4 days or is it waiting for 4 days to get on the cluster?


  • gconcepcion
    replied
    Originally posted by cascoamarillo View Post
    Thank you all!
    Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, it seems de novo assembly with PacBio is quite a long process. In parallel, I submitted a subset of filtered subreads to be assembled with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.
    Glad it worked out for you! rhall and I are both PacBio employees who frequent this forum from time to time and are happy to lend a helping hand when time permits.

    A lot of factors influence how long an assembly takes, e.g. cleanliness of the library prep, size of the genome, repetitiveness of the genome, ploidy, quality and quantity of input data, and the list goes on...


  • cascoamarillo
    replied
    Thank you all!
    Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, it seems de novo assembly with PacBio is quite a long process. In parallel, I submitted a subset of filtered subreads (120,000) to be assembled with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.
    Last edited by cascoamarillo; 05-12-2015, 08:30 AM.


  • gconcepcion
    replied
    Originally posted by cascoamarillo View Post
    So the scheduler is killing the job, but we are not sure why the limit is being set. The job is being submitted with a "hard" resource list that includes the parameter h_rt=43200. This means the hard real-time limit on the job's lifespan is 43200 seconds (12 hours).
    This resource limit isn't something imposed by the cluster. It's coming from whatever is submitting the job (SMRT Portal?).
    12 hours is the default time limit per task in a default SMRT Analysis install.

    You can change that by following rhall's advice in a previous post and modifying the SGE scripts to increase the hard time limit that's already preset here:
    [smrtanalysis_install]/analysis/etc/cluster/SGE/interactive.tmpl*
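
As a quick check of the arithmetic and of what such an edit amounts to, here is a small sketch. The template line in `sample` is hypothetical and not copied from a real SMRT Analysis install, and the helper names are made up; the only thing taken from the thread is the `h_rt=43200` parameter. It handles only the plain-seconds form seen above.

```python
import re

# Matches the plain-seconds form of the SGE hard run-time limit, e.g. h_rt=43200.
H_RT = re.compile(r"h_rt=(\d+)")


def find_h_rt(text):
    """Return every h_rt value (in seconds) found in the text."""
    return [int(value) for value in H_RT.findall(text)]


def set_h_rt(text, seconds):
    """Return the text with every h_rt value replaced by `seconds`."""
    return H_RT.sub(f"h_rt={int(seconds)}", text)


# Hypothetical template line, for illustration only.
sample = "qsub -l h_rt=43200 -pe smp ${NPROC} ${CMD}"

limits = find_h_rt(sample)
hours = [seconds / 3600 for seconds in limits]   # 43200 s / 3600 = 12 h
updated = set_h_rt(sample, 7 * 24 * 3600)        # raise the limit to one week

print(limits, hours)
print(updated)
```

Back up the original .tmpl files before changing them, and confirm with your sys admin that the queue itself allows the longer limit.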


  • GenoMax
    replied
    I recollect that SMRTportal needs to be able to submit sub-jobs from the original job that gets launched. My hunch is that your SGE may not be set up to allow that. You can ask your admins to verify.


  • rhall
    replied
    Check the sge template scripts in <SMRT Analysis>/analysis/etc/cluster/SGE/*.tmpl


  • cascoamarillo
    replied
    So the scheduler is killing the job, but we are not sure why the limit is being set. The job is being submitted with a "hard" resource list that includes the parameter h_rt=43200. This means the hard real-time limit on the job's lifespan is 43200 seconds (12 hours).
    This resource limit isn't something imposed by the cluster. It's coming from whatever is submitting the job (SMRT Portal?).


  • cascoamarillo
    replied
    Running on a server (CentOS 6.5) with SGE: 32 CPUs and 1024 GB.

    Thank you for pointing me in that direction. I'll ask my sys admin.


  • gconcepcion
    replied
    Assuming you're running on SGE, one of two things happened:
    1) Your sys admin qdel'd your job (unlikely)
    2) Your job hit a resource limit, and SGE killed it automatically because it exceeded either the time limit allowed for the job or the CPU/memory limits.

    Talk with your sys admin and find out why the job may have been killed.

    Alternatively, if you were running it locally, the job's memory consumption likely exceeded the system's available memory.
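
Some background that may help when reading logs or accounting records: SIGKILL is signal 9 and cannot be caught by the process, which is why a killed task leaves no error message of its own. By shell convention, a process killed by signal N is reported with exit status 128 + N, so a status of 137 points to SIGKILL. The sketch below decodes such a status on a POSIX system; the statuses passed in are illustrative, not taken from this job, and the function name is made up.

```python
import signal


def describe_exit(status):
    """Describe a shell-style exit status (128 + N means killed by signal N)."""
    if status > 128:
        signum = status - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # Not a signal number known on this platform.
            return f"killed by unknown signal {signum}"
    return "success" if status == 0 else f"exited with error code {status}"


for status in (0, 1, 137):
    print(status, describe_exit(status))
```

This only tells you that the job was killed, not who killed it; the scheduler's own records are still the place to find out which limit was hit.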


  • GenoMax
    replied
    What are the hardware specs for the server you are running this on? Are you using a cluster or a stand-alone server?


  • cascoamarillo
    replied
    here they are:
    Code:
    Setting up ENV on cluster5-01.bpcservers.private for task hgapAlignForCorrection_001of006
    #!/bin/bash
    # Setting up SMRTpipe environment
    echo "Setting up ENV on $(uname -n)" for task hgapAlignForCorrection_001of006
    
    SEYMOUR_HOME=/smrtanalysis/install/smrtanalysis_2.3.0.140936
    source $SEYMOUR_HOME/etc/setup.sh
    
    # Create the local TMP dir if it doesn't exist
    tmp_dir=$(readlink -m "/smrtanalysis/tmpdir")
    if [ ! -e "$tmp_dir" ]; then
       stat=0
       mkdir -p $tmp_dir || stat=$?
       if [[ $stat -ne 0 ]]; then
           echo "SMRTpipe Unable to create TMP dir '/smrtanalysis/tmpdir' on $(uname -n)" 1>&2
           exit 1
       else
           echo "successfully created or found TMP dir '/smrtanalysis/tmpdir'"
       fi
    elif [[ ! -d "$tmp_dir" ]]; then
       echo "SMRTpipe TMP /smrtanalysis/tmpdir must be a directory on $(uname -n)" 1>&2
       exit 1
    fi
    
    ########### TASK metadata #############
    # Task            : hgapAlignForCorrection_001of006
    # Module          : P_PreAssemblerDagcon
    # Module Version  : 2.1.124285
    # TaskType        : None
    # URL             : task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006
    # createdAt       : 2015-04-28 17:47:34.515890
    # createdAt (UTC) : 2015-04-28 21:47:34.515909
    # ncmds           : 2
    # LogPath         : /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log
    # Script Path     : /smrtanalysis/userdata/jobs/016/016450/workflow/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.sh
    
    # Input       : /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta
    # Input       : /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta
    # Output      : /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4
    # Output      : /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4.fofn
    #
    ########### END TASK metadata #############
    
    cd /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon
    # Writing to log file
    cat /smrtanalysis/userdata/jobs/016/016450/workflow/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.sh >> /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log;
    
    
    
    echo "Running task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 on $(uname -a)"
    
    echo "Started on $(date -u)"
    echo 'Validating existence of Input Files'
    if [ -e /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta ]
    then
    echo 'Successfully found /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta'
    else
    echo 'WARNING: Unable to find necessary input file, or dir /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta.'
    fi
    if [ -e /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta ]
    then
    echo 'Successfully found /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta'
    else
    echo 'WARNING: Unable to find necessary input file, or dir /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta.'
    fi
    echo 'Successfully validated input files'
    
    # Task hgapAlignForCorrection_001of006 commands:
    
    
    # Completed writing Task hgapAlignForCorrection_001of006 commands
    
    
    # Task 1
    blasr /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta -out /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4 -m 4 -nproc 1 -bestn 10 -nCandidates 10 -noSplitSubreads -minReadLength 200 -maxScore -1000 -maxLCPLength 16 || exit $?
    echo "Task 1 completed at $(date)"
    # Task 2
    echo /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4 > /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4.fofn || exit $?
    echo "Task 2 completed at $(date)"
    
    
    
    rcode=$?
    echo "Finished on $(date -u)"
    echo "Task hgapAlignForCorrection_001of006 with nproc 1 with exit code ${rcode}."
    exit ${rcode}
    Running task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 on Linux cluster5-01.bpcservers.private 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
    Started on Wed Apr 29 01:52:18 UTC 2015
    Validating existence of Input Files
    Successfully found /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta
    Successfully found /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta
    Successfully validated input files
    [INFO] 2015-04-28T21:52:18 [blasr] started.
    # Writing stdout and stderr from Popen:
    Your job 19310 ("Phga016450") has been submitted
    Job 19310 exited because of signal SIGKILL
    SIGKILL??
    Thanks


  • gconcepcion
    replied
    Originally posted by cascoamarillo View Post
    Hi

    So I have 10 SMRT cells that I've been playing with. But de novo assembly with protocols RS_HGAP_Assembly.2 and .3 on SMRT Portal shows an error message during the process. Using smrtanalysis_2.3.0.140936.run with smrtanalysis-patch_2.3.0.140936.p3.run.

    Code:
    [INFO] 2015-04-28 22:02:21,943 [smrtpipe.status refreshTargets 409] Workflow Completion Status 139/212 in ( ...... 65%) tasks completed.
    [ERROR] 2015-04-29 09:52:35,198 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_006of006
    [ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_004of006
    [ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_005of006
    [ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_003of006
    [ERROR] 2015-04-29 09:52:35,200 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006
    [ERROR] 2015-04-29 09:52:35,200 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_002of006
    [INFO] 2015-04-29 09:52:35,212 [smrtpipe.status execute 627] Found 6 failed tasks.
    [INFO] 2015-04-29 09:52:35,213 [smrtpipe.status execute 629] task hgapAlignForCorrection_004of006 FAILED
    [INFO] 2015-04-29 09:52:35,213 [smrtpipe.status execute 629] task hgapAlignForCorrection_003of006 FAILED
    [INFO] 2015-04-29 09:52:35,214 [smrtpipe.status execute 629] task hgapAlignForCorrection_006of006 FAILED
    [INFO] 2015-04-29 09:52:35,214 [smrtpipe.status execute 629] task hgapAlignForCorrection_001of006 FAILED
    [INFO] 2015-04-29 09:52:35,215 [smrtpipe.status execute 629] task hgapAlignForCorrection_002of006 FAILED
    [INFO] 2015-04-29 09:52:35,215 [smrtpipe.status execute 629] task hgapAlignForCorrection_005of006 FAILED
    [ERROR] 2015-04-29 09:53:06,302 [SMRTpipe.SmrtPipeMain run 608] SmrtExit task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_004of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_003of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_006of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_002of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_005of006 Failed
    That's the part of the log file where the errors start showing up. Not sure where the problem is.
    Could it be due to a genome size limitation? I'm working with a eukaryote genome (ca. 200 Mb).
    Thanks
    At this point the failure is in the alignment-for-correction phase, which runs before the genome size setting is used to restrict the number of reads going into the assembly. The failure is therefore unrelated to the genome size setting, though that will come into play later.

    Can you post the contents of:

    [JOB_DIR]/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log
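
When many tasks fail at once, it can be handy to list them and their per-task log paths automatically. The sketch below assumes the "*** Failed task task://<job>/<module>/<task>" line format shown in the quoted log and the [JOB_DIR]/log/<module>/<task>.log layout from the path above; the function names are made up for illustration.

```python
import re

# Pattern based on the "*** Failed task" lines quoted in this thread.
FAILED = re.compile(r"\*\*\* Failed task task://(\d+)/([^/\s]+)/(\S+)")


def failed_tasks(log_lines):
    """Return sorted, de-duplicated (job_id, module, task) tuples."""
    found = set()
    for line in log_lines:
        match = FAILED.search(line)
        if match:
            found.add(match.groups())
    return sorted(found)


def task_log_path(job_dir, module, task):
    """Build the per-task log path, following the layout seen in the thread."""
    return f"{job_dir}/log/{module}/{task}.log"


# Two lines copied from the quoted smrtpipe log, plus one non-matching line.
sample = [
    "[ERROR] 2015-04-29 09:52:35,198 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_006of006",
    "[ERROR] 2015-04-29 09:52:35,200 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006",
    "[INFO] 2015-04-29 09:52:35,212 [smrtpipe.status execute 627] Found 6 failed tasks.",
]

tasks = failed_tasks(sample)
paths = [task_log_path("[JOB_DIR]", module, task) for _, module, task in tasks]
for path in paths:
    print(path)
```

Replace "[JOB_DIR]" with the real job directory and feed the function the lines of the main smrtpipe log.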
