My AllPaths-LG job exits with an error status in the queue, but the log file is empty and contains no error messages. Could somebody help me solve this issue? Thanks in advance.
My PBS script:
#!/bin/bash
#PBS -l walltime=48:00:00
#PBS -N 268_allpaths
#PBS -q workq
#PBS -l select=40:ncpus=16:mpiprocs=16
#PBS -l place=scatter:excl
#PBS -V
# comment begins with # followed by space......[IMPORTANT]
# Go to the directory from which you submitted the job
# cd $PBS_O_WORKDIR
module load all_paths-2.2
# path of all_paths
# path : /app/allpathslg
module load openmpi-1.6.4
# ulimit (stack) is needed by the allpaths program
ulimit -s 100000
# prepare data for allpaths:
PrepareAllPathsInput\
DATA_DIR=$PWD/scratch/268_allpaths\
PLOIDY=1\
IN_GROUPS_CSV=/scratch/268_allpaths/in_groups.csv\
IN_LIBS_CSV=/scratch/268_allpaths/in_libs.csv\
OVERWRITE=True\
| tee prepare.out
# Assemble data:
allpathslg\
PRE=$PWD\
DATA_SUBDIR=data\
RUN=run\
SUBDIR=test\
OVERWRITE=True\
| tee -a assemble.out
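A note on the line continuations in the script above: as pasted, each backslash directly follows the last character of its line and the next line starts in column one. If the real script looks the same (the indentation may simply have been lost when posting), bash joins those lines into one single word, so the command name and all of its arguments become one unknown command. A minimal sketch of the effect, using a hypothetical helper `count_args` that only prints how many arguments it received:

```shell
#!/bin/bash
# count_args is a hypothetical helper for illustration only:
# it prints the number of arguments it was called with.
count_args() { echo "$#"; }

# No space before the backslash: the two lines are joined into ONE word.
no_space=$(count_args PLOIDY=1\
OVERWRITE=True)

# A space before the backslash keeps the arguments separate.
with_space=$(count_args PLOIDY=1 \
OVERWRITE=True)

echo "without space: $no_space argument(s); with space: $with_space argument(s)"
```

Two other things in the same script may be worth checking, both visible in the post: DATA_DIR is built as $PWD/scratch/268_allpaths while the CSV files are given as /scratch/268_allpaths/..., and the `cd $PBS_O_WORKDIR` line is commented out, so $PWD may not be the directory the job was submitted from. A later post in this thread also refers to the preparation script as PrepareAllPathsInputs.pl, which differs from the command name used here.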
My CSV files:
in_groups.csv
file_name, library_name, group_name
/scratch/268_allpaths/SO_2511_268_R1.fastq, illumina, frags
/scratch/268_allpaths/SO_2511_268_R2.fastq, illumina, frags
/scratch/268_allpaths/SO_2511_268_R1.fastq.gz, illumina_short, jumping
/scratch/268_allpaths/SO_2511_268_R2.fastq.gz, illumina_short, jumping
in_libs.csv
library_name,project_name,organism_name,paired,frag_size,insert_size,read_orientation,genomic_start,genomic_end
illumina,testassembly,268,1,300bp,480bp,inward,0,0
illumina,testassembly,268,1,300bp,480bp,inward,0,0
illumina,testassembly,268,1,300bp,2k,outward,0,0
illumina,textassembly,268,1,300bp,2k,outward,0,0
This is my command line, but my assembly still does not run. Thanks a lot for any help.
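One thing in the two CSV files above can be checked mechanically: in_groups.csv refers to a library called illumina_short, while every row of in_libs.csv is named illumina. Assuming that each library_name in in_groups.csv needs a matching row in in_libs.csv, the sketch below lists the names that have no match. `missing_libs` is a hypothetical helper, not part of AllPaths-LG, and the sample files are shortened copies of the ones in the post.

```shell
#!/bin/bash
# missing_libs is a hypothetical helper, not part of AllPaths-LG.
# Usage: missing_libs IN_GROUPS_CSV IN_LIBS_CSV
# Prints each library_name used in IN_GROUPS_CSV that has no row in IN_LIBS_CSV.
missing_libs() {
    awk -F',' '
        function trim(s) { gsub(/^[ \t]+|[ \t\r]+$/, "", s); return s }
        FNR == 1 {                      # header row: find the library_name column
            col = 0
            for (i = 1; i <= NF; i++)
                if (trim($i) == "library_name") col = i
            next
        }
        /^[ \t\r]*$/ { next }           # skip blank lines
        col == 0 { next }               # this file has no library_name column
        NR == FNR { known[trim($col)] = 1; next }   # first file read: in_libs.csv
        {
            name = trim($col)           # second file read: in_groups.csv
            if (!(name in known) && !(name in reported)) {
                reported[name] = 1
                print name
            }
        }
    ' "$2" "$1"
}

# Shortened sample files modelled on the post.
tmp=$(mktemp -d)
cat > "$tmp/in_groups.csv" <<'EOF'
file_name, library_name, group_name
/scratch/268_allpaths/SO_2511_268_R1.fastq, illumina, frags
/scratch/268_allpaths/SO_2511_268_R1.fastq.gz, illumina_short, jumping
EOF
cat > "$tmp/in_libs.csv" <<'EOF'
library_name,project_name,organism_name
illumina,testassembly,268
EOF

missing_libs "$tmp/in_groups.csv" "$tmp/in_libs.csv"   # prints: illumina_short
rm -rf "$tmp"
```

The column is located by its header name rather than by position, because the two files in the post list their columns in different orders.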
-
Actually, I sent a message to Dr. Lander and Dr. Jaffe, the authors of AllPaths-LG, asking whether my data (PacBio and IonTorrent sequences from an enrichment culture) are suitable for this tool and how I should run it.
David Jaffe replied. He said that, unfortunately, my data are not supported by their program and that I should try another tool. He also suggested that I join the AllPaths Google Groups forum, where I might find more information.
I sent a request to join the group 3 weeks ago, but I have not received a reply yet.
So I gave up on AllPaths-LG, and I'm now trying the IDBA_UD and MIRA assemblers on my data. The assemblies are still running.
My conclusion is that AllPaths-LG doesn't support hybrid assembly of PacBio + IonTorrent reads.
Please let me know if you know of any other programs that could work for this type of data.
-
Originally posted by rdlady: I have the same problem when I run AllPaths-LG. I'm trying to assemble PacBio sequences, and when I run the script PrepareAllPathsInputs.pl it reports these warnings and errors:
==================== WARNINGS ====================
!!!! No 'frag' cached read groups found.
YOU CAN'T RUN AND ASSEMBLY WITHOUT FRAGMENT READS!
Remember, fragment libraries must have empty 'insert_size' and 'insert_stddev' in the 'in_libs.csv'.
!!!! No 'jump' cached read groups found.
YOU CAN'T RUN AND ASSEMBLY WITHOUT JUMPING READS!
Remember, jumping libraries must have 'insert_size' and 'insert_stddev' defined in the 'in_libs.csv'.
!!!! No 'long_jump' cached read groups found.
Long jumping reads (typically 40 kb, < 1x coverage) are useful only for scaffolding of vertebrate size genomes, and are not required for an assembly.
==================================================
---- 2013-04-29 11:02:08 (CTAPI): Creating '/scratch/hpc/raquel/allpaths/data/ploidy' with PLOIDY = 1.
**** Can't find '/scratch/hpc/raquel/allpaths/data/frag_reads_orig.fastb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/frag_reads_orig.qualb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/frag_reads_orig.pairs'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/jump_reads_orig.fastb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/jump_reads_orig.qualb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/jump_reads_orig.pairs'. You can't run an assembly without this file.
**** 2013-04-29 11:02:08 (CTAPI): Found 6 errors.
---- 2013-04-29 11:02:08 (CTAPI): Done.
---- 2013-04-29 11:02:08 (PAPI): Done.
I know this might be a late response, but I believe that there is a problem with your in_libs.csv file. Have you got it to work since?
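The warnings quoted above state the rule that in_libs.csv has to follow: fragment libraries must have empty 'insert_size' and 'insert_stddev', and jumping libraries must have both defined. A rough sketch of a check based only on that rule is shown below. `classify_libs` is a hypothetical helper, not part of AllPaths-LG, and the library names and sizes in the sample file are made up for illustration. It also flags values that are not plain numbers, on the assumption that sizes should be given as integers.

```shell
#!/bin/bash
# classify_libs is a hypothetical helper, not part of AllPaths-LG.
# Usage: classify_libs IN_LIBS_CSV
# Labels each library from its insert_size field, following the rule in the
# warning text: empty means fragment, a number means jumping.
classify_libs() {
    awk -F',' '
        function trim(s) { gsub(/^[ \t]+|[ \t\r]+$/, "", s); return s }
        NR == 1 {                       # header row: map column names to positions
            for (i = 1; i <= NF; i++) h[trim($i)] = i
            if (!("library_name" in h) || !("insert_size" in h)) {
                print "header lacks library_name or insert_size"
                exit 1
            }
            next
        }
        /^[ \t\r]*$/ { next }           # skip blank lines
        {
            name = trim($(h["library_name"]))
            size = trim($(h["insert_size"]))
            if (size == "")              kind = "fragment"
            else if (size ~ /^[0-9]+$/)  kind = "jumping"
            else                         kind = "check insert_size (" size ")"
            print name ": " kind
        }
    ' "$1"
}

# Made-up sample file for illustration.
tmp=$(mktemp -d)
cat > "$tmp/in_libs.csv" <<'EOF'
library_name,insert_size,insert_stddev
frag_lib,,
jump_lib,3000,500
odd_lib,2k,
EOF

classify_libs "$tmp/in_libs.csv"
rm -rf "$tmp"
```

If no library comes out as "fragment" or none as "jumping", that would be consistent with the "No 'frag' cached read groups found" and "No 'jump' cached read groups found" warnings.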
-
Originally posted by liviu: I ran AllPaths-LG on staph.tar.gz. ./prepare.sh went fine, but ./assemble.sh returned:
Unable to find optional long jumping reads (>20kb) for scaffolding.
Unable to find optional long reads for patching. [These are optional, I don't think this is the main issue]
I'm a newbie. Did I pick the wrong data?
Thanks!!
==================== WARNINGS ====================
!!!! No 'frag' cached read groups found.
YOU CAN'T RUN AND ASSEMBLY WITHOUT FRAGMENT READS!
Remember, fragment libraries must have empty 'insert_size' and 'insert_stddev' in the 'in_libs.csv'.
!!!! No 'jump' cached read groups found.
YOU CAN'T RUN AND ASSEMBLY WITHOUT JUMPING READS!
Remember, jumping libraries must have 'insert_size' and 'insert_stddev' defined in the 'in_libs.csv'.
!!!! No 'long_jump' cached read groups found.
Long jumping reads (typically 40 kb, < 1x coverage) are useful only for scaffolding of vertebrate size genomes, and are not required for an assembly.
==================================================
---- 2013-04-29 11:02:08 (CTAPI): Creating '/scratch/hpc/raquel/allpaths/data/ploidy' with PLOIDY = 1.
**** Can't find '/scratch/hpc/raquel/allpaths/data/frag_reads_orig.fastb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/frag_reads_orig.qualb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/frag_reads_orig.pairs'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/jump_reads_orig.fastb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/jump_reads_orig.qualb'. You can't run an assembly without this file.
**** Can't find '/scratch/hpc/raquel/allpaths/data/jump_reads_orig.pairs'. You can't run an assembly without this file.
**** 2013-04-29 11:02:08 (CTAPI): Found 6 errors.
---- 2013-04-29 11:02:08 (CTAPI): Done.
---- 2013-04-29 11:02:08 (PAPI): Done.
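The six "Can't find" errors in the log above name the files that have to exist in the data directory before the assembly can start. A small sketch that checks for them up front, so a job fails fast instead of sitting in the queue, might look like this. `check_inputs` is a hypothetical helper, not part of AllPaths-LG; the file names are taken from the log.

```shell
#!/bin/bash
# check_inputs is a hypothetical helper, not part of AllPaths-LG.
# Usage: check_inputs DATA_DIR
# Reports which of the six files named in the log are absent from DATA_DIR
# and returns non-zero if any is missing.
check_inputs() {
    local data_dir=$1 missing=0 kind ext f
    for kind in frag jump; do
        for ext in fastb qualb pairs; do
            f="$data_dir/${kind}_reads_orig.$ext"
            if [ ! -e "$f" ]; then
                echo "missing: $f"
                missing=$((missing + 1))
            fi
        done
    done
    [ "$missing" -eq 0 ]
}

# Demonstration on a scratch directory that holds only the fragment files.
tmp=$(mktemp -d)
touch "$tmp/frag_reads_orig.fastb" "$tmp/frag_reads_orig.qualb" "$tmp/frag_reads_orig.pairs"
check_inputs "$tmp" || echo "input preparation is incomplete"
rm -rf "$tmp"
```

In a job script, the same call could be placed between the preparation step and the assembly step, with the data directory passed as the argument.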
-
Is this the data from GAGE (http://gage.cbcb.umd.edu/data/index.html)? There is no 20 kb library in that data set, and as you can see, it's optional anyway.
What kind of scripts are you using, and where are they from? If assemble.sh didn't work, the cause is something other than the message you posted.
-
Is there a configuration file? If so, does it refer to this 20 kb library?
-
AllPaths-LG error
I ran AllPaths-LG on staph.tar.gz. ./prepare.sh went fine, but ./assemble.sh returned:
Unable to find optional long jumping reads (>20kb) for scaffolding.
Unable to find optional long reads for patching. [These are optional, I don't think this is the main issue]
I'm a newbie. Did I pick the wrong data?
Thanks!!
Last edited by liviu; 05-31-2012, 07:23 AM.