
**Himalaya** · 10-08-2011, 09:32 PM

Originally posted by [email protected]

I tried using the velvet formula for my RAM calculation and about 500 GB is the answer for human-sized genomes. Is that your experience too? Also, approximately how many days would it take? The HPC environment here has a max walltime of 96 hrs.

Another question is regarding trimming: did you use any specific software or write your own scripts?

Firstly, I did not use a human-sized genome; I was doing a metagenomic assembly of three known bacterial genomes. Secondly, my computing time varied because I was not the only one using the HPC. If you are asking about quality trimming of sequence reads (Illumina or 454 pyrosequencing), I have seen people suggest seqtrim, prinseq and clean_reads. I did not do any quality trimming during my comparative study. I have also developed my own script for quality trimming; when I compared it with prinseq and clean_reads, it performed better than them overall. Sorry, I can't give out the script now as I am going to publish it. I will post a notification after publication.

**vyellapa** · 10-15-2011, 10:28 PM

Originally posted by Himalaya

Firstly, I did not use a human-sized genome; I was doing a metagenomic assembly of three known bacterial genomes. Secondly, my computing time varied because I was not the only one using the HPC. If you are asking about quality trimming of sequence reads (Illumina or 454 pyrosequencing), I have seen people suggest seqtrim, prinseq and clean_reads. I did not do any quality trimming during my comparative study. I have also developed my own script for quality trimming; when I compared it with prinseq and clean_reads, it performed better than them overall. Sorry, I can't give out the script now as I am going to publish it. I will post a notification after publication.

Thank you Himalaya!

**tonybolger** · 10-16-2011, 10:58 PM

Originally posted by [email protected]

I tried using the velvet formula for my RAM calculation and about 500 GB is the answer for human-sized genomes. Is that your experience too?

Memory consumption is massively impacted by read quality - trimming is strongly recommended to reduce the pain.
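
For reference, the "velvet formula" in the question is usually quoted from the velvet-users mailing list as a linear fit for velvetg memory. The sketch below assumes that commonly cited form: the coefficients come from that mailing-list post, not from this thread, and the example inputs are made up. The result is in KB, with genome size in Mb and read count in millions.

```bash
#!/bin/bash
# Rough velvetg RAM estimate in KB, using the commonly quoted mailing-list fit.
# Assumption: the coefficients are taken from that post, not from this thread.
#   $1 read length (bases)          $2 genome size (Mb)
#   $3 number of reads (millions)   $4 k-mer length
velvet_ram_kb() {
    local read_len=$1 genome_mb=$2 reads_m=$3 k=$4
    echo $(( -109635 + 18977 * read_len + 86326 * genome_mb + 233353 * reads_m - 51092 * k ))
}

# Hypothetical human-scale input: 100 bp reads, 3000 Mb genome, 1000 M reads, k=31
kb=$(velvet_ram_kb 100 3000 1000 31)
echo "estimated RAM: $(( kb / 1000000 )) GB"
```

With these made-up inputs the estimate comes out a little under 500 GB, in line with the figure quoted above. The fit says nothing about read quality, which is why trimming can still change the real number a lot.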

Originally posted by [email protected]

Also, approximately how many days would it take? The HPC environment here has a max walltime of 96 hrs.

Depends a lot on the hardware. For me, SOAP takes about 2 days, but it can be split into separate steps. I don't have experience of velvet with full-scale genomes, but I suspect you'll go well over that time.

Originally posted by [email protected]

Another question is regarding trimming: did you use any specific software or write your own scripts?

I wrote this bad boy to do what I needed.

**erhuangzi** · 04-07-2012, 12:15 AM

Originally posted by acopeland

I won't make any claim about allpaths-lg being universally runnable, but I can say that we have been using it quite successfully on microbial and fungal projects.

To get workable binaries you will need gcc-4.3.2 or newer and boost-1.38 or newer (building these can be a bear). I can post configure commands if that's useful to anyone.

Finally, I strongly encourage anyone testing allpaths-lg to download the test data supplied by the Broad (ftp://ftp.broadinstitute.org/pub/crd....genome.tar.gz) and get this working before running your own data.

I also have this problem, and my gcc is gcc-4.5.2. When I run allpaths-lg using the test data, I get:
/bin/sh: PrepareAllPathsInput: not found
make: ********* Error 127
Can you help me? I want to run this software.
Thanks
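
Exit status 127 is what the shell returns for "command not found", so the error above suggests that `PrepareAllPathsInput` was not on `PATH` when make ran. A minimal check, as a sketch: the install directory `$HOME/allpathslg/bin` is a hypothetical placeholder, not something stated in the thread.

```bash
#!/bin/bash
# Report whether each named command can be found on PATH.
# Returns 0 if all were found, 1 if any is missing.
check_cmds() {
    local missing=0 cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical install location; adjust to where the binaries were built.
ALLPATHS_BIN="$HOME/allpathslg/bin"
export PATH="$ALLPATHS_BIN:$PATH"

check_cmds PrepareAllPathsInput RunAllPathsLG \
    || echo "add the ALLPATHS-LG bin directory to PATH before running make"
```

If a name is reported missing even with the right directory on `PATH`, the build probably did not install that program under that exact name.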

**rdlady** · 04-29-2013, 07:27 AM

Originally posted by francesco.vezzi

Hi,
in the end it seems that nobody is able to run ALLPATHS. Is that true?

F.

I'm running it with these options:
RunAllPathsLG PRE=<my dir> REFERENCE_NAME=refs DATA_SUBDIR=data RUN=output OVERWRITE=TRUE USE_LONG_JUMPS=False REFERENCE_FASTA=refs/reference.fasta

And I get this error:

Error: file /scratch/hpc/raquel/allpaths/refs/data/frag_reads_orig.fastb is supposed to already exist, but doesn't.
ForceAssert(IsRegularFile( *it )) at system/MiscUtil.cc:996 failed in function
int MakeMgr::RunMake(int)

Mon Apr 29 11:24:19 2013. Abort. Stopping.

Generating a backtrace...

Dump of stack:

0. CRD::exit(int), in Exit.cc:49
1. yes, in Assert.h:52
2. main, in RunAllPathsLG.cc:3134
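
The path in the error has the shape `PRE/REFERENCE_NAME/DATA_SUBDIR/frag_reads_orig.fastb`, so RunAllPathsLG expects prepared input to already be in that directory. A small pre-flight check, as a sketch: only `frag_reads_orig.fastb` is named in the error above, the value of `PRE` is inferred from that path, and any further file names passed in are the caller's own assumption.

```bash
#!/bin/bash
# Check that expected input files exist under PRE/REFERENCE_NAME/DATA_SUBDIR.
# Usage: check_inputs PRE REFERENCE_NAME DATA_SUBDIR file...
# Returns 0 if every file exists, 1 otherwise.
check_inputs() {
    local data_dir="$1/$2/$3" status=0 f
    shift 3
    for f in "$@"; do
        if [ -f "$data_dir/$f" ]; then
            echo "ok:      $data_dir/$f"
        else
            echo "missing: $data_dir/$f"
            status=1
        fi
    done
    return "$status"
}

# Values from the post above; PRE is inferred from the path in the error.
check_inputs /scratch/hpc/raquel/allpaths refs data frag_reads_orig.fastb \
    || echo "prepare the input files in the data directory before running RunAllPathsLG"
```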

**madhubioinfo** · 06-01-2014, 05:21 AM

My allpaths job shows state "E" (exiting) in the queue, and there are no error messages in the log file.

My PBS script:
#!/bin/bash
#PBS -l walltime=48:00:00
#PBS -N 268_allpaths
#PBS -q workq
#PBS -l select=40:ncpus=16:mpiprocs=16
#PBS -l place=scatter:excl
#PBS -V

# comment begins with # followed by space......[IMPORTANT]
# Go to the directory from which you submitted the job
# cd $PBS_O_WORKDIR

module load all_paths-2.2
# path of all_paths
# path : /app/allpathslg
module load openmpi-1.6.4

# ulimit (stack) is needed by the allpaths program
ulimit -s 100000

# prepare data for allpaths:
PrepareAllPathsInput\
DATA_DIR=$PWD/scratch/268_allpaths\
PLOIDY=1\
IN_GROUPS_CSV=/scratch/268_allpaths/in_groups.csv\
IN_LIBS_CSV=/scratch/268_allpaths/in_libs.csv\
OVERWRITE=True\
| tee prepare.out

# Assemble data:
allpathslg\
PRE=$PWD\
DATA_SUBDIR=data\
RUN=run\
SUBDIR=test\
OVERWRITE=True\
| tee -a assemble.out
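
One thing worth checking in a script like this: a backslash at the end of a line joins the next line directly, with no space added. If the script really has no whitespace before each `\` (the forum may have stripped it), the command name and its arguments would fuse into a single word. A small demonstration with `printf`, using made-up words:

```bash
#!/bin/bash
# Backslash-newline is removed entirely, so the two lines form ONE word.
fused=$(printf '[%s]' foo\
bar)

# With a space before the backslash, the shell sees TWO words.
split=$(printf '[%s]' foo \
bar)

echo "$fused"   # [foobar]
echo "$split"   # [foo][bar]
```

Putting a space before every trailing backslash, or indenting the continuation lines, avoids the fused form.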

My CSV files:

in_groups.csv

file_name, library_name, group_name
/scratch/268_allpaths/SO_2511_268_R1.fastq, illumina, frags
/scratch/268_allpaths/SO_2511_268_R2.fastq, illumina, frags
/scratch/268_allpaths/SO_2511_268_R1.fastq.gz, illumina_short, jumping
/scratch/268_allpaths/SO_2511_268_R2.fastq.gz, illumina_short, jumping

in_libs.csv
library_name, project_name, organism_name, type, paired, frag_size, insert_size, read_orientation, genomic_start, genomic_end
illumina, test assembly, test, fragment, 1, 300bp, 480bp, inward, 0, 0
illumina, test assembly, test, fragment, 1, 300bp, 480bp, inward, 0, 0
illumina, test assembly, test, jumping, 1, , 2k, outward, 0, 0
illumina, test assembly, test, jumping, 1, , 2k, outward, 0, 0
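
The two CSV files are linked by `library_name`, so one quick consistency check is to list any library used in in_groups.csv that has no row in in_libs.csv. A sketch using `awk`, assuming comma-separated files with a header row and optional spaces after the commas, as in the post:

```bash
#!/bin/bash
# Print library names that appear in the groups file but not in the libs file.
# Usage: missing_libs in_groups.csv in_libs.csv
missing_libs() {
    awk -F',' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        FNR == 1 { next }                       # skip the header row of each file
        NF < 2   { next }                       # skip blank or malformed lines
        NR == FNR { used[trim($2)] = 1; next }  # first file: library_name is column 2
        { defined[trim($1)] = 1 }               # second file: library_name is column 1
        END {
            for (lib in used)
                if (!(lib in defined)) print lib
        }
    ' "$1" "$2"
}
```

Run against the two files as posted (`missing_libs in_groups.csv in_libs.csv`), this reports `illumina_short`, which is used in in_groups.csv but never defined in in_libs.csv.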
