Originally posted by Himalaya
Firstly, I did not use a human-sized genome; I was doing a metagenomic assembly of three known bacterial genomes. Secondly, my computing time varied because I was not the only one using the HPC. As for quality trimming of sequence reads (Illumina or 454 pyrosequencing), I have seen people suggest seqtrim, prinseq and clean_reads. I did not do any quality trimming during my comparative study. I have also developed my own script for quality trimming; compared with prinseq and clean_reads, it performs better overall. Sorry, I can't share the script now because I am going to publish it. I will post a notice here once it is published.
-
Originally posted by [email protected]
I tried using the Velvet formula for my RAM calculation, and the answer for human-sized genomes is about 500 GB. Is that your experience too?
Also, approximately how many days would it take? The HPC environment here has a max walltime of 96 hrs.
Another question is about trimming: did you use any specific software, or did you write your own scripts?
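For anyone who wants to repeat the RAM estimate, here is a rough sketch in shell. It assumes the empirical formula that has circulated on the velvet-users mailing list (result in KB, read length in bp, genome size in Mbp, read count in millions, k-mer length K). The coefficients, the helper name `velvet_ram_kb` and the example inputs are assumptions, not values taken from this thread.

```shell
# Rough Velvet RAM estimate in KB (assumed empirical formula, see note above).
# Arguments: read length (bp), genome size (Mbp), reads (millions), k-mer length.
velvet_ram_kb() {
    awk -v rl="$1" -v gs="$2" -v reads="$3" -v k="$4" 'BEGIN {
        printf "%d\n", -109635 + 18977 * rl + 86326 * gs + 233353 * reads - 51092 * k
    }'
}
```

With hypothetical inputs for a human-sized project, `velvet_ram_kb 100 3000 1000 31`, the estimate lands a little under 500 GB, in line with the figure quoted above. Treat it as an order-of-magnitude guide only.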
-
Originally posted by acopeland
I won't make any claim about allpaths-lg being universally runnable, but I can say that we have been using it quite successfully on microbial and fungal projects.
To get workable binaries you will need gcc-4.3.2 or newer and boost-1.38 or newer (building these can be a bear). I can post configure commands if that's useful to anyone.
Finally, I strongly encourage anyone testing allpaths-lg to download the test data supplied by the Broad (ftp://ftp.broadinstitute.org/pub/crd....genome.tar.gz) and get this working before running your own data.
/bin/sh: PrepareAllPathsInput: not found
make:********* Error 127
Can you help me? I want to run this software.
Thanks
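Exit status 127 from /bin/sh means the shell could not find the command, so the first thing to check is whether the ALLPATHS-LG programs are on PATH. A minimal sketch follows; `require_tools` is a hypothetical helper name, and the program names are simply the ones that appear in this thread.

```shell
# Report which of the named programs cannot be found on PATH.
# Prints one line per missing program and returns non-zero if any is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not on PATH: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `require_tools PrepareAllPathsInput RunAllPathsLG` prints nothing when both are found. If either is reported, adding the install's bin directory to PATH (wherever that is on your system) before running make is the likely fix.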
-
Originally posted by francesco.vezzi
Hi,
in the end it seems that nobody is able to run ALLPATHS. Is that true?
F.
RunAllPathsLG PRE=<my dir. REFERENCE_NAME=refs DATA_SUBDIR=data RUN=output OVERWRITE=TRUE USE_LONG_JUMPS=False REFERENCE_FASTA=refs/reference.fasta
And I get this error:
Error: file /scratch/hpc/raquel/allpaths/refs/data/frag_reads_orig.fastb is supposed to already exist, but doesn't.
ForceAssert(IsRegularFile( *it )) at system/MiscUtil.cc:996 failed in function
int MakeMgr::RunMake(int)
Mon Apr 29 11:24:19 2013. Abort. Stopping.
Generating a backtrace...
Dump of stack:
0. CRD::exit(int), in Exit.cc:49
1. yes, in Assert.h:52
2. main, in RunAllPathsLG.cc:3134
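Judging from the path in the error, RunAllPathsLG looks for its input under PRE/REFERENCE_NAME/DATA_SUBDIR, so a quick pre-flight check can show whether the prepare step actually wrote its files there. This is only a sketch: `check_allpaths_inputs` is a hypothetical helper, and `frag_reads_orig.fastb` is the one file name taken from the message above; any others have to be passed in by hand.

```shell
# Check that expected input files exist under PRE/REFERENCE_NAME/DATA_SUBDIR.
# Arguments: PRE, REFERENCE_NAME, DATA_SUBDIR, then optional extra file names.
# Prints one line per missing file and returns non-zero if any is missing.
check_allpaths_inputs() {
    data_dir="$1/$2/$3"
    shift 3
    status=0
    for f in frag_reads_orig.fastb "$@"; do
        if [ ! -f "$data_dir/$f" ]; then
            echo "missing: $data_dir/$f"
            status=1
        fi
    done
    return "$status"
}
```

With the values from the error above, that would be `check_allpaths_inputs /scratch/hpc/raquel/allpaths refs data`. If the file is reported missing, the input preparation step probably has not been run into that directory yet.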
-
My ALLPATHS job goes to state "E" in the queue and exits, and the log file contains no error messages either.
My PBS script:
#!/bin/bash
#PBS -l walltime=48:00:00
#PBS -N 268_allpaths
#PBS -q workq
#PBS -l select=40:ncpus=16:mpiprocs=16
#PBS -l place=scatter:excl
#PBS -V
# comment begins with # followed by space......[IMPORTANT]
# Go to the directory from which you submitted the job
# cd $PBS_O_WORKDIR
module load all_paths-2.2
# path of all_paths
# path : /app/allpathslg
module load openmpi-1.6.4
# ulimit (stack) is needed by the allpaths program
ulimit -s 100000
# prepare data for allpaths:
PrepareAllPathsInput \
    DATA_DIR=$PWD/scratch/268_allpaths \
    PLOIDY=1 \
    IN_GROUPS_CSV=/scratch/268_allpaths/in_groups.csv \
    IN_LIBS_CSV=/scratch/268_allpaths/in_libs.csv \
    OVERWRITE=True \
    | tee prepare.out
# Assemble data:
allpathslg \
    PRE=$PWD \
    DATA_SUBDIR=data \
    RUN=run \
    SUBDIR=test \
    OVERWRITE=True \
    | tee -a assemble.out
My CSV files:
in_groups.csv
file_name, library_name, group_name
/scratch/268_allpaths/SO_2511_268_R1.fastq, illumina, frags
/scratch/268_allpaths/SO_2511_268_R2.fastq, illumina, frags
/scratch/268_allpaths/SO_2511_268_R1.fastq.gz, illumina_short, jumping
/scratch/268_allpaths/SO_2511_268_R2.fastq.gz, illumina_short, jumping
in_libs.csv
library_name, project_name, organism_name, type, paired, frag_size, insert_size, read_orientation, genomic_start, genomic_end
illumina, test assembly, test, fragment, 1, 300bp, 480bp, inward, 0, 0
illumina, test assembly, test, fragment, 1, 300bp, 480bp, inward, 0, 0
illumina, test assembly, test, jumping, 1, , 2k, outward, 0, 0
illumina, test assembly, test, jumping, 1, , 2k, outward, 0, 0
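One thing that stands out in the two files above: in_groups.csv refers to a library called `illumina_short`, while in_libs.csv only defines `illumina`, and defines it four times. As far as I understand, each library name used in in_groups.csv should match exactly one row in in_libs.csv. The sketch below cross-checks the two files; `check_library_names` is a hypothetical helper, and it assumes library_name is column 1 of in_libs.csv and column 2 of in_groups.csv, as in the headers shown.

```shell
# Cross-check library names between in_groups.csv and in_libs.csv.
# Arguments: path to in_groups.csv, path to in_libs.csv.
# Prints duplicated definitions first, then names that are used but undefined.
check_library_names() {
    groups_csv="$1"
    libs_csv="$2"
    awk -F',' '
        function trim(s) { gsub(/^[ \t]+|[ \t\r]+$/, "", s); return s }
        FNR == 1 { next }                        # skip the header row of each file
        /^[ \t\r]*$/ { next }                    # skip blank lines
        NR == FNR { defined[trim($1)]++; next }  # first file read: in_libs.csv
        { used[trim($2)] = 1 }                   # second file read: in_groups.csv
        END {
            for (name in defined)
                if (defined[name] > 1) print "duplicate in in_libs.csv: " name
            for (name in used)
                if (!(name in defined)) print "undefined in in_libs.csv: " name
        }
    ' "$libs_csv" "$groups_csv"
}
```

Run as `check_library_names in_groups.csv in_libs.csv`; empty output means every library used in the groups file is defined exactly once.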