hi, I just received an email about the new version 1.12. I installed it, started my basecalling again, and it appears to be working from the beginning now... I am hopeful.
-
Version sounds fine. Don't know about other questions.
Oh... I just saw westerman's reply. I'd run the latest version... where is that downloadable from?
-
Originally posted by joa_ds: hi everybody.
So my questions: gsRunProcessor 2.0.00.22 (Build 184) -> is that Titanium ok software?
Second question, is anything different for shotgun basecalling compared to amplicon basecalling?
-
hi everybody.
I am having a hard time keeping up with analysing the data those lab people keep generating... So I just need a quick answer to my question before I start losing time again figuring it out myself.
I configured the sequencer so that it automatically transfers everything after the image-processing step to our monster server to do basecalling.
This is our first shotgun experiment ever and also our first Titanium run, so I am a bit confused: I know Titanium has software updates, but I don't know whether I have already installed them on the server.
So my questions: gsRunProcessor 2.0.00.22 (Build 184) -> is that software OK for Titanium? Second question: is anything different for shotgun basecalling compared to amplicon basecalling?
I have the .cwf files here, and I was planning on just hitting 'runAnalysisPipe'. Will that work all right?
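(For illustration, a minimal sketch of the call being considered; the run directory path is a placeholder, and any MPI-related environment variables discussed further down this thread may also be needed:)
export PATH=${PATH}:/opt/454/bin
runAnalysisPipe /data/runs/R_XXXX_XX_XX_titanium_shotgun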
Everybody here keeps telling me I have to look out for the software versions when using Titanium, but I guess there is not much I can do wrong when all I need to do is basecalling, right? I have my own tools to process the FASTA/FASTQ files, so I think I should not worry, right?
greetings from belgium
-
Does the Newbler application 'runAssembly' work in an SGE environment? The Celera Assembler at least has some docs on this (although they look complex). I don't even know how to begin to submit my runAssembly to 'the cluster'.
AFAICT, we have several 8-core boxes with 16 GB of RAM each. I am trying to assemble 2 full runs of GS FLX Titanium (~1 billion bases per run, reads of ~400 bp).
The progress of the assembly seems to get slower and slower ... (or perhaps I'm getting increasingly impatient). I did get CA to run on this data, giving me an assembly in about 1 day (on one box).
Thanks for any hints,
Dan.
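(For anyone attempting this, a rough sketch of a whole-node SGE submission that simply runs runAssembly on one box; the parallel environment name 'smp', the memory request, and the sff file names are assumptions, not from this post:)
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -pe smp 8        # assumed PE name; reserves a full 8-core node
#$ -l mem_free=14G  # leave headroom on a 16 GB node
export PATH=${PATH}:/opt/454/bin
# sff file names are placeholders
runAssembly run1_reads.sff run2_reads.sff
(Submitted with something like 'qsub run_assembly.sh'. Whether runAssembly can spread work across nodes under SGE, rather than just occupying one of them, is exactly the open question in this post.)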
-
Originally posted by cdwan: I've been having 'fun' trying to get the Titanium off-rig analysis to work properly on a small Linux cluster running Sun Grid Engine. We've had limited success. [...] Anyone else?
First of all, I want to thank you for sharing your experience. I've been searching the web for information on how to set up the GS FLX Titanium software with Sun Grid Engine, and your post is the first concrete reference that I've found.
So, basically, I don't have experience with the software or with Sun Grid Engine, and I'm trying to set up an "off-rig" cluster with SGE (a tough challenge). The (would-be) cluster specs are:
* 1 head node with 8 cores, x86_64, 16 GB of RAM;
* 3 nodes with 4 cores, x86_64, 8 GB of RAM;
* CentOS 5.3;
* ~ 4 TB of storage.
The GS FLX Titanium software is already installed, on all nodes, with OpenMPI support.
It would be really great if you could share any information about how to set up Sun Grid Engine with this software: a tutorial, howtos, wikis, or even any *good* documentation about setting up SGE and its architecture would be excellent!
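(Since the SGE side comes up repeatedly, here is a rough sketch of the kind of parallel environment definition commonly used for OpenMPI jobs, created with 'qconf -ap'; the PE name and slot count are assumptions, not taken from this thread:)
pe_name            orte
slots              32
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
(Jobs would then be submitted with something like 'qsub -pe orte 8 job.sh'; SGE sets $NSLOTS and $PE_HOSTFILE for the job, and a small wrapper, as cdwan describes in this thread, can turn $PE_HOSTFILE into the $TMPDIR/machines file that GS_MPIARGS expects.)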
Regarding the NFS server lockup: you could try sending the system and kernel logs to a remote syslog and see if the (high) load triggers some sort of kernel panic, just a thought...
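(A minimal sketch of that remote-syslog idea on CentOS 5; the loghost name is a placeholder:)
# in /etc/syslog.conf on each compute node
*.info;kern.*    @loghost.example.org
# then: service syslog restart
(A true hard lockup may die before anything is flushed to the remote host, so treat this as a diagnostic aid rather than a guarantee.)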
If I can put the cluster together, I'll be happy to share our experiences
Thank you!
Best regards,
Joao
-
Titanium off-rig analysis
I've been having 'fun' trying to get the Titanium off-rig analysis to work properly on a small Linux cluster running Sun Grid Engine. We've had limited success.
I would be deeply grateful for anyone else's thoughts on this.
Here are some notes, in case they might help anyone else:
* The cluster consists of nine Linux servers running CentOS 5.
* Each machine has 8 x86_64 cores and 8 GB of RAM.
* All nodes are connected via gigabit ethernet to a 90TB NFS share.
* The cluster is in moderate use for BLAST and other standard bioinformatic processing, and has never seen lockups or crashes before.
Environment variables that seem important to runAnalysisPipe are:
* export GS_MPIARGS="--n $NSLOTS --machinefile $TMPDIR/machines"
* export GS_LAUNCH_MODE=MPI
* export PATH=${PATH}:/opt/454/bin
I'm very curious to try this "GS_CACHEDIR", but I don't know what it does.
Note that the lines above are from my SGE job submission script. $NSLOTS and $TMPDIR/machines are created by a wrapper script and get set up based on how I submit the job. $NSLOTS is how many parallel threads to start; the machines file is a list of hostnames to start them on.
I found the "--progress" and "--verbose" flags to be quite useful in figuring out if processing is making progress or not.
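(Put together, the submission script looks roughly like this; the parallel environment name and the run directory are placeholders, and the wrapper that writes $TMPDIR/machines is assumed to already exist as described above:)
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -pe mpi 8     # assumed PE name; provides $NSLOTS and, via the wrapper, $TMPDIR/machines
export PATH=${PATH}:/opt/454/bin
export GS_LAUNCH_MODE=MPI
export GS_MPIARGS="--n $NSLOTS --machinefile $TMPDIR/machines"
runAnalysisPipe --progress --verbose /data/runs/R_XXXX_XX_XX_example_run
(Submitted with 'qsub run_pipe.sh' or similar.)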
We also encountered the hard-lockup behavior. I still have no idea of the *cause* of these lockups, but we've managed to work around them. Here are my observations:
* OpenMPI jobs run on a single machine never finish, no matter how many threads I give them (1, 2, 4, 8, 16). I wave my hands in the direction of "16GB of RAM required".
* If I start 8 threads, four on each of two machines, those jobs run in a few hours.
* If I start more than 4 threads on any one machine, I have high odds of locking up (requiring a hard power cycle) at least one of the machines involved in that run.
* If I run threads from more than one job at a time on a particular machine, odds are high that I will lock up that machine.
* I can run 4 BLAST jobs and 4 threads of gsRunProcessor without too much contention on the same 8 core machine.
* gsRunProcessor leaves zombie processes all over the place when one of the compute nodes locks up during a run. I encounter fewer lockups if I clean those up prior to starting a run. This is made simple by the observation that I can't run two jobs on the same node anyway.
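(A crude cleanup sketch for those leftover processes; the node names are placeholders and passwordless ssh to the nodes is assumed:)
for node in node01 node02 node03; do
    ssh $node 'pkill -9 -f gsRunProcessor || true'
done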
There is some correlation of the node lock-ups with heavy loads on the NFS file server - but I have yet to encounter any smoking gun with this.
Anyone else?
-
Don't think this will help with your problems, but I thought I would post some notes from my experience getting MPI working on a single 8-core Red Hat system.
1. I have had persistent problems getting the pipeline to work with more than 6 cores. Running with >6 cores leads to a hard lockup, but I have not had time to track this down. I have done several runs with 6 cores; each takes ~12 hours to process.
2. I needed to add the following to .bashrc or similar:
ulimit -l unlimited
export RIGDIR="/opt/454"
export LD_LIBRARY_PATH=/usr/lib64/openmpi/1.2.5-gcc/lib/
export PATH=$PATH:/opt/454:/usr/lib/openmpi/1.2.5-gcc/bin
export GS_LAUNCH_MODE="MPI"
export GS_MPIARGS=" --n 6 "
export GS_XML_PORT=4540
export GS_CACHEDIR=/cache
Note: you may need to make some changes to /etc/security/limits.conf for the ulimit to work. You will know that this is a problem if runAnalysisPipe complains that it can only allocate 32k.
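(For reference, the kind of limits.conf entries that make 'ulimit -l unlimited' available to ordinary users; the wildcard domain is an assumption, adjust to your site:)
# in /etc/security/limits.conf
*    soft    memlock    unlimited
*    hard    memlock    unlimited
(Log in again, or restart whatever service launches the jobs, so the new limits are picked up.)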
The command to run the analysis then would be: runAnalysisPipe R_2008_11...
This will run with 6 processes (based on GS_MPIARGS).
Documentation on this was sparse, so I'm not sure it is canonically correct, but it seems to work.
Any other info on MPI for Titanium?
-
We have run into this problem as well. Basically, the processes get SIGTERM'd for us because the Linux kernel kills them off once they have consumed all of the system's memory... We also had aspirations of using Sun Grid Engine with a properly configured MPI parallel environment, but I don't think we can get there until direct MPI usage works reliably...
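(A quick, generic way to confirm that it is the kernel's out-of-memory handling rather than the 454 software itself, roughly:)
dmesg | grep -i 'out of memory'
grep -i oom /var/log/messages
(If those show the gsRunProcessor children being killed, the only workaround we know of is the one discussed elsewhere in this thread: fewer processes per node.)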
-
Titanium software (MPI mode)
Is anyone running the new Titanium software on an MPI cluster? I can get the software to run the verification dataset in 'multi' mode (i.e., using all of the CPUs on a single box), but when I try MPI mode, even on a single box, after a while the program SIGTERMs the child processes and then stops.
Advice or comments appreciated.
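(Nothing 454-specific, but one way to see whether memory is the limiting factor is simply to watch the box while the verification dataset runs, e.g.:)
# in a second terminal while the pipeline is running
watch -n 10 free -m
# or log it for later inspection
vmstat 30 >> runprocessor_mem.log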