Hi All,
I have a single lane of Illumina HiSeq 2000 101-bp paired-end reads on a single genome (hop--est. genome size 2.8 Gb), along with two RNA-Seq experiments (same conditions as the genome seq), that I'm attempting to assemble using Velvet (in advance, please don't criticize the use of Velvet...I will be using other assembly packages in the future). All three "experiments" were done on different genotypes. I've run just the genome sequence data, with all ambiguities removed, as "single" reads and have successfully completed runs with Velvet. The assembly took approximately 35 hours, but, as expected, assembly with these settings was not great (only 1/3 of the genome covered, with N50 = 270). I've also run the RNA-Seq experiments as "single" reads and have seen velvetg run to completion in a similar amount of time.
I have now processed all reads (removed all ambiguities and trimmed), pulled out the orphaned reads left over from paired-end read processing, and combined those orphans into a single fastq.gz file. On Tuesday (April 2nd) I submitted all the processed paired-end read files (as "shortPaired" reads) along with the orphaned-read files (as "short" reads) from the genome sequence and the two RNA-Seq experiments on a machine with 1000 GB of RAM, and velvetg has been running ever since (96 hours). Running the "top" command on the UNIX machine showed significant changes in the amount of RAM used through the early parts of the assembly, but for the last 2 days usage has held steady at 640 GB of RAM with one processor running at 100%.
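For reference, here is a minimal sketch of the velveth/velvetg invocation I mean by the above. The commands are only printed (a dry run), not executed; the file names, k-mer length (31), and insert length (300) are placeholders, not my actual values:

```shell
# Hedged sketch of the described Velvet run; all file names and numeric
# parameters below are hypothetical placeholders.
cmd_velveth='velveth asm_dir 31 -fastq.gz -shortPaired genome_pe_interleaved.fastq.gz -shortPaired2 rnaseq_pe_interleaved.fastq.gz -short orphans.fastq.gz'
cmd_velvetg='velvetg asm_dir -ins_length 300 -exp_cov auto -cov_cutoff auto'

# Print the commands rather than running them, so they can be checked first.
echo "$cmd_velveth"
echo "$cmd_velvetg"
```

Note that stock Velvet is compiled with only two read categories (short/short2, shortPaired/shortPaired2), so mixing the genome library and both RNA-Seq libraries as separate channels would require recompiling with a larger `CATEGORIES` value.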
My question is this: is this length of time normal for velvetg when assembling such a large dataset, or has velvetg run into an infinite loop and will never complete?