I am using the "BWA for SOLiD" tool on Galaxy. It calls for two inputs:
1) "Reference Genome": I am using mrna.fa.gz - Human mRNA from GenBank (from the website http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/). It is ~532MB.
2) "FASTQ file (Nucleotide-space recoded from color-space)": I am using a .fastq file of human transcriptome data. It is ~ 1.8GB.
I was told to go ahead and try running "BWA for SOLiD" with these inputs, but warned that it would most likely exceed the available resources and fail with a memory error.
I am wondering how I can prevent this (without resorting to cloud resources, etc.) and just use the normal public Galaxy platform. I have already reduced my .fastq file 10-fold from its original size (I randomly kept only 1 out of every 10 sequences) - see the sketch below for the kind of subsampling I mean.
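In case it helps to see what I did, here is a minimal sketch of that kind of random subsampling. It assumes an uncompressed FASTQ with standard 4-line records; the filenames and the 10% keep rate are just placeholders, not the exact script I ran.

```python
import random

# Keep roughly 1 out of every 10 reads from a FASTQ file.
# Assumes standard 4-line FASTQ records and an uncompressed input;
# filenames below are placeholders.
random.seed(42)  # fixed seed so the subsample is reproducible

with open("reads.fastq") as infile, open("reads_subsampled.fastq", "w") as outfile:
    while True:
        record = [infile.readline() for _ in range(4)]  # one FASTQ record
        if not record[0]:
            break  # end of file
        if random.random() < 0.1:  # keep ~10% of reads at random
            outfile.writelines(record)
```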
What is the most effective way for me to reduce the job's resource demands, and how can I do so without introducing additional bias? Should I further subsample my .fastq file by another 2- or 5-fold, or should I instead reduce my .fa reference, and if so, what is the best way to do that?
I am not too concerned about quality; this is for a quick course project, not for any publication!
I am also getting worried because it has already been ~2 hours since I submitted the "BWA for SOLiD" job to Galaxy and it is still "waiting to run", whereas the many other, smaller jobs I have run in the meantime never waited more than a few minutes to start. Approximately how long would such a job take on Galaxy, given the size of the inputs? I just don't know what to expect and am concerned about the time it will take.
Sorry for the long message. If you have any advice on any of these topics, I would be glad to hear it!
1) "Reference Genome": I am using mrna.fa.gz - Human mRNA from GenBank (from the website http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/). It is ~532MB.
2) "FASTQ file (Nucleotide-space recoded from color-space)": I am using a .fastq file of human transcriptome data. It is ~ 1.8GB.
I was told to go ahead and try running "BWA for SOLiD" with these inputs, but that it would most likely exceed resources with a memory error.
I am wondering how I can prevent this (without having to reference cloud resources, etc), and just use the normal Galaxy platform. I have already reduced my .fastq file from its original size by 10-fold (I randomly kept only 1 out of every 10 sequences).
What is the most effective way for me to reduce the process? And how can I do so without introducing more biases? Should I further reduce my .fastq files by another 2 or 5 fold etc.? Or should I reduce my .fa file, and if so, what is the ideal way to accomplish this?
I am not concerned about quality. This is for a quick course project - not for any publication! )
I am feeling concerned because already, it has been ~2 hours since I submitted the "BWA for SOLiD" job to Galaxy, and it is still "waiting to run", whereas I have since run many other smaller jobs, and have never had to wait for my job to begin on Galaxy, except for a few minutes. Approximately how long would such a job take on Galaxy, given the size of the inputs? I just don't know what to expect, and am feeling concerned about time issues....
Sorry for a long message. If you have any advice on any of the topics, I would be glad to hear them!