
**dblyons** · 04-19-2011, 09:51 PM

What will you be doing with NGS data? I'm curious because I am interested in the intersection of these disciplines, too, and am happy to share the limited knowledge I have.
- Dave

**Simon Anders** · 04-19-2011, 10:45 PM

Allow me to curb your enthusiasm with a warning: From your point of view, there are two types of HTS projects:

(a) You want to do something other people have done before and published about, and for which established, well-documented workflows and ready-to-use tools exist. Then, you "just" need to learn enough command line fu to be able to follow these workflows. In my experience, this takes the average "not computer-science" person a couple of months, provided they have somebody around to ask.

(b) You realize that by just doing what others have done before you won't get that PNAS paper, and hence want to ask a novel kind of question that requires a custom kind of analysis.

Judging from the questions I get about our HTS analysis tool (DESeq), the majority of these projects work as follows:

Step 1: The PI has a great idea and tells his still enthusiastic wet-lab PhD student to order supplies for the experiment.

Step 2: The PhD student, advised by an experienced wet-lab post-doc, spends a year or so doing the experiments and preparing the samples for sequencing.

Step 3: The core facility performs the sequencing and returns several huge data files to the PhD student.

Step 4: The student struggles for several months to perform alignment and initial analysis. As there is no bioinformatician in his group, he spends that time asking people increasingly far away for help, but finally succeeds in getting a list of SNPs, a table of gene expression values, a list of transcription factor binding regions, or something similar.

Step 5: The student would now like to compare these results between his different samples or tissue types. After some searching he realizes that there were many tutorials online for the initial analysis of Step 4, but no help whatsoever for Step 5.

Step 6: Half a year passes with many plots and figures being made but no progress whatsoever.

Step 7: The PI meets a biostatistician who might be willing to help.

Step 8: The biostatistician explains that an analysis is impossible because the initial experimental design was flawed and quotes R. A. Fisher: "To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of."

Step 9: The PI may have learned to ask for statistical advice before starting an experiment, but the PhD student is screwed.

Sorry for this rant, but I get so many questions from poor PhD students who have spent years working on a project that was doomed from the beginning because their PI failed to realize that HTS is not just single-gene biology on a larger scale. So I thought this might be a good place to warn intrepid newcomers about the dangers ahead.

I'd be interested to hear how the others feel. Am I too pessimistic, or would you agree that a group in which not a single researcher knows how to use a command line or how to write a simple Perl or Python script should not attempt HTS analysis unless they have a bioinformatician to collaborate with?
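To give newcomers a rough sense of what "a simple Perl or Python script" might mean here, below is a minimal sketch in Python that counts the reads in FASTQ text and reports their mean length. The function name `fastq_summary` is made up for this illustration, and the sketch assumes the common four-lines-per-read FASTQ layout with no line wrapping; it is not taken from any particular tool.

```python
from itertools import islice


def fastq_summary(lines):
    """Return (number of reads, mean read length) for FASTQ text.

    `lines` is any iterable of strings, e.g. an open file handle.
    Assumption: each record is exactly four lines (header, sequence,
    '+' separator, qualities), i.e. sequences are not line-wrapped.
    """
    # Ignore blank lines, such as a trailing empty line at the end of a file.
    non_blank = (line for line in lines if line.strip())

    n_reads = 0
    total_bases = 0
    while True:
        record = list(islice(non_blank, 4))  # take the next four lines
        if not record:
            break  # input exhausted
        if len(record) < 4 or not record[0].startswith("@"):
            raise ValueError("malformed FASTQ record")
        n_reads += 1
        total_bases += len(record[1].strip())  # second line is the sequence

    mean_length = total_bases / n_reads if n_reads else 0.0
    return n_reads, mean_length


if __name__ == "__main__":
    # Tiny hand-made example; real input would be an opened FASTQ file.
    example = [
        "@read1", "ACGTACGT", "+", "IIIIIIII",
        "@read2", "ACGT", "+", "IIII",
    ]
    print(fastq_summary(example))
```

For a real file, something like `with open(path) as handle: fastq_summary(handle)` would work the same way, where `path` is whatever file the sequencing facility returned.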
