**Simon Anders** · 07-06-2011, 01:03 AM

A few weeks ago, we completely rewrote the DESeq vignette (manual). One of our changes was to remove everything about the ECDF plot of the variance residuals, as people kept misunderstanding its purpose (which was maybe never that clear anyway). It is not meant for checking the quality of samples.

The point of the variance residual ECDF plots was to check whether the assumption holds well that genes of similar expression strength have similar variance, because the old DESeq version did not deal well with "variance outliers", i.e., genes with variance much stronger than similar genes. See the new vignette to learn how we now simply take the maximum of fitted value and per-gene estimate to avoid making an error here.

To judge the reproducibility of a protocol, i.e., the similarity of replicate samples, I now recommend the following two possibilities:

(i) use the new 'estimateDispersions' function that now, by default, no longer does a local fit but a parametric fit, fitting a curve alpha = alpha_0 + alpha_1/mu on the dispersion alpha, or equivalently, a curve v = ( 1 + alpha_1 ) * mu + alpha_0 * mu^2 on the variance v. The value alpha_0 is a good measure of the overall (intensity-independent) variation between replicates, the value alpha_1 is a measure of the additional variance for weak genes. See vignette for details.

(ii) use the variance-stabilizing transformation to make a sample-clustering heatmap, as described in the vignette, to see whether your replicates are more similar than samples from different treatment groups.
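
The equivalence of the two curves in (i) follows from the negative-binomial variance v = mu + alpha * mu^2: substituting alpha = alpha_0 + alpha_1/mu gives v = mu + alpha_0 * mu^2 + alpha_1 * mu = (1 + alpha_1) * mu + alpha_0 * mu^2. The following is a minimal sketch in plain Python (not DESeq code) that checks this numerically. The function names and the coefficient values are made up for illustration and do not come from any real fit.

```python
import math


def dispersion_curve(mu, alpha_0, alpha_1):
    """Parametric dispersion curve: alpha(mu) = alpha_0 + alpha_1 / mu."""
    return alpha_0 + alpha_1 / mu


def variance_from_dispersion(mu, alpha):
    """Negative-binomial variance for mean mu and dispersion alpha."""
    return mu + alpha * mu ** 2


def variance_curve(mu, alpha_0, alpha_1):
    """The same curve written directly on the variance scale."""
    return (1 + alpha_1) * mu + alpha_0 * mu ** 2


# Hypothetical coefficients, chosen only to illustrate the shape of the curve.
ALPHA_0, ALPHA_1 = 0.05, 2.0

for mu in (1.0, 10.0, 100.0, 1000.0):
    alpha = dispersion_curve(mu, ALPHA_0, ALPHA_1)
    v_a = variance_from_dispersion(mu, alpha)
    v_b = variance_curve(mu, ALPHA_0, ALPHA_1)
    # Both parametrizations must give the same variance.
    assert math.isclose(v_a, v_b)
    print(f"mu={mu:7.1f}  dispersion={alpha:.4f}  variance={v_a:.1f}")
```

For large mu the dispersion approaches alpha_0, which is why alpha_0 can be read as the intensity-independent replicate variation, while alpha_1 only matters for weakly expressed genes.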

Note that the new DESeq is available in the devel branch of Bioconductor, not yet in the release branch.

**labunit** · 07-06-2011, 04:57 AM

Hello Simon,
the "Package Downloads" links on the Bioconductor homepage (http://www.bioconductor.org/packages...tml/DESeq.html) are wrong. They still link to version 1.5.18 but should link to 1.5.19. I don't know whether you have any control over that.

Best,
Mark Onyango

**KellerMac** · 07-06-2011, 09:28 AM

Do I need to delete the older version of DESeq? If so where do you think it would be?

**labunit** · 07-07-2011, 12:41 AM

Hello Simon,
could you please elaborate on why you switched from the local fit to a parametric fit as a default setting? I always found your idea for a more data-driven fit very sound.

@KellerMac:
It depends on what operating system you are using. If you use Windows you can safely install the development version parallel to the release version as it will also create a new library folder. So the two do not interfere.
If you are using Linux (e.g. Ubuntu) you simply download the development sources of R into a folder of your choosing and compile it there. It won't be installed system-wide and can be started from that folder. All packages downloaded will be kept in that folder as well.
So all in all there is no need to delete the current version of DESeq from your PC.

**chrisbala** · 07-12-2011, 09:19 AM

Error: could not find function "estimateDispersion"

I'm getting:

Error: could not find function "estimateDispersion"

What have I done wrong?

I'm running R on OS X. I've had no trouble using DESeq before, just with this new function.

As far as I can tell, my DESeq is up to date.

**chrisbala** · 07-12-2011, 10:45 AM

Oops, I think I am just struggling with how to update DESeq. I am still at DESeq 1.4 and the "update" window is not doing anything...

**chrisbala** · 07-12-2011, 11:44 AM

Ok, last one, it seems something is wrong with the files linked in bioconductor:

http://bioconductor.org/packages/devel/bioc/html/DESeq.html

Am I wrong?

**labunit** · 07-12-2011, 12:35 PM

You also need to use the development version of R (2.14) to be able to install the latest DESeq.

**chrisbala** · 07-12-2011, 01:40 PM

devel version

Found the relevant thread about needing to install the development version of R as well... done. Things are working for now!

**labunit** · 12-10-2011, 08:19 AM

I am sorry to awaken this thread but I seem to have a problem with the latest release version of DESeq (1.6.1):

Whenever I try to execute the estimateDispersions function I receive the following error:

Parametric dispersion fit failed. Try a local fit and/or a pooled estimation. (See '?estimateDispersions')

Now this can only happen if the coefficients (or at least some of them) become negative during the fitting process. Using the local fit more or less cures this, but I still see some negative dispersion coefficients. My question therefore is: how can the coefficients become negative during fitting, and how do I properly handle or interpret them?

**Simon Anders** · 12-10-2011, 01:07 PM

The problem with the fit has little to do with the negative values, because DESeq "lifts" all negative dispersion values to something slightly above zero. Rather, our new parametric fit routine still has some weaknesses that we are not yet fully sure how to straighten out. This is why the package recommends reverting to the old method if the new one fails. In practice, the difference between the two methods turned out to be not that large, anyway.

To nevertheless explain the negative values: A random variable that is distributed according to a negative binomial with mean µ and dispersion a has variance v = µ + a µ². DESeq estimates a from the data with a method-of-moments estimator, i.e., it estimates µ and v and then calculates a = (v - µ) / µ². (I'm skipping here over a few subtleties, explained in the supplement to our paper.) Especially for low µ, it may happen that the estimate for v is smaller than that for µ, and then the estimate for the dispersion a becomes negative. We know that the true a should be positive, and hence we need to replace all negative values with small positive ones before the test. However, I prefer to do this only after the fit, as it introduces a positive bias.
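
To make the arithmetic concrete, here is a minimal sketch in plain Python of the bare method-of-moments formula a = (v - µ) / µ² applied to a handful of replicate counts. It is not DESeq's actual estimator, which works on normalized counts and includes the subtleties mentioned above. The function names, the example counts, and the floor value used for "lifting" are all assumptions made for illustration.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Plain method-of-moments dispersion: a = (v - mu) / mu^2.

    Uses the sample mean and the unbiased sample variance of the
    replicate counts; no normalization or bias correction is applied.
    """
    mu = mean(counts)
    v = variance(counts)
    return (v - mu) / mu ** 2


def lift(a, floor=1e-8):
    """Replace a negative estimate by a small positive value.

    The floor of 1e-8 is a hypothetical choice, not DESeq's value.
    """
    return max(a, floor)


# Weakly expressed gene: the sample variance happens to fall below the mean,
# so the raw dispersion estimate is negative.
weak_gene = [0.0, 1.0, 1.0, 2.0]
# Overdispersed gene: the variance clearly exceeds the mean.
noisy_gene = [0.0, 0.0, 5.0, 15.0]

for name, counts in (("weak", weak_gene), ("noisy", noisy_gene)):
    a = moment_dispersion(counts)
    print(f"{name}: raw a = {a:.3f}, lifted a = {lift(a):.3g}")
```

The weak gene has mean 1 and sample variance 2/3, giving a raw estimate of -1/3. Lifting it before the fit would push such values upwards, which is the positive bias described above.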

**kasutubh** · 02-13-2013, 03:05 PM

Hi..
I'm running DESeq (2.11) on R (2.25.2) on Windows. I'm getting the same error as chrisbala:
Error: could not find function "estimateDispersion"
Do I need to update anything, or what am I doing wrong here?
Thanks!

**Simon Anders** · 02-18-2013, 02:07 AM

Please install current versions of R and Bioconductor and try again.


Variance Estimation
