It may be a slightly outdated question now that GLM-based testing is preferred to the exact test, but I'm just very curious...

In the 2010 Genome Biology paper (http://genomebiology.com/2010/11/10/r106), formula 14 describes how, in the negative binomial test, the variance for condition A is estimated under the null hypothesis that the per-condition means are equal:

$$\hat{\sigma}^2_A = \sum_{j \in A} \left[ s_j q_0 + s_j^2 \, v_A(q_0) \right],$$

where:

$\sum_{j \in A}$ is the sum across all samples in condition A,

$s_j$ are the scaling factors for each of these samples,

$q_0$ is the pooled mean estimate,

$v_A$ is the raw variance estimate for condition A given the mean.

However, looking at the DESeq code, I'm not sure $v_A$ is actually evaluated at $q_0$. Rather, it seems to me that it is evaluated at $q_A$, the observed mean of counts for condition A.
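To make the two alternatives concrete, here is a minimal sketch of what I mean. It is written in Python rather than R and is not taken from the DESeq source. The function names, the quadratic stand-in for the fitted raw variance function, and all numbers are made up purely for illustration.

```python
def sigma2_hat(size_factors, q, raw_var):
    """Formula 14 as written above: sum over samples j in A of
    s_j * q + s_j**2 * v_A(q), with v_A evaluated at the supplied mean q."""
    return sum(s * q + s ** 2 * raw_var(q) for s in size_factors)


def toy_raw_var(q, alpha=0.1):
    # Hypothetical stand-in for the fitted raw variance function v_A;
    # the real one comes from DESeq's variance fit, not from this formula.
    return alpha * q ** 2


s_A = [0.8, 1.0, 1.2]  # made-up scaling factors for the samples in condition A
q0 = 100.0             # made-up pooled mean estimate (under the null)
qA = 150.0             # made-up observed mean for condition A

# What the paper's formula 14 describes: v_A evaluated at the pooled mean.
var_at_pooled_mean = sigma2_hat(s_A, q0, toy_raw_var)

# What the code appears to do, if I read it correctly: v_A evaluated at
# the condition-A mean instead.
var_at_condition_mean = sigma2_hat(s_A, qA, toy_raw_var)

print(var_at_pooled_mean, var_at_condition_mean)
```

With these toy inputs the two choices give clearly different variances, so which mean is plugged in would matter for the resulting p values whenever the condition means differ.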

Am I getting this wrong, or have there been some changes that I'm not aware of?

Thanks very much!

## Comment