**UPDATE**
I've migrated (aka copied) this question over to the biostars forum: https://www.biostars.org/p/244455/. Please look there for further discussion.
McCarthy, D.J., Chen, Y., and Smyth, G.K. (2012). Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res 40, 4288–4297.
https://academic.oup.com/nar/article/40/10/4288/2411520/Differential-expression-analysis-of-multifactor
In Figure 2 of this paper, the authors show that estimating dispersion on a per-gene basis is more compatible with their data. Am I allowed to attach it here as an image? If so, I will gladly do so!
I think I broadly understand what is being demonstrated here (please correct me if I'm mistaken): When we estimate dispersions, that is an implicit model of the ratio of the mean to the standard deviation of each gene. Here, the authors are showing, with QQ plots, that the per-gene model describes the observed ratio better than a common dispersion value. Each dot in the plot corresponds to a gene.
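For concreteness, here is a minimal sketch of what any normal QQ plot pairs up: the sorted observed values against theoretical normal quantiles at matching plotting positions. It only illustrates the mechanics and is not the paper's method. The function `qq_points` and the vector `gene_stats` are hypothetical names, the numbers are made up, and which per-gene statistic should go into `gene_stats` is exactly the open part of this question.

```python
from statistics import NormalDist

def qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    observed = sorted(values)
    n = len(observed)
    # Plotting positions (i - 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed

# Hypothetical per-gene statistics, one value per gene (made-up numbers,
# standing in for whatever statistic the figure is actually based on).
gene_stats = [0.3, -1.2, 0.8, 2.1, -0.4]
theoretical, observed = qq_points(gene_stats)
```

Plotting `observed` against `theoretical` gives one dot per gene; dots close to the diagonal mean the statistic is approximately standard normal.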
I'd like to generate this figure for my own data, but I don't understand how to compute the two vectors required. I'm guessing that one might be the log likelihood after fitting the GLM?
Thanks for any light you can shed (code would also be gratefully appreciated, but no obligation).