Hello all,
I'm sorry for my very naive and basic question, but I am trying to understand a couple of graphs (dispersion plots, M vs A plots, etc.), and I am a little confused about the term "variance".
When you check the formula, variance is the average of the squared differences from the mean. So can I say that genes with high FPKM values tend to have "high variance" and are also more dispersed than lowly expressed genes? (I guess high variance at high FPKM is not a problem when you fit a negative binomial distribution to calculate the significance of differential expression.) But this also sounds odd, because without thinking about the math, I am tempted to say that lowly expressed genes are generally not significant in differential expression analyses because of the "variability" between their FPKM values. I guess there is a misconception on my part here, as I am thinking of variability as a "percentage" of the mean. Moreover, this variability defines the shape of the negative binomial distribution used for statistical testing, i.e. whether it is more squeezed or more spread out, right? :/
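To make the two notions of variability concrete, here is a minimal sketch. It assumes the parameterization of the negative binomial commonly used for read counts (not FPKM), variance = mean + dispersion * mean^2. The function names and the dispersion value of 0.1 are made up for illustration. Under that assumption the absolute variance grows with the mean, while the relative variability (coefficient of variation, the "percentage" view) shrinks towards sqrt(dispersion).

```python
import math

def nb_variance(mean, dispersion):
    # Assumed negative binomial mean-variance relation: var = mu + alpha * mu^2
    return mean + dispersion * mean ** 2

def nb_cv(mean, dispersion):
    # Coefficient of variation: standard deviation relative to the mean
    return math.sqrt(nb_variance(mean, dispersion)) / mean

DISPERSION = 0.1  # hypothetical value, chosen only for illustration

for mean in (5, 50, 5000):
    var = nb_variance(mean, DISPERSION)
    cv = nb_cv(mean, DISPERSION)
    print(f"mean={mean:>5}  variance={var:>12.1f}  CV={cv:.3f}")
```

In this sketch the highly expressed gene has by far the largest variance but the smallest CV, which may be the distinction behind the confusion between "high variance" and "high variability in percentage".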
Sorry for asking about basic statistics. I would appreciate it if someone could explain briefly.
Thanks!