Dear Colleagues,
I'm working on a disease treatment study. I have RNA-seq samples from cell lines derived from 7 patients and 7 normal individuals (each group consists of 3 males and 4 females). Each individual's cells received 3 different treatments, including a vehicle control, so I have 42 RNA libraries in all.
I tried several linear models. In DESeq2 and limma I used a design that compares both between and within groups, which looks like:
Expression ~ Genotype + Genotype:Individual.nested + Genotype:Treatment
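To make this design concrete, here is a minimal base-R sketch of the model matrix it would produce for 42 libraries. The sample sheet is made up for illustration: the column names, the level labels ("Vehicle", "A", "B", "Normal", "Patient"), and the choice of Vehicle and Normal as reference levels are my assumptions, and Individual.nested is assumed to number individuals 1 to 7 within each genotype.

```r
# Hypothetical sample sheet: 2 genotypes x 7 individuals x 3 treatments = 42 rows.
coldata <- expand.grid(
  Treatment = c("Vehicle", "A", "B"),
  Ind       = 1:7,
  Genotype  = c("Normal", "Patient")
)

# Set reference levels explicitly (assumed: Vehicle and Normal).
coldata$Treatment <- factor(coldata$Treatment, levels = c("Vehicle", "A", "B"))
coldata$Genotype  <- factor(coldata$Genotype,  levels = c("Normal", "Patient"))

# Individuals are numbered 1..7 within each genotype, so the factor is nested.
coldata$Individual.nested <- factor(coldata$Ind)

design <- model.matrix(
  ~ Genotype + Genotype:Individual.nested + Genotype:Treatment,
  data = coldata
)

dim(design)        # number of libraries and number of coefficients
qr(design)$rank    # should equal ncol(design) if the design is full rank
colnames(design)   # look for the genotype-specific treatment columns
```

Under these assumptions the Treatment A versus vehicle effect within the Normal group corresponds to the column named GenotypeNormal:TreatmentA. DESeq2 reports coefficient names with the colon replaced by a dot, so it is worth checking resultsNames() before building the contrast.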
Since I have 42 samples, I think I have enough observations to estimate each gene's variance directly, rather than relying on the assumption that genes with similar expression levels have similar dispersions. So I also tried a mixed model with the lme4 package, using batch and individual as random effects. It was fitted as an LMM rather than a GLMM:
Expression ~ Genotype*Gender*Treatment + (1|Batch) + (1|Individual)
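For the fixed-effects part of this formula, one way to check which betas enter a given contrast is to write the contrast as a difference of cell means and let R translate it into coefficient weights. The sketch below uses base R only; the level labels, the reference levels (Normal, Female, Vehicle), and the equal weighting of the two genders are assumptions for illustration, not necessarily what was used in the actual analysis.

```r
# Hypothetical grid of the 12 fixed-effect cells (2 genotypes x 2 genders x 3 treatments).
grid <- expand.grid(
  Genotype  = c("Normal", "Patient"),
  Gender    = c("Female", "Male"),
  Treatment = c("Vehicle", "A", "B")
)
grid$Genotype  <- factor(grid$Genotype,  levels = c("Normal", "Patient"))
grid$Gender    <- factor(grid$Gender,    levels = c("Female", "Male"))
grid$Treatment <- factor(grid$Treatment, levels = c("Vehicle", "A", "B"))

# Fixed-effects design only; the random effects do not change the contrast weights.
X <- model.matrix(~ Genotype * Gender * Treatment, data = grid)

# Cells for Normal + Treatment A and for Normal + Vehicle (both genders).
normal_A       <- grid$Genotype == "Normal" & grid$Treatment == "A"
normal_vehicle <- grid$Genotype == "Normal" & grid$Treatment == "Vehicle"

# Contrast = mean of the Normal/A cells minus mean of the Normal/Vehicle cells,
# averaging the two genders with equal weight (an assumption).
L <- colMeans(X[normal_A, , drop = FALSE]) -
     colMeans(X[normal_vehicle, , drop = FALSE])

L[L != 0]   # the coefficients that enter the contrast, with their weights
```

With treatment coding and these reference levels, the contrast involves the TreatmentA coefficient plus half of the GenderMale:TreatmentA coefficient. Using TreatmentA alone would test the Treatment A effect in Normal females only.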
When I calculated contrasts (e.g. the Treatment A effect in Normal), all of these models gave a raw p-value distribution that was heavily skewed toward small values (more than 8000 significant genes). The limma and DESeq2 results overlapped by >95%, and they overlapped with the lme4 results by >65%. This doesn't make sense to me, and not only because of the p-value distribution. Treatment A is a proven treatment for patients. This contrast, which tests for its off-target effects in normal individuals, gave a shocking result.
I'm pretty sure the contrast was set up correctly, using the correct betas. Any suggestions would be really appreciated. Thanks a lot.