Hi everyone,
I am currently working on some RNA-seq data in DESeq2 and would like to know whether I can translate a specific biological question into a DESeq2 GLM contrast for which differentially expressed genes (DEGs) can be called. I am not very comfortable with contrasts (and not even sure they’re the way to go, despite having read the helpful related SEQanswers threads), so I’m looking for some help!
We have an experiment with 3 immune cell Populations (say A, B, and C, with A as the reference) and 2 Conditions (a Control condition as reference, and a Stimulated condition), with 3 biological replicates per group (hence 18 samples in total).
So far we’ve analysed each cell population independently, deriving the list of DEGs between Stimulated and Control in each, and then looking at how these lists intersect (e.g. with Venn diagrams).
However, it would make much more sense to analyse all samples in one go, both statistically (more power) and biologically (as everything was done with the same protocol). In this case cell Population would be added as a new factor, and I’d also include the Population:Condition interaction. I have been experimenting with this setup:
Code:
design = ~ Population + Condition + Population:Condition
The list of resultsNames in the model is thus:
Code:
"Intercept" "ConditionControl" "ConditionStimulated" "PopulationA" "PopulationB" "PopulationC" "ConditionControl.PopulationA" "ConditionStimulated.PopulationA" "ConditionControl.PopulationB" "ConditionStimulated.PopulationB" "ConditionControl.PopulationC" "ConditionStimulated.PopulationC"
Getting the list of overall DEGs for Stimulated versus Control is straightforward:
Code:
contrast=c("Condition", "Stimulated", "Control")
Writing a contrast to compare Population C versus B, irrespective of Condition, is easy too:
Code:
contrast=c("Population", "C", "B")
I’m not entirely sure of myself but it seems to me that the contrast to test the effect of the interaction of Population B and Condition Stimulated would be:
Code:
contrast=list(c("PopulationB", "ConditionStimulated", "PopulationB.ConditionStimulated"))
Am I right so far?
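One way to sanity-check contrasts like these is to look at the design matrix directly. The following is a minimal sketch in base R only (not DESeq2 itself), assuming the same 3 x 2 x 3 layout and the same reference levels. It uses standard treatment coding, so its column names differ from the expanded resultsNames listed above. The toy table `samples` and the helpers `group_row` and `effect_in` are hypothetical names introduced only for this illustration.

```r
# Toy sample table mirroring the experiment:
# 3 populations x 2 conditions x 3 replicates = 18 samples
samples <- expand.grid(
  Replicate  = 1:3,
  Condition  = c("Control", "Stimulated"),
  Population = c("A", "B", "C")
)
samples$Condition  <- factor(samples$Condition,  levels = c("Control", "Stimulated"))
samples$Population <- factor(samples$Population, levels = c("A", "B", "C"))

# Standard treatment-coded design matrix (base R, NOT DESeq2's expanded matrix)
mm <- model.matrix(~ Population + Condition + Population:Condition, data = samples)
print(colnames(mm))

# Average design row for one Population x Condition group
group_row <- function(pop, cond) {
  keep <- samples$Population == pop & samples$Condition == cond
  colMeans(mm[keep, , drop = FALSE])
}

# Stimulated minus Control within one population,
# expressed as weights on the model coefficients
effect_in <- function(pop) {
  group_row(pop, "Stimulated") - group_row(pop, "Control")
}

# One column of weights per population
effects <- sapply(c("A", "B", "C"), effect_in)
print(effects)
```

Under this coding, the Stimulated versus Control effect in the reference population A is the Condition coefficient alone, while in B and C it is the Condition coefficient plus the corresponding interaction coefficient. The Population main effects cancel out of a within-population comparison. Whether the same combinations carry over to the expanded coefficient names reported by resultsNames is something to verify against the DESeq2 documentation for the version in use.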
Either way, then comes the tricky part.
Our biological question is: what are the genes that are differentially expressed between Stimulated and Control in populations A and B BUT NOT differentially expressed between Stimulated and Control in population C? (this is because we know A and B drive a specific type of downstream response, while C is more of an “on-looker”, a sort of internal control if you wish)
Is this the same as asking for the GLM coefficients of PopulationB.Stimulated and PopulationC.Stimulated? If yes, how does one write this up as a contrast? I haven’t been able to figure it out, despite reading the various posts about writing contrasts... Can it be as simple as any of the following:
Code:
contrast=list(c("PopulationB.Stimulated", "PopulationC.Stimulated"))
contrast=list(c("PopulationB.Stimulated", "PopulationC.Stimulated", "ConditionStimulated"))
contrast=list(c("PopulationB.Stimulated", "PopulationC.Stimulated", "ConditionStimulated", "PopulationB", "PopulationC"))
...Or would it just be better to test for the overall Stimulated vs Control DEGs, and then the DEGs for each interaction with Population, and do set analysis on all that?
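To make that set-analysis option concrete, here is a minimal sketch using base R set operations. The gene IDs and the vectors `deg_A`, `deg_B` and `deg_C` are made up for illustration; in practice each would hold the DEG calls for Stimulated versus Control in one population.

```r
# Hypothetical DEG lists (made-up gene IDs), one per population
deg_A <- c("gene1", "gene2", "gene3", "gene5")
deg_B <- c("gene2", "gene3", "gene4", "gene5")
deg_C <- c("gene3", "gene6")

# Genes called DE in both A and B ...
deg_A_and_B <- intersect(deg_A, deg_B)

# ... that are not called DE in C
deg_AB_not_C <- setdiff(deg_A_and_B, deg_C)
print(deg_AB_not_C)
```

One caveat with this approach: a gene missing from the C list only failed to pass the significance threshold there, which is not the same as evidence that it is unchanged in C.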
Thanks in advance for your help!
Best,
-- Alex