Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Advanced interaction contrast in DESeq2

    Hi everyone,



    I am currently working on some RNA-seq data in DESeq2 and I would like to know if I can transform a specific biological question into a DESeq2 glm contrast for which Differentially Expressed Genes (DEGs) can be called. I am not very comfortable with contrasts (and not even sure they’re the way to go, despite having read the helpful related Seqanswers threads) so I’m looking for some help!



    We have an experiment with 3 immune cell Populations (say A, B, and C, with A as the reference) and 2 Conditions (a Control condition as reference, and a Stimulated condition), with 3 biological replicates per group (hence 18 samples in total).

    So far we’ve analysed each cell population independently, deriving the list of DEGs between Stimulated and Control in each, and then looking at how these lists intersect (e.g. with Venn diagrams).



    However, it would make much more sense to analyse all samples in one go, both statistically (more power) and biologically (as everything was done with the same protocol). In this case cell Population would be added as a new factor, and I’d also include the Population:Condition interaction. I have been experimenting with this setup:
    Code:
    design = ~ Population + Condition + Population:Condition
    The list of resultsNames in the model is thus:
    Code:
    "Intercept"                  
    "ConditionControl"                   "ConditionStimulated"               
    "PopulationA"                          "PopulationB"           "PopulationC"
    "ConditionControl.PopulationA" "ConditionStimulated.PopulationA"
    "ConditionControl.PopulationB" "ConditionStimulated.PopulationB" 
    "ConditionControl.PopulationC" "ConditionStimulated.PopulationC"
    Getting the list of overall DEGs for Stimulated versus Control is straightforward:
    Code:
    contrast=c(“Condition”, “Stimulated”, “Control”)
    Writing a contrast to compare Population C versus B, irrespective of Condition, is easy too:
    Code:
    contrast=c(“Population”, “C”, “B”)
    I’m not entirely sure of myself but it seems to me that the contrast to test the effect of the interaction of Population B and Condition Stimulated would be:
    Code:
    contrast=list(c(“PopulationB”, “ConditionStimulated”, “PopulationB.ConditionStimulated”))
    Am I right so far?



    Either way, then comes the tricky part.

    Our biological question is: what are the genes that are differentially expressed between Stimulated and Control in populations A and B BUT NOT differentially expressed between Stimulated and Control in population C? (this is because we know A and B drive a specific type of downstream response, while C is more of an “on-looker”, a sort of internal control if you wish)

    Is this the same as asking for the glm coefficients of PopulationB.Stimulated and PopulationC.Stimulated? If yes, how does one write this up as a contrast? I haven’t been able to figure it out, despite reading the various posts about writing up the contrasts... Can it be so simple as any of the following:
    Code:
    contrast=list(c(“PopulationB.Stimulated”, “PopulationC.Stimulated”))
    contrast=list(c(“PopulationB.Stimulated”, “PopulationC.Stimulated”, “ConditionStimulated”))
    contrast=list(c(“PopulationB.Stimulated”, “PopulationC.Stimulated”, “ConditionStimulated”, “PopulationB”, “PopulationC”))


    ...Or would it just be better to test for the overall Stimulated vs Control DEGs, and then the DEGs for each interaction with Population, and do set analysis on all that?



    Thanks in advance for your help!

    Best,

    -- Alex

  • #2
    hi Alex,

    In the latest DESeq2 version, 1.8, I have some new instructions for users who might have a difficult time setting up such contrasts.

    Can you look over the first code chunk in Section 3.3 Interactions:

    The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.


    This makes comparisons of condition effects in each group much easier to perform.

    For your other question:

    "what are the genes that are differentially expressed between Stimulated and Control in populations A and B BUT NOT differentially expressed between Stimulated and Control in population C?"

    There is not a way to build such a set with one results() call. But you can build each table individually and then look at the union() of the first two, setdiff() the last table.

    The last table can be constructed by specifying a threshold for LFC, say log2 fold change < 2, and then using lfcThreshold=2 and altHypothesis="lessAbs". You need to come up with a threshold value which makes sense for your system, and you have to turn off betaPrior for this comparison. I'd recommend inspecting how this changes with plotMA(res). And see the DESeq2 paper for more details on the meaning of this threshold test.
    Last edited by Michael Love; 04-28-2015, 11:12 AM.

    Comment


    • #3
      Hi Michael,


      Thank you very much for your quick reply!!!


      Originally posted by Michael Love View Post
      In the latest DESeq2 version, 1.8, I have some new instructions for users who might have a difficult time setting up such contrasts.

      Can you look over the first code chunk in Section 3.3 Interactions:

      The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.


      This makes comparisons of condition effects in each group much easier to perform.
      => This documentation is exactly what I needed! It makes things so much clearer... I'll upgrade our version of R, Bioconductor and DESeq2 just to be sure and I'll try this out. Thank you!



      Originally posted by Michael Love View Post
      For your other question:

      "what are the genes that are differentially expressed between Stimulated and Control in populations A and B BUT NOT differentially expressed between Stimulated and Control in population C?"

      There is not a way to build such a set with one results() call. But you can build each table individually and then look at the union() of the first two, setdiff() the last table.

      The last table can be constructed by specifying a threshold for LFC, say log2 fold change < 2, and then using lfcThreshold=2 and altHypothesis="lessAbs". You need to come up with a threshold value which makes sense for your system, and you have to turn off betaPrior for this comparison. I'd recommend inspecting how this changes with plotMA(res). And see the DESeq2 paper for more details on the meaning of this threshold test.
      => This is essentially what I had in mind, but doing it within DESeq2 rather than downstream is certainly much better from a statistical point of view. I'll try this as soon as possible!


      Thank you again for your help! I may have further questions later, we'll see...

      Best regards,

      -- Alex

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Best Practices for Single-Cell Sequencing Analysis
        by seqadmin



        While isolating and preparing single cells for sequencing was historically the bottleneck, recent technological advancements have shifted the challenge to data analysis. This highlights the rapidly evolving nature of single-cell sequencing. The inherent complexity of single-cell analysis has intensified with the surge in data volume and the incorporation of diverse and more complex datasets. This article explores the challenges in analysis, examines common pitfalls, offers...
        06-06-2024, 07:15 AM
      • seqadmin
        Latest Developments in Precision Medicine
        by seqadmin



        Technological advances have led to drastic improvements in the field of precision medicine, enabling more personalized approaches to treatment. This article explores four leading groups that are overcoming many of the challenges of genomic profiling and precision medicine through their innovative platforms and technologies.

        Somatic Genomics
        “We have such a tremendous amount of genetic diversity that exists within each of us, and not just between us as individuals,”...
        05-24-2024, 01:16 PM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, 06-17-2024, 06:54 AM
      0 responses
      11 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 06-14-2024, 07:24 AM
      0 responses
      22 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 06-13-2024, 08:58 AM
      0 responses
      17 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 06-12-2024, 02:20 PM
      0 responses
      20 views
      0 likes
      Last Post seqadmin  
      Working...
      X