Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • ebioman
    Member
    • Aug 2013
    • 41

    DESEQ2 multiple factors + interaction analysis

    Hello
    First, I want to emphasize that reading his forum was already really extremely helpful
    overcoming some of the issues I had initially grasping the concept of designs in Deseq2.
    Nevertheless I am stuck now and could need some help:

    I have the following experimental setup (each 3 replicates):

    condition: TREATMENT vs CONTROL
    tissue: A vs B
    genotype: MUTANT vs WT

    Now, I first wanted to find all the diferentially expressed genes for each of the condition (was asked by collaborator) , e.g.

    A vs B in the wild-type and treated
    A vs B in the wild-type and treated

    and so on (it is as well a good point to compare with already published data).

    For this I created groups to simplify the analysis e.g.

    TREATMENT WT A, TREATMENT WT B, CONTROL WT A ,...

    and run the analysis with the simple design: ~group

    which gave me the following result names:

    Code:
     resultsNames(dds)
        [1] "Intercept"        "groupTREATMENT.WT.A" "groupTREATMENT.WT.B"....
    and so on.

    Which I could then compare then in the fashion:
    Code:
    cond1 <- results(dds, contrast=list("groupTREATMENT.WT.A","groupTREATMENT.WT.B"))
    I am happy till here, but now comes the part which I am not sure how to analyze best.
    If I want to figure out the interaction
    • [1] tissue A vs B + WT vs MUTANT in TREATMENT
      [2] tissue A vs B + WT vs MUTANT in CONTROL
      [3] TREATMENT vs CONTROL + WT vs MUTANT in tissue A
      [4] TREATMENT vs CONTROL + WT vs MUTANT in tissue B


    I thought that I could first re-level the genotype to make the mutant the reference and get only the genes upregulated in the WT.

    Code:
    dds$genotype <- relevel(dds$genotype, "MUTANT")
    For the design I thought the proper manner would be:

    For the 1st example

    Code:
     ~tissue + genotype + condition + genotype:tissue
    The thought was that I get the interaction between the genotype and the condition and can control for the tissue. But I am somehow on the wrong track:
    Code:
    resultsNames(dds)
    [1] "Intercept"                  "tissue_A_vs_B"                "genotype_WT_vs_MUTANT"        "condition_TREATMENT_vs_CONTROL"     "tissueA.genotypeWT"
    How would I now extract for the previous described scenario (1 and 2) extract the list of diff. exp. genes with contrast?
  • Michael Love
    Senior Member
    • Jul 2013
    • 333

    #2
    hi,

    If I understand your question, it sounds like you want to use a model with all interactions.

    You will have to turn off the LFC shrinkage (betaPrior=FALSE in the DESeq() call), as in a situation with two levels of interaction terms (first order interactions between two variables and second order interactions between three variables), shrinkage of effects becomes complicated, and we did not implement routines for this.

    By, "tissue A vs B + WT vs MUTANT in TREATMENT", do you mean, test for a difference in the interaction effect of tissue and genotype for the treatment group vs the control group?

    You should then use a design of ~ tissue*genotype*condition

    And this difference is tested with

    Code:
    results(dds, name="tissueB.genotypeMUTANT.conditionTREATMENT")
    (This requires that you relevel so that A, WT, and CONTROL are base levels of the respective factors.)

    If you want to test the interaction of tissue and genotype specific for the treatment group, that would be the interaction effect of tissue and genotype for the control group and the difference in the interaction effect for the treatment group added together:

    Code:
    results(dds, contrast=list(c("tissueB.genotypeMUTANT","tissueB.genotypeMUTANT.conditionTREATMENT")))

    For, "tissue A vs B + WT vs MUTANT in CONTROL", if you mean the interaction effect of tissue and genotype specific for the control group, this would be the effect

    Code:
    results(dds, name="tissueB.genotypeMUTANT")
    For you to visualize, it might help to examine the model matrix:

    Code:
    model.matrix(~ tissue*genotype*condition, colData(dds))
    which should be the same as the following, if you use betaPrior=FALSE:

    Code:
    attr(dds, "modelMatrix")
    Last edited by Michael Love; 01-20-2015, 08:58 AM. Reason: clarifying

    Comment

    • ebioman
      Member
      • Aug 2013
      • 41

      #3
      Thanks, that was helpful!
      Indeed I find sometimes the contrast list somehow confusing (syntax wise).
      Why would be

      test for an interaction of tissue and genotype specific for the treatment group
      --> tissueB.genotypeMUTANT.conditionTREATMENT

      whereas

      interaction effect of tissue and genotype specific for the control group
      --> tissueB.genotypeMUTANT

      Comment

      • Michael Love
        Senior Member
        • Jul 2013
        • 333

        #4
        hi,

        I've tried to clarify the above text, adding in between a third results table, which might have been the one you are interested in. This is just the nature of interactions. Interactions are additional effects for the groups which are not the reference level (or "base level"). So the tissue:genotype interaction for the control group is just the first order interaction, while the tissue:genotype interaction for the treatment group is the first order interaction plus an additional effect.

        Comment

        • bpb9
          Member
          • Aug 2012
          • 24

          #5
          I'm also struggling with the syntax of the design formula. I want to identify genes differentially expressed between two ancestries of two health statuses at each of 4 time points. I'm not sure if I should do two separate analyses or combine everything into one design:

          Ancestry: A or B
          Status: could be Control or Case
          Time_Point: 1,2,3,4

          I tried the following design to be able to compare each thing individually and also the interaction between any of the things:
          dds <- DESeqDataSetFromMatrix(countData = countData,colData = colData3,design = ~ANCESTRY+Status+Time_Point + ANCESTRY:Time_Point + Status:Time_Point)

          dds <- DESeq(dds,parallel=TRUE)

          Which results in the following:
          resultsNames(dds)
          [1] "Intercept"
          [2] "ANCESTRYA"
          [3] "ANCESTRYB"
          [4] "StatusControl"
          [5] "StatusCase"
          [6] "Time_Point1"
          [7] "Time_Point2"
          [8] "Time_Point3"
          [9] "Time_Point4"
          [10] "ANCESTRYA.Time_Point1"
          [11] "ANCESTRYB.Time_Point1"
          [12] "ANCESTRYA.Time_Point2"
          [13] "ANCESTRYB.Time_Point3"
          [14] "ANCESTRYA.Time_Point3"
          [15] "ANCESTRYB.Time_Point3"
          [16] "ANCESTRYA.Time_Point4"
          [17] "ANCESTRYB.Time_Point4"
          [18] "StatusControl.Time_Point1"
          [19] "StatusCase.Time_Point1"
          [20] "StatusControl.Time_Point2"
          [21] "StatusCase.Time_Point2"
          [22] "StatusControl.Time_Point3"
          [23] "StatusCase.Time_Point3"
          [24] "StatusControl.Time_Point4"
          [25] "StatusCase.Time_Point4"

          This way I am able to identify genes differentially expressed between one time point and another regardless of status or ancestry:
          Time1v_2<-results(dds, contrast=c("Time_Point", "Time_Point1", "Time_Point2"),parallel=TRUE)

          Time2v_3<-results(dds, contrast=c("Time_Point", "Time_Point2", "Time_Point3"),parallel=TRUE)

          I am also able to find genes differentially expressed between Status A and Status B regardless of time point or ancestry:
          ControlvCase<-results(dds, contrast=c("Status", "Control", "Case"),parallel=TRUE)

          Similarly for ancestry regardless of status or time point:
          AncestryAvB<-results(dds, contrast=c("Ancestry", "A", "B"),parallel=TRUE)

          But I get confused when I want to identify genes differentially expressed between Control and Case at each specific time point, or Ancestry A and Ancestry B at each time point. Is the following the correct syntax for such a comparison?


          For genes differentially expressed in cases and controls at a particular time:
          ControlTime2vCaseTime2<-results(dds, contrast=list(c("StatusControl.Time_Point2", "StatusCase.Time_Point2")),parallel=TRUE)

          For ancestry differences at a particular time:
          AncestryATime2vAncestryBTime2<-results(dds, contrast=list(c("ANCESTRYEA.Time_Point3_BCG_24h", "ANCESTRYAJ.Time_Point3_BCG_24h")),parallel=TRUE)

          When I run these commands, I get no error and a results table, but I want to make sure those results are for the comparison I actually want.

          I have also read about the time series option, but am unsure how it would differ from the above design.

          Many thanks for any feedback!

          Comment

          Latest Articles

          Collapse

          • SEQadmin2
            Nine Things a Sample Prep Scientist Thinks About Before Sequencing
            by SEQadmin2


            I’m not a sequencing expert. I’m a purification scientist who uses NGS to evaluate workflows my group develops. With this perspective, we think about the sample first and the NGS workflow second. The sequencer is an exceptionally honest reporter, but it can only report on what you give it, so whether you get clean, interpretable data from an NGS workflow is largely determined before you begin.


            Here are nine questions we think about, in roughly the order they matter, before...
            06-18-2026, 07:11 AM
          • SEQadmin2
            From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
            by SEQadmin2


            Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


            The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
            ...
            06-02-2026, 10:05 AM

          ad_right_rmr

          Collapse

          News

          Collapse

          Topics Statistics Last Post
          Started by SEQadmin2, 06-17-2026, 06:09 AM
          0 responses
          32 views
          0 reactions
          Last Post SEQadmin2  
          Started by SEQadmin2, 06-09-2026, 11:58 AM
          0 responses
          97 views
          0 reactions
          Last Post SEQadmin2  
          Started by SEQadmin2, 06-05-2026, 10:09 AM
          0 responses
          117 views
          0 reactions
          Last Post SEQadmin2  
          Started by SEQadmin2, 06-04-2026, 08:59 AM
          0 responses
          109 views
          0 reactions
          Last Post SEQadmin2  
          Working...