  • DESEQ2 multiple factors + interaction analysis

    Hello
    First, I want to emphasize that reading this forum has already been extremely helpful
    in overcoming some of the issues I initially had grasping the concept of designs in DESeq2.
    Nevertheless, I am stuck now and could use some help:

    I have the following experimental setup (each 3 replicates):

    condition: TREATMENT vs CONTROL
    tissue: A vs B
    genotype: MUTANT vs WT

    Now, I first wanted to find all the differentially expressed genes for each of the conditions (as requested by a collaborator), e.g.

    A vs B in the wild-type and treated
    A vs B in the wild-type and treated

    and so on (it is also a good way to compare with already published data).

    For this I created groups to simplify the analysis e.g.

    TREATMENT WT A, TREATMENT WT B, CONTROL WT A, ...

    and ran the analysis with the simple design: ~group

    which gave me the following result names:

    Code:
     resultsNames(dds)
        [1] "Intercept"        "groupTREATMENT.WT.A" "groupTREATMENT.WT.B"....
    and so on.

    These I could then compare in the following fashion:
    Code:
    cond1 <- results(dds, contrast=list("groupTREATMENT.WT.A","groupTREATMENT.WT.B"))
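
For readers following along, here is a minimal base R sketch of how such a combined group factor might be built. The sample table `coldata` is made up for illustration (it is not the poster's data), and the 2 x 2 x 2 layout with 3 replicates is assumed from the description above.

```r
# Toy sample table: 2 tissues x 2 genotypes x 2 conditions x 3 replicates (assumed layout)
coldata <- expand.grid(
  replicate = 1:3,
  tissue    = c("A", "B"),
  genotype  = c("WT", "MUTANT"),
  condition = c("CONTROL", "TREATMENT"),
  stringsAsFactors = FALSE
)

# Paste the three variables into one factor; levels look like "TREATMENT.WT.A"
coldata$group <- factor(
  paste(coldata$condition, coldata$genotype, coldata$tissue, sep = ".")
)

nlevels(coldata$group)  # number of distinct groups
table(coldata$group)    # replicates per group
```

With a design of ~group, DESeq2's results() also accepts the character form contrast=c("group", "TREATMENT.WT.A", "TREATMENT.WT.B"), which names the factor, then the numerator level, then the denominator level.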
    I am happy up to here, but now comes the part that I am not sure how best to analyze.
    I want to figure out the following interactions:
      [1] tissue A vs B + WT vs MUTANT in TREATMENT
      [2] tissue A vs B + WT vs MUTANT in CONTROL
      [3] TREATMENT vs CONTROL + WT vs MUTANT in tissue A
      [4] TREATMENT vs CONTROL + WT vs MUTANT in tissue B


    I thought that I could first re-level the genotype to make the mutant the reference and get only the genes upregulated in the WT.

    Code:
    dds$genotype <- relevel(dds$genotype, "MUTANT")
    For the design I thought the proper manner would be:

    For the 1st example

    Code:
     ~tissue + genotype + condition + genotype:tissue
    The thought was that I get the interaction between the genotype and the condition and can control for the tissue. But I am somehow on the wrong track:
    Code:
    resultsNames(dds)
    [1] "Intercept"                  "tissue_A_vs_B"                "genotype_WT_vs_MUTANT"        "condition_TREATMENT_vs_CONTROL"     "tissueA.genotypeWT"
    How would I now use contrasts to extract the list of differentially expressed genes for the scenarios described above (1 and 2)?

  • #2
    hi,

    If I understand your question, it sounds like you want to use a model with all interactions.

    You will have to turn off the LFC shrinkage (betaPrior=FALSE in the DESeq() call), as in a situation with two levels of interaction terms (first order interactions between two variables and second order interactions between three variables), shrinkage of effects becomes complicated, and we did not implement routines for this.

    By "tissue A vs B + WT vs MUTANT in TREATMENT", do you mean a test for a difference in the interaction effect of tissue and genotype between the treatment group and the control group?

    You should then use a design of ~ tissue*genotype*condition

    And this difference is tested with

    Code:
    results(dds, name="tissueB.genotypeMUTANT.conditionTREATMENT")
    (This requires that you relevel so that A, WT, and CONTROL are base levels of the respective factors.)
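
A small base R sketch of that releveling, using toy factors in place of the colData(dds) columns (the values are assumed for illustration). Factor levels default to alphabetical order, so only genotype actually needs changing here, but setting all three explicitly does no harm.

```r
# Toy factors standing in for the colData(dds) columns (assumed values)
tissue    <- factor(c("A", "B", "A", "B"))
genotype  <- factor(c("WT", "WT", "MUTANT", "MUTANT"))
condition <- factor(c("CONTROL", "TREATMENT", "CONTROL", "TREATMENT"))

levels(genotype)  # alphabetical by default, so MUTANT would be the base level

# Make A, WT and CONTROL the base (first) levels
tissue    <- relevel(tissue,    ref = "A")
genotype  <- relevel(genotype,  ref = "WT")
condition <- relevel(condition, ref = "CONTROL")

levels(genotype)  # WT is now first
```

On a DESeqDataSet the same call would be applied to the column, e.g. dds$genotype <- relevel(dds$genotype, "WT"), before running DESeq().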

    If you want to test the interaction of tissue and genotype specific for the treatment group, that would be the interaction effect of tissue and genotype for the control group and the difference in the interaction effect for the treatment group added together:

    Code:
    results(dds, contrast=list(c("tissueB.genotypeMUTANT","tissueB.genotypeMUTANT.conditionTREATMENT")))

    For "tissue A vs B + WT vs MUTANT in CONTROL", if you mean the interaction effect of tissue and genotype specific to the control group, this would be the effect

    Code:
    results(dds, name="tissueB.genotypeMUTANT")
    For you to visualize, it might help to examine the model matrix:

    Code:
    model.matrix(~ tissue*genotype*condition, colData(dds))
    which should be the same as the following, if you use betaPrior=FALSE:

    Code:
    attr(dds, "modelMatrix")
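
To see the columns without a fitted dds, the model matrix can be built in base R from a toy sample table. This is only a sketch: `coldata` below is made up, with one row per tissue/genotype/condition combination. Note that base R joins interaction terms with ":" whereas the resultsNames() output quoted above uses ".".

```r
# One row per combination of the three factors (toy data, no replicates)
coldata <- expand.grid(
  tissue    = c("A", "B"),
  genotype  = c("WT", "MUTANT"),
  condition = c("CONTROL", "TREATMENT"),
  stringsAsFactors = FALSE
)

# Set base levels explicitly: A, WT, CONTROL
coldata$tissue    <- factor(coldata$tissue,    levels = c("A", "B"))
coldata$genotype  <- factor(coldata$genotype,  levels = c("WT", "MUTANT"))
coldata$condition <- factor(coldata$condition, levels = c("CONTROL", "TREATMENT"))

X <- model.matrix(~ tissue * genotype * condition, data = coldata)

colnames(X)  # intercept, 3 main effects, 3 two-way and 1 three-way interaction
X            # each row shows which coefficients contribute to that group
```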
    Last edited by Michael Love; 01-20-2015, 08:58 AM. Reason: clarifying



    • #3
      Thanks, that was helpful!
      Indeed, I sometimes find the contrast list somewhat confusing (syntax-wise).
      Why would it be

      test for an interaction of tissue and genotype specific for the treatment group
      --> tissueB.genotypeMUTANT.conditionTREATMENT

      whereas

      interaction effect of tissue and genotype specific for the control group
      --> tissueB.genotypeMUTANT



      • #4
        hi,

        I've tried to clarify the above text, adding in between a third results table, which might have been the one you are interested in. This is just the nature of interactions. Interactions are additional effects for the groups which are not the reference level (or "base level"). So the tissue:genotype interaction for the control group is just the first order interaction, while the tissue:genotype interaction for the treatment group is the first order interaction plus an additional effect.
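
A worked numeric check of this may help. The sketch below uses made-up log2 group means `mu` (purely illustrative, not from any data set), solves the saturated model for its coefficients in base R, and compares them with the "difference of differences" computed directly from the group means. Coefficient names use base R's ":" rather than the "." seen in resultsNames().

```r
# One row per group; base levels A, WT, CONTROL
cells <- expand.grid(
  tissue    = c("A", "B"),
  genotype  = c("WT", "MUTANT"),
  condition = c("CONTROL", "TREATMENT"),
  stringsAsFactors = FALSE
)
cells$tissue    <- factor(cells$tissue,    levels = c("A", "B"))
cells$genotype  <- factor(cells$genotype,  levels = c("WT", "MUTANT"))
cells$condition <- factor(cells$condition, levels = c("CONTROL", "TREATMENT"))

X <- model.matrix(~ tissue * genotype * condition, data = cells)

# Made-up log2 group means, in the row order of 'cells'
mu <- c(5.0, 5.5, 6.0, 7.5, 5.2, 6.1, 6.4, 9.0)

# Saturated model: 8 groups, 8 coefficients, so X is square and invertible
beta <- setNames(as.vector(solve(X, mu)), colnames(X))

# Look up the mean of one group
m <- function(t, g, cnd) {
  mu[cells$tissue == t & cells$genotype == g & cells$condition == cnd]
}

# tissue:genotype interaction = (B - A in MUTANT) - (B - A in WT), per condition
int_control <- (m("B", "MUTANT", "CONTROL") - m("A", "MUTANT", "CONTROL")) -
               (m("B", "WT", "CONTROL")     - m("A", "WT", "CONTROL"))
int_treat   <- (m("B", "MUTANT", "TREATMENT") - m("A", "MUTANT", "TREATMENT")) -
               (m("B", "WT", "TREATMENT")     - m("A", "WT", "TREATMENT"))

# Control group: the first order interaction alone
all.equal(int_control, beta[["tissueB:genotypeMUTANT"]])
# Treatment group: first order interaction plus the three-way term
all.equal(int_treat,
          beta[["tissueB:genotypeMUTANT"]] +
          beta[["tissueB:genotypeMUTANT:conditionTREATMENT"]])
```

Both comparisons should come out TRUE, which is the arithmetic behind using name="tissueB.genotypeMUTANT" for the control group and the two-name list contrast for the treatment group.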



        • #5
          I'm also struggling with the syntax of the design formula. I want to identify genes differentially expressed between two ancestries of two health statuses at each of 4 time points. I'm not sure if I should do two separate analyses or combine everything into one design:

          Ancestry: A or B
          Status: Control or Case
          Time_Point: 1,2,3,4

          I tried the following design to be able to compare each thing individually and also the interaction between any of the things:
          dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData3, design = ~ ANCESTRY + Status + Time_Point + ANCESTRY:Time_Point + Status:Time_Point)

          dds <- DESeq(dds,parallel=TRUE)

          Which results in the following:
          resultsNames(dds)
          [1] "Intercept"
          [2] "ANCESTRYA"
          [3] "ANCESTRYB"
          [4] "StatusControl"
          [5] "StatusCase"
          [6] "Time_Point1"
          [7] "Time_Point2"
          [8] "Time_Point3"
          [9] "Time_Point4"
          [10] "ANCESTRYA.Time_Point1"
          [11] "ANCESTRYB.Time_Point1"
          [12] "ANCESTRYA.Time_Point2"
          [13] "ANCESTRYB.Time_Point2"
          [14] "ANCESTRYA.Time_Point3"
          [15] "ANCESTRYB.Time_Point3"
          [16] "ANCESTRYA.Time_Point4"
          [17] "ANCESTRYB.Time_Point4"
          [18] "StatusControl.Time_Point1"
          [19] "StatusCase.Time_Point1"
          [20] "StatusControl.Time_Point2"
          [21] "StatusCase.Time_Point2"
          [22] "StatusControl.Time_Point3"
          [23] "StatusCase.Time_Point3"
          [24] "StatusControl.Time_Point4"
          [25] "StatusCase.Time_Point4"

          This way I am able to identify genes differentially expressed between one time point and another regardless of status or ancestry:
          Time1v_2<-results(dds, contrast=c("Time_Point", "Time_Point1", "Time_Point2"),parallel=TRUE)

          Time2v_3<-results(dds, contrast=c("Time_Point", "Time_Point2", "Time_Point3"),parallel=TRUE)

          I am also able to find genes differentially expressed between Control and Case regardless of time point or ancestry:
          ControlvCase<-results(dds, contrast=c("Status", "Control", "Case"),parallel=TRUE)

          Similarly for ancestry regardless of status or time point:
          AncestryAvB<-results(dds, contrast=c("ANCESTRY", "A", "B"),parallel=TRUE)

          But I get confused when I want to identify genes differentially expressed between Control and Case at each specific time point, or Ancestry A and Ancestry B at each time point. Is the following the correct syntax for such a comparison?


          For genes differentially expressed in cases and controls at a particular time:
          ControlTime2vCaseTime2<-results(dds, contrast=list(c("StatusControl.Time_Point2", "StatusCase.Time_Point2")),parallel=TRUE)

          For ancestry differences at a particular time:
          AncestryATime2vAncestryBTime2<-results(dds, contrast=list(c("ANCESTRYEA.Time_Point3_BCG_24h", "ANCESTRYAJ.Time_Point3_BCG_24h")),parallel=TRUE)

          When I run these commands, I get no error and a results table, but I want to make sure those results are for the comparison I actually want.
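
One way to check what a list contrast means: as far as I understand the documentation for results(), the list form is list(numerator, denominator), and a list of length one is treated as having an empty denominator. The sketch below mimics that in base R. `contrast_vector` is a hypothetical helper written for this illustration and is not part of DESeq2; `coef_names` is a shortened, assumed subset of the resultsNames() output above.

```r
# Hypothetical helper (NOT a DESeq2 function): turn numerator/denominator
# coefficient names into a numeric contrast vector of +1 / -1 / 0
contrast_vector <- function(coef_names, numerator, denominator = character()) {
  stopifnot(all(c(numerator, denominator) %in% coef_names))
  v <- setNames(numeric(length(coef_names)), coef_names)
  v[numerator]   <- 1
  v[denominator] <- -1
  v
}

# Shortened subset of the coefficient names, for illustration only
coef_names <- c("Intercept", "StatusControl", "StatusCase",
                "StatusControl.Time_Point2", "StatusCase.Time_Point2")

# list(c(a, b)): both names land in the numerator, so the two effects are ADDED
summed <- contrast_vector(
  coef_names,
  numerator = c("StatusControl.Time_Point2", "StatusCase.Time_Point2")
)

# list(a, b): first element minus second element, i.e. a comparison
diffed <- contrast_vector(
  coef_names,
  numerator   = "StatusCase.Time_Point2",
  denominator = "StatusControl.Time_Point2"
)

summed
diffed
```

If that reading is right, the single-vector list used above sums the two interaction terms rather than comparing Case with Control. A Case vs Control comparison at time point 2 would presumably put the Case terms in the numerator and the Control terms in the denominator, and would likely need the Status main effects alongside the interaction terms. That last part is an assumption worth checking against the DESeq2 vignette.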

          I have also read about the time series option, but am unsure how it would differ from the above design.

          Many thanks for any feedback!
