  • mevers
    Junior Member
    • Aug 2013
    • 2

    DESeq2: Difference between condition+type vs. 3 conditions

    Dear all.

    I am unsure how to use DESeq2 when an experiment can be described either as 3 conditions or as 2 conditions plus 2 types. Assume I have the following design table:
    Code:
              condition    type
    sample1   A            T1
    sample2   A            T1
    sample3   B            T2
    sample4   B            T2
    sample5   A            T2
    sample6   A            T2
    I am unsure about how this would be treated differently from
    Code:
              condition
    sample1   A:T1
    sample2   A:T1
    sample3   B:T2
    sample4   B:T2
    sample5   A:T2
    sample6   A:T2
    The second design table describes a 3-condition scenario.

    Now, obviously one would be interested in a detailed analysis of the counts for
    1. A:T2 vs. B:T2 (since they have the same type but different conditions), and potentially
    2. A:T2 vs. A:T1 (since they have the same condition but different types).


    Question 1: If I reduce the problem to that of a 3-condition no-type design table, is this correctly taken into account?

    I know I would have to re-level the condition factor of the 2nd table so that the level order reflects the fold changes that I want to calculate. So, for example, after re-ordering the levels as
    Code:
    levels=c("A:T2","B:T2","A:T1")
    and performing a DESeq2 analysis
    Code:
    dds <- DESeqDataSetFromMatrix(countData = countData, colData = design, design = ~ condition + type)
    dds <- DESeq(dds)
    Question 2: I could then calculate the fold changes of B:T2 with respect to A:T2, and of A:T1 with respect to A:T2. Is this correct?
    I do get some issues with non-convergent dispersion fits, which I can get around by calling estimateDispersions manually with fitType="local".

    Question 3: But what happens in the case of the 1st condition+type table? I am confused as to the output of DESeq2. What role does the type play in the differential expression analysis and/or the dispersion fitting?

    Any help on this issue would be greatly appreciated.

    Regards,
    Maurits
    Last edited by mevers; 08-22-2013, 01:38 AM. Reason: Typo
  • Simon Anders
    Senior Member
    • Feb 2010
    • 995

    #2
    In your first table, the type is always the same. Is this a typo? If not, I'm not sure I understand your question.

    • mevers
      Junior Member
      • Aug 2013
      • 2

      #3
      Hi Simon.

      Yes, that was a silly mistake, you are absolutely right. I've changed it now in the original post. It should have read
      Code:
      type=c("T1","T1","T2","T2","T2","T2")
      Cheers,
      Maurits

      • Michael Love
        Senior Member
        • Jul 2013
        • 333

        #4
        Question 1:

        You can technically represent it either way, although I would recommend keeping the variables separate, for the following reason: if you combine the variables (as in "A:T1"), then you cannot make a clean B vs A comparison. Instead you have a B:T2 vs A:T1 comparison, which mixes the effect of B vs A with that of T2 vs T1.
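
        The two ways of writing the design can be compared with base R's model.matrix(), without running DESeq2 at all. The sketch below rebuilds the sample table from the first post and is an illustration only: the column name "group" is an assumption made here, and the level order is the one proposed in the question. For this particular table, where B only occurs together with T2, both design matrices have three columns and span the same column space, so the choice of reference level decides which comparisons the coefficients represent.

```r
# Sample table from the thread: B is only observed together with T2.
design <- data.frame(
  condition = factor(c("A", "A", "B", "B", "A", "A")),
  type      = factor(c("T1", "T1", "T2", "T2", "T2", "T2")),
  row.names = paste0("sample", 1:6)
)

# Parameterisation 1: two separate factors (reference levels A and T1).
X_sep <- model.matrix(~ condition + type, data = design)

# Parameterisation 2: one combined 3-level factor with A:T2 as reference.
# "group" is a hypothetical column name introduced for this sketch.
design$group <- factor(paste(design$condition, design$type, sep = ":"),
                       levels = c("A:T2", "B:T2", "A:T1"))
X_grp <- model.matrix(~ group, data = design)

print(colnames(X_sep))
print(colnames(X_grp))

# Each matrix has full column rank ...
rank_sep <- qr(X_sep)$rank
rank_grp <- qr(X_grp)$rank
# ... and binding them together adds no new directions,
# i.e. both describe the same set of fitted group means.
rank_both <- qr(cbind(X_sep, X_grp))$rank
print(c(rank_sep, rank_grp, rank_both))
```

        With the default alphabetical level order, the reference of the combined factor would be A:T1 instead, and the B:T2 coefficient would then be the mixed B:T2 vs A:T1 comparison described above.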

        Question 2:

        Note that fitType is also an argument of DESeq(), so you do not need to call estimateDispersions manually.

        Question 3:

        Both variables are used for finding the fitted means (mu in the GLM formula given in the reference manual and vignette). The fitted means mu are then used to estimate the dispersion, which is a measure of how far the counts deviate from the mu for that sample. Both variables will have fitted coefficients (betas in the GLM formula), and for each variable you can extract a test of the null hypothesis that its coefficient is equal to zero. By default, results() provides the results for the last variable in the design. For more, see the section in the vignette on "Multi-factor designs" and the man page for results().
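
        To make the roles of mu and the betas concrete, here is a rough single-gene sketch in base R. It is not what DESeq2 does internally: it assumes a plain Poisson log-linear model via glm() as a simplified stand-in for the negative binomial GLM (no size factors, no dispersion), and the counts are made-up numbers chosen for illustration.

```r
# Design from the thread, plus made-up counts for ONE gene.
design <- data.frame(
  condition = factor(c("A", "A", "B", "B", "A", "A")),
  type      = factor(c("T1", "T1", "T2", "T2", "T2", "T2"))
)
counts <- c(10, 14, 40, 48, 20, 24)

# Poisson log-linear fit: a simplified stand-in for the NB GLM.
fit <- glm(counts ~ condition + type, data = design, family = poisson())

mu   <- fitted(fit)  # fitted mean per sample; depends on BOTH variables
beta <- coef(fit)    # coefficients on the natural-log scale

# Observed mean of each condition/type group, for comparison with mu.
group_means <- ave(counts, paste(design$condition, design$type))

print(round(mu, 2))
print(round(group_means, 2))

# Converted to log2, the scale on which DESeq2 reports fold changes.
log2_beta <- beta / log(2)
print(round(log2_beta, 3))
```

        In this toy fit, conditionB is the log fold change of B:T2 over A:T2 and typeT2 is that of A:T2 over A:T1. The deviation of each count from its mu is the kind of quantity the dispersion estimate is built on.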
