Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • lran2008
    Member
    • Apr 2013
    • 35

    DESeq2 experimental design question

    Hi,

    I am having some problems with my DEseq2 experimental design (RNA-seq raw read counts). Basically, I have two treatment (sf, ls). Within each treatment, there are six cows with 3 time points(14,7 and 28). Time point 14 is the control. I want to find DE genes, 28 vs 14, 7 vs 14.

    I first tried a design: ~ treat + treat:cowf + treat:day

    Code:
              cow treat day cowf
    14LS2       2    ls  14    1
    14LS63     63    ls  14    2
    14LS66     66    ls  14    3
    14LS67     67    ls  14    4
    14LS73     73    ls  14    5
    14LS74     74    ls  14    6
    28LS2       2    ls  28    1
    28LS63     63    ls  28    2
    28LS66     66    ls  28    3
    28LS67     67    ls  28    4
    28LS73     73    ls  28    5
    28LS74     74    ls  28    6
    7LS2        2    ls   7    1
    7LS63      63    ls   7    2
    7LS66      66    ls   7    3
    7LS67      67    ls   7    4
    7LS73      73    ls   7    5
    7LS74      74    ls   7    6
    14SF5355 5355    sf  14    1
    14SF61     61    sf  14    2
    14SF62     62    sf  14    3
    14SF70     70    sf  14    4
    14SF71     71    sf  14    5
    14SF72     72    sf  14    6
    28SF5355 5355    sf  28    1
    28SF61     61    sf  28    2
    28SF62     62    sf  28    3
    28SF70     70    sf  28    4
    28SF71     71    sf  28    5
    28SF72     72    sf  28    6
    7SF5355  5355    sf   7    1
    7SF61      61    sf   7    2
    7SF62      62    sf   7    3
    7SF70      70    sf   7    4
    7SF71      71    sf   7    5
    7SF72      72    sf   7    6

    Here is the script to find DE genes of 28 vs 14 in ls treatment:
    Code:
    countdata <- read.csv("readcounts_matrix.csv", sep=",",stringsAsFactors=FALSE, row.names=1)
    coldata <- read.csv("coldata.csv", sep=",",stringsAsFactors=TRUE, row.names=1)
    coldata$cow <- factor(coldata$cow)
    coldata$day <- factor(coldata$day)
    coldata$cowf <- factor(rep(c(1,2,3,4,5,6),times=6)) 
    coldata$day <-relevel(coldata$day, "14")
    dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ treat + treat:cowf + treat:day)
    dds <- estimateSizeFactors(dds)
    nc <- counts(dds, normalized=TRUE)
    filter <- rowSums(nc >= 10) >= 12
    dds <- dds[filter,]
    dds <- DESeq(dds,modelMatrixType="standard")
    ls_d28vsd14 <- results(dds, name = "treatls.day28")
    I am not sure if the above design is correct. Since the two treatment are independent, i used data from each treatment separately for DE analysis, which is easier as I can remove the treatment factor.

    Below is for ls treatment with design ~cow + day
    Code:
           cow treat day cowf
    14LS2    2    ls  14    1
    14LS63  63    ls  14    2
    14LS66  66    ls  14    3
    14LS67  67    ls  14    4
    14LS73  73    ls  14    5
    14LS74  74    ls  14    6
    28LS2    2    ls  28    1
    28LS63  63    ls  28    2
    28LS66  66    ls  28    3
    28LS67  67    ls  28    4
    28LS73  73    ls  28    5
    28LS74  74    ls  28    6
    7LS2     2    ls   7    1
    7LS63   63    ls   7    2
    7LS66   66    ls   7    3
    7LS67   67    ls   7    4
    7LS73   73    ls   7    5
    7LS74   74    ls   7    6
    Code:
    ls_countdata = countdata[,1:18]
    ls_coldata = coldata[1:18,]
    ls_dds <- DESeqDataSetFromMatrix(countData = ls_countdata, colData = ls_coldata, design = ~ cow + day)
    ls_dds <- estimateSizeFactors(ls_dds)
    nc <- counts(ls_dds, normalized=TRUE)
    filter <- rowSums(nc >= 10) >= 6
    ls_dds <- ls_dds[filter,]
    ls_dds$day <- relevel(ls_dds$day, "14")
    ls_dds <- DESeq(ls_dds)
    ls_res_28vs14 <- results(ls_dds,contrast=c("day", "28","14"))
    However, results from the two methods above produced different results. Could anybody explain me which one I should use? Thanks
  • Michael Love
    Senior Member
    • Jul 2013
    • 333

    #2
    If you are not familiar with interaction formula in R, I'd recommend splitting into two datasets by treatment and using the '~cow + day' design. This is straightforward to interpret: differences across day controlling for differences across cow. You can use results() to pull out 28 vs 14 and 7 vs 14.

    Comment

    • lran2008
      Member
      • Apr 2013
      • 35

      #3
      Thanks, Michael! ~cow+day is indeed much easier for me to understand.

      Comment

      Latest Articles

      Collapse

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by SEQadmin2, Today, 10:09 AM
      0 responses
      9 views
      0 reactions
      Last Post SEQadmin2  
      Started by SEQadmin2, Yesterday, 08:59 AM
      0 responses
      16 views
      0 reactions
      Last Post SEQadmin2  
      Started by SEQadmin2, 06-02-2026, 12:03 PM
      0 responses
      24 views
      0 reactions
      Last Post SEQadmin2  
      Started by SEQadmin2, 06-02-2026, 11:40 AM
      0 responses
      21 views
      0 reactions
      Last Post SEQadmin2  
      Working...