Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • bioconductor design question

    We recently commissioned some RNASeq work and I am getting my first taste of using bioconductor. This is probably a bit of a silly question for all you experts, but I just cannot seem to figure it out: Our experiment involved two cell lines that were each treated with four compounds. Each treatment was done in triplicate and there are untreated controls in triplicate as well. We are interested in comparing differential expression between the treatments vs. control in each cell line as well as comparing the same treatment between the two cell lines.

    My coldata table looks something like this:
    HTML Code:
              cell       compound    rep
    c1c1_1    c1         c1          1
    c1c1_2    c1         c1          2
    c1c1_3    c1         c1          3
    c1c2_1    c1         c2          1
    ...
    c2c3_3    c2         c3          3
    c2c4_1    c2         c4          1
    c2c4_2    c2         c4          2
    c2c4_3    c2         c4          3
    c2cc_1    c2         cc          1
    c2cc_2    c2         cc          2
    c2cc_3    c2         cc          3
    Now to my questions:
    1. Does it make sense to work with both cell lines in the same DESeqDataSet object?
    2. If yes, would '~cell + cell:compound' be a reasonable design?
    3. When comparing results between cell lines, I assume I need to compare fold-changes. Are there methods in bioconductor that can compare two DESeqResults objects?


    Any feedback, pointers, or advice greatly appreciated!
    Last edited by sunkid; 11-14-2014, 10:38 AM. Reason: typo

  • #2
    1. Yes
    2. ~cell:compound + compound or ~cell*compound would make more sense. You may also find it more convenient to just make 10 groups of cell:compound, fit ~group, and then use contrasts. The results will be the same, it's just a matter of what you find easier.
    3. You fit a model with both cell-types, so you can directly test for differences between them. Don't test by fold-change, that'd be inaccurate.

    Comment


    • #3
      Thanks for the reply. Much appreciated!

      Could you please elaborate a little on the third bullet? Is this what you mean, for example:
      HTML Code:
      > design(expmat) <- ~cell*compound
      > dds <- DESeq(expmat)
      estimating size factors
      estimating dispersions
      gene-wise dispersion estimates
      mean-dispersion relationship
      final dispersion estimates
      fitting model and testing
      > res <- results(dds, contrast=list("cellc1.compoundc3", "cellc2.compoundc3"))
      Isn't this simply comparing absolute expression levels of genes in the two cells as affected by compound c3 without taking base levels of expression into account?

      Comment


      • #4
        No, all of the possible changes have already been fit, you're just comparing the coefficients. So the baseline cell c2 vs c1 changes have already been accounted for in that comparison.

        Comment


        • #5
          Thanks again! I think I have some reading to do... any advice on a good starting point?

          Comment


          • #6
            Sorry to be a bit dense, but I am thoroughly confused, it seems

            Here is what I did:
            HTML Code:
            > design(expmat) <- ~cell*compound
            > dds <- DESeq(expmat)
            estimating size factors
            estimating dispersions
            gene-wise dispersion estimates
            mean-dispersion relationship
            final dispersion estimates
            fitting model and testing
            > res <- results(dds, contrast=list("cellc1.compoundc1", "cellc2.compoundc1"))
            > res <- res[!is.na(res$padj) & res$padj < 0.001, ]
            > o <- order(res$log2FoldChange)
            > res[o, ]
            log2 fold change (MAP): cellc1.compoundc1 vs cellc2.compoundc1 
            Wald test p-value: cellc1.compoundc1 vs cellc2.compoundc1 
            DataFrame with 254 rows and 6 columns
                    baseMean log2FoldChange      lfcSE       stat       pvalue         padj
                   <numeric>      <numeric>  <numeric>  <numeric>    <numeric>    <numeric>
            VASN    149.8294      -1.181500 0.22067214  -5.354097 8.598481e-08 1.103503e-05
            CCL20   110.8390      -1.173693 0.14111207  -8.317450 8.987607e-17 8.162821e-14
            GIMAP4 1535.1766      -1.043297 0.23876844  -4.369495 1.245343e-05 6.449020e-04
            PPAP2B  730.2799      -1.036536 0.06861464 -15.106631 1.464377e-51 1.728989e-47
            AMIGO2  169.6221      -1.016898 0.11045060  -9.206812 3.359549e-20 5.666600e-17
            ...          ...            ...        ...        ...          ...          ...
            SLIT2  130.18488       1.109605  0.1802282   6.156670 7.429022e-10 2.039871e-07
            LMO2   308.49523       1.206971  0.2256925   5.347857 8.900159e-08 1.129937e-05
            ABCB1   35.54997       1.261060  0.2257014   5.587293 2.306362e-08 3.583055e-06
            KYNU   574.17227       1.309038  0.1550106   8.444829 3.044912e-17 2.995940e-14
            RAB43   78.84993       1.346877  0.2333998   5.770686 7.894942e-09 1.412357e-06
            > plotCounts(dds, gene="VASN", intgroup = c("cell", "compound"))
            and I get


            What am I missing?

            Comment


            • #7
              Sorry to bump this, but I am still confused about the suggested design and what the comparison above actually does.

              FWIW, we are interested both in genes that are differentially expressed in each of the cell types when treated with our compounds (i.e. a comparison between the control and the treated samples) as well as those genes that are differentially expressed in BOTH cell lines treated with the same compound.

              Not to complicate things too much, but the compounds actually have overlapping enzyme inhibition profiles. C1 inhibits E1, C2 inhibits E1 and E2, and C3 inhibits just E2, for example. Hence, we are also interested in comparing the differential expression of genes (compound vs. control) between compounds.

              Comment


              • #8
                answered here: https://support.bioconductor.org/p/62973/

                Comment


                • #9
                  Originally posted by Michael Love View Post
                  Yes, and thank you very much again!

                  Comment


                  • #10
                    If in the original question, we have more than 2 cell lines, and we want to compare the "global" differences between conditions c1 and c2 (still taking into account the fact that we have multiple cell lines), would our design be something like what dpryan mentioned (cell + condition + cell:condition) and in our contrast use list(conditionc1, conditionc2)?

                    Thanks,
                    Praful

                    Comment


                    • #11
                      Yes that would work or equivalently contrast=c("condition","c1","c2") for the global c1 vs c2 comparison.

                      Comment


                      • #12
                        Thank you for the prompt reply.

                        Comment

                        Latest Articles

                        Collapse

                        • seqadmin
                          Non-Coding RNA Research and Technologies
                          by seqadmin




                          Non-coding RNAs (ncRNAs) do not code for proteins but play important roles in numerous cellular processes including gene silencing, developmental pathways, and more. There are numerous types including microRNA (miRNA), long ncRNA (lncRNA), circular RNA (circRNA), and more. In this article, we discuss innovative ncRNA research and explore recent technological advancements that improve the study of ncRNAs.

                          Nobel Prize for MicroRNA Discovery
                          This week,...
                          10-07-2024, 08:07 AM
                        • seqadmin
                          Recent Developments in Metagenomics
                          by seqadmin





                          Metagenomics has improved the way researchers study microorganisms across diverse environments. Historically, studying microorganisms relied on culturing them in the lab, a method that limits the investigation of many species since most are unculturable1. Metagenomics overcomes these issues by allowing the study of microorganisms regardless of their ability to be cultured or the environments they inhabit. Over time, the field has evolved, especially with the advent...
                          09-23-2024, 06:35 AM

                        ad_right_rmr

                        Collapse

                        News

                        Collapse

                        Topics Statistics Last Post
                        Started by seqadmin, 10-11-2024, 06:55 AM
                        0 responses
                        11 views
                        0 likes
                        Last Post seqadmin  
                        Started by seqadmin, 10-02-2024, 04:51 AM
                        0 responses
                        110 views
                        0 likes
                        Last Post seqadmin  
                        Started by seqadmin, 10-01-2024, 07:10 AM
                        0 responses
                        114 views
                        0 likes
                        Last Post seqadmin  
                        Started by seqadmin, 09-30-2024, 08:33 AM
                        1 response
                        121 views
                        0 likes
                        Last Post EmiTom
                        by EmiTom
                         
                        Working...
                        X