
  • Donor effect and cuffdiff

    Hello all,

    I was wondering if there's a way to deal with a donor/batch effect if one wants to use the cufflinks/cuffdiff pipeline. I have some human RNA-seq data, and using cuffdiff as is generates basically no differentially expressed genes.

    However, if you use design ~ donor + condition in DESeq2, there are quite a few differentially expressed genes. PCA also suggests there is a strong donor effect.

    So is there a way to address it with cuffdiff?

    Thank you in advance.

  • #2
    There isn't. Cuffdiff doesn't handle anything other than simple designs. Stick with DESeq2.



    • #3
      I see. Thank you.



      • #4
        Another quick question: if you were to use ComBat on an RNA-seq dataset, would you use normalized log2-transformed counts? Like the results of the rlog function or the variance-stabilizing transformation from DESeq2?

        The sva tutorial wasn't all that helpful, in all honesty.



        • #5
          If you have known batches, just include the batch variable in the design for DESeq2.

          We don't recommend testing on transformed counts.
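
A minimal sketch of the known-batch case, using simulated counts rather than real data. The `donor` column and the paired layout (each donor contributes one sample per condition) are assumptions made up for illustration; `makeExampleDESeqDataSet` is only there to keep the example self-contained.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 500 genes x 12 samples, condition A (first 6) vs B (last 6).
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Hypothetical known batch variable: six donors, one sample per condition each.
dds$donor <- factor(rep(paste0("d", 1:6), times = 2))

# Put the batch variable in the design, with the variable of interest last.
design(dds) <- ~ donor + condition
dds <- DESeq(dds)

# By default, results() reports the last design variable (condition B vs A),
# with donor differences accounted for by the model.
res <- results(dds)
summary(res)
```

The test is run on the raw counts; nothing is "removed" from the data beforehand, the donor term simply absorbs donor-to-donor differences in the model.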

          If you have unknown batches, you can use svaseq or other packages. We are writing up a workflow which will be released in a few weeks and includes svaseq and RUVSeq.

          But briefly, add the SVA surrogate variables (columns of 'sv') to the colData, and then add these to the design. E.g., for two surrogate variables:

          Code:
          dds$SV1 <- svseq$sv[,1]
          dds$SV2 <- svseq$sv[,2]
          design(dds) <- ~ SV1 + SV2 + condition
          dds <- DESeq(dds)
          Last edited by Michael Love; 09-30-2014, 10:25 AM. Reason: markup
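
The snippet above assumes an `svseq` object already exists. One way it might be produced is sketched below, assuming simulated data and a design with a single `condition` variable; the low-count filter threshold and `n.sv = 2` are illustrative choices, not recommendations from this thread.

```r
library(DESeq2)
library(sva)

set.seed(1)
# Simulated counts stand in for a real dataset with unknown batches.
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds <- estimateSizeFactors(dds)

# svaseq works on a matrix of normalized counts; drop very low-count genes first.
dat <- counts(dds, normalized = TRUE)
dat <- dat[rowMeans(dat) > 1, ]

# Full model (with the variable of interest) and null model (intercept only).
mod  <- model.matrix(~ condition, colData(dds))
mod0 <- model.matrix(~ 1, colData(dds))

# Estimate two surrogate variables; svseq$sv has one column per surrogate variable.
svseq <- svaseq(dat, mod, mod0, n.sv = 2)
dim(svseq$sv)
```

The columns of `svseq$sv` can then be attached to `dds` and added to the design as in the code above.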



          • #6
            Originally posted by Michael Love
            If you have known batches, just include the batch variable in the design for DESeq2.

            We don't recommend testing on transformed counts.
            That's what I did right away, and it worked great. However, I wanted to have an expression table with the donor bias removed, for PCA, visualization, etc.

            If you have unknown batches, you can use svaseq or other packages. We are writing up a workflow which will be released in a few weeks and includes svaseq and RUVSeq.

            But briefly, add the SVA surrogate variables (columns of 'sv') to the colData, and then add these to the design. E.g., for two surrogate variables:

            Code:
            dds$SV1 <- svseq$sv[,1]
            dds$SV2 <- svseq$sv[,2]
            design(dds) <- ~ SV1 + SV2 + condition
            dds <- DESeq(dds)
            Great, it would be ideal to incorporate it all into the DESeq2 pipeline, because lots of things are already very conveniently done in DESeq2. Thank you for pointing out the two packages, it should help.



            • #7
              limma has a function, removeBatchEffect, which easily removes batch effects from a matrix.

              (You'd want the input to be on the scale of log2 of counts, and the rlog or VST output is on the log2 scale.)
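
A minimal sketch of that step with limma's `removeBatchEffect`. The matrix here is simulated on a log2-like scale purely for illustration; with real data it would be something like `assay()` of the rlog or VST output. The `donor` and `condition` vectors are hypothetical.

```r
library(limma)

set.seed(1)
# Hypothetical balanced layout: 3 donors, each with 2 control and 2 treated samples.
donor     <- factor(rep(paste0("d", 1:3), each = 4))
condition <- factor(rep(c("ctrl", "trt"), times = 6))

# Simulated log2-scale matrix (100 genes x 12 samples) with a per-donor shift.
shift <- matrix(rnorm(100 * 3, sd = 2), nrow = 100)[, as.integer(donor)]
mat   <- matrix(rnorm(100 * 12, mean = 8), nrow = 100) + shift

# Passing the condition design protects the condition effect while the
# donor shift is estimated and subtracted.
design    <- model.matrix(~ condition)
corrected <- removeBatchEffect(mat, batch = donor, design = design)
```

The corrected matrix is meant for PCA, heatmaps and similar visualization; for testing, the batch variable goes into the design instead.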



              • #8
                So, on a related topic - not sure if people are still reading this thread.

                Is there a way to evaluate the donor effect quantitatively? It seems fairly intuitive: check whether, in PCA space, dots from the same donor are much closer to each other than to other dots, or something like that.

                The problem is, if there are too many donors it's sometimes hard to tell.



                • #9
                  PCA projects into a lower dimensional space, which is necessary to visualize, but we also think it lets us see more signal and reduces noise. You could calculate the distance between the samples (for the 1st and 2nd PCs, say, or more). You can then compute the average within-batch distance and the average within-condition distance and compare the two.
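
A rough sketch of that calculation in base R, assuming a simulated log2-scale matrix with a strong donor shift. The data, the `mean_within` helper and the choice of two PCs are all illustrative assumptions.

```r
set.seed(42)
# Hypothetical layout: 6 donors, each with one control and one treated sample.
donor     <- factor(rep(paste0("d", 1:6), each = 2))
condition <- factor(rep(c("ctrl", "trt"), times = 6))

# Simulated expression (200 genes x 12 samples) dominated by a per-donor shift.
donor_shift <- matrix(rnorm(200 * 6, sd = 2), nrow = 200)[, as.integer(donor)]
expr <- matrix(rnorm(200 * 12), nrow = 200) + donor_shift

# PCA with samples as rows; keep the first two PCs and compute pairwise distances.
pcs <- prcomp(t(expr))$x[, 1:2]
d   <- as.matrix(dist(pcs))

# Mean Euclidean distance over distinct sample pairs that share a label.
mean_within <- function(d, labels) {
  labels <- as.character(labels)
  same <- outer(labels, labels, "==") & upper.tri(d)
  mean(d[same])
}

within_donor     <- mean_within(d, donor)
within_condition <- mean_within(d, condition)
overall          <- mean(d[upper.tri(d)])

c(within_donor = within_donor,
  within_condition = within_condition,
  overall = overall)
```

A within-donor distance much smaller than the overall average points to a donor effect, and this stays readable even when there are too many donors to tell apart on a plot.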



                  • #10
                    Yep, that's something I was thinking about. Is there a ready solution, or do you suggest I do it myself?

                    Also, if you guys are including PCA/batch analysis and removal functions into DESeq2, it would be a cool thing to have.



                    • #11
                      Usually, it's best to do this kind of stuff yourself. All the functionality is there in R, and reimplementing stuff in Bioconductor is discouraged.

                      Hence we say in the help file for plotPCA: "Note that the source code of plotPCA is very simple and commented. Users should find it easy to customize this function."

                      We don't have a batch removal function, because to account for fold changes due to batch when testing counts, one just adds a variable to the design. And for removing shifts from transformed counts, limma has a function which does this for you.



                      • #12
                        Sure, I understand. I do PCA a little differently (using ggplot2 facilities), but it still shouldn't be hard to calculate. Seems like a useful thing to have: in experiments with some 60 donors it's really not that easy to interpret the PCA plot in terms of bias...



                        • #13
                          By the way, check out the latest version of DESeq2 (1.6). I switched us to ggplot2 for the PCA plot and tried to make it easier to customize. See the vignette and the workflow here:

                          bioconductor.org/help/workflows/rnaseqGene/
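
A short sketch of the kind of customization this allows, assuming simulated data and a hypothetical `donor` column: `plotPCA` with `returnData = TRUE` hands back the PC coordinates as a data frame, which can then be plotted with ggplot2 however you like.

```r
library(DESeq2)
library(ggplot2)

set.seed(1)
# Simulated counts with a made-up donor annotation (6 donors x 2 conditions).
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds$donor <- factor(rep(paste0("d", 1:6), times = 2))

# Variance-stabilized values on a log2-like scale for PCA.
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)

# Get the PCA coordinates instead of the ready-made plot.
pca_data    <- plotPCA(vsd, intgroup = c("condition", "donor"), returnData = TRUE)
percent_var <- round(100 * attr(pca_data, "percentVar"))

# Custom plot: color by donor, shape by condition.
p <- ggplot(pca_data, aes(PC1, PC2, color = donor, shape = condition)) +
  geom_point(size = 3) +
  xlab(paste0("PC1: ", percent_var[1], "% variance")) +
  ylab(paste0("PC2: ", percent_var[2], "% variance"))
print(p)
```

The same `pca_data` frame could also feed a within-donor distance calculation rather than a plot.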



                          • #14
                            Sweet, thank you.
