DEGseq or edgeR


  • DEGseq or edgeR

    Hi,

    I need to analyze differentially expressed genes between samples from two tissues. I was thinking about using the DEGseq or edgeR packages. Have any of you tried them?

    thanks!

    marina

  • #2
    I had RNA-Seq data that I:

    1) mapped with tophat
    2) determined RPKM/transcript with cufflinks
    3) analyzed differentially expressed transcripts with DEGseq

    With these software packages the analysis was pretty straightforward. After TopHat and Cufflinks, I only had to use the function "DEGexp" from the DEGseq package, but you should also be able to skip Cufflinks and feed DEGseq uniquely aligned sequences directly.

    svl


    • #3
      Originally posted by svl View Post
      I had RNA-Seq data that I:

      1) mapped with tophat
      2) determined RPKM/transcript with cufflinks
      3) analyzed differentially expressed transcripts with DEGseq

      With these software packages the analysis was pretty straightforward. After TopHat and Cufflinks, I only had to use the function "DEGexp" from the DEGseq package, but you should also be able to skip Cufflinks and feed DEGseq uniquely aligned sequences directly.

      svl
      Hi svl,

      Since I haven't used Cufflinks much, I have a more detailed question:
      You said you determined RPKM per transcript with Cufflinks, so I am wondering how you make sure that the transcripts Cufflinks determines for different samples match. Do you use some kind of gene annotation to guide Cufflinks?
      Thanks in advance.
      Xi Wang


      • #4
        Originally posted by Xi Wang View Post
        Do you use some kind of gene annotation to guide Cufflinks?
        Yep, exactly, using a GFF file. From the cufflinks manual:
        "-G/--GTF -> Tells Cufflinks to use the supplied reference annotation to estimate isoform expression. It will not assemble novel transcripts, and the program will ignore alignments not structurally compatible with any reference transcript."
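
The annotation-guided run described above can be sketched roughly as follows. This is only an illustration that builds and prints the command lines without running them. The sample names, file names, index prefix, and output directories are hypothetical, and the name of the TopHat alignment file (`accepted_hits.bam` here) depends on the TopHat version. The point is that every sample is quantified against the same annotation with `-G`, so the transcript IDs match across samples.

```python
import shlex

def tophat_command(index_prefix, reads_fastq, out_dir):
    """Command line to map one sample's reads with TopHat."""
    return ["tophat", "-o", out_dir, index_prefix, reads_fastq]

def cufflinks_command(annotation_gtf, alignments, out_dir):
    """Command line to quantify against a fixed annotation (-G), with no novel assembly."""
    return ["cufflinks", "-G", annotation_gtf, "-o", out_dir, alignments]

# Hypothetical inputs: one FASTQ file per tissue, one shared annotation.
samples = {"tissue_a": "tissue_a.fastq", "tissue_b": "tissue_b.fastq"}
annotation = "annotation.gtf"

commands = []
for name, fastq in samples.items():
    tophat_dir = f"{name}_tophat"
    commands.append(tophat_command("genome_index", fastq, tophat_dir))
    # Assumed output file name; older TopHat releases wrote a SAM file instead.
    alignments = f"{tophat_dir}/accepted_hits.bam"
    commands.append(cufflinks_command(annotation, alignments, f"{name}_cufflinks"))

for command in commands:
    print(shlex.join(command))
```

Because both Cufflinks runs share the same annotation file, the per-transcript RPKM tables can be compared row by row afterwards.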


        • #5
          Originally posted by svl View Post
          Yep, exactly, using a GFF file. From the cufflinks manual:
          "-G/--GTF -> Tells Cufflinks to use the supplied reference annotation to estimate isoform expression. It will not assemble novel transcripts, and the program will ignore alignments not structurally compatible with any reference transcript."
          Got it. Thanks again, and thanks for using and recommending the DEGseq package.
          Xi Wang


          • #6
            Hi!

            It's good to know that DEGseq works fine. I would like to know whether it's necessary to have several sample replicates to use this package.

            thanks!

            marina


            • #7
              Originally posted by mmanrique View Post
              Hi!

              It's good to know that DEGseq works fine. I would like to know whether it's necessary to have several sample replicates to use this package.

              thanks!

              marina
              For technical replicates, DEGseq or DEGexp can handle them; for biological replicates, it is recommended to use samWrapper from the package.
              The examples in the manual show how to use DEGseq.
              Xi Wang


              • #8
                Third possibility to look for differential expression in RNA-Seq data: DESeq

                Hi,

                given the title of the thread, I have to use the opportunity to advertise our new package "DESeq", which is now a third option in Bioconductor for determining whether a fold change in RNA-Seq data is significant.

                Like edgeR, DESeq uses the negative binomial distribution. However, we use a novel way of estimating the variance between biological replicates that is, in our view, more precise than edgeR's.

                The package is in Bioc devel (see also here).

                The paper describing the method is submitted; contact me if you would like to get a preprint.

                In this paper, we also argue that the Poisson approximation is not suitable for RNA-Seq analysis and that a dispersion estimate is indispensable. Note that this explicitly contradicts the opinion that the DEGSeq authors state in their package vignette.

                Hence, your choice is as follows: If you go with Xi Wang et al.'s opinion that Poisson is justified, use DEGSeq, while if you agree with the negative binomial people, i.e., the authors of edgeR and DESeq, you go with these.

                I've tried to make DESeq easy to use and fast, and I hope you all will like it. Feedback is very welcome.

                In comparing all three methods, you will typically find that edgeR and DESeq find about the same number of hits, but with a different distribution across the range of abundances. In our paper, we argue why we think that our newer tool gets closer to the truth. DEGSeq (the Poisson-based method) will give you many more hits than edgeR and DESeq.

                I don't want to go into details (this is what we have written the paper for ;-) ) but as Xi Wang has already posted in this thread, it would be impolite to not at least briefly mention why we advise against relying on the Poisson assumption: As Marioni et al. [Genome Res., 2008] have shown, the noise between technical replicates is indeed at the theoretical minimum, i.e., the level predicted by the Poisson distribution. However, the noise between biological replicates is, unsurprisingly, much higher (see the comparison between technical and biological replicates by Nagalakshmi et al. [Science, 2008]) and vastly exceeds the noise predicted by the Poisson assumption.

                Hence, if you test against a Poisson null hypothesis and reject it (i.e., call the gene differentially expressed), this tells you that the difference in transcript abundance between your two samples is larger than what you would expect between technical replicates. The question you are typically interested in is, however, whether it is larger than what one would expect between biological replicates, as only then can it be attributed to a difference in the treatment or characteristics of the biological sample. Hence, it is important to measure the noise between biological replicates, and the fact that the noise between technical replicates can be calculated from the Poisson distribution does not help.

                Best regards
                Simon
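
To make the Poisson versus negative binomial argument above concrete, here is a minimal numeric sketch. It is not how DESeq, edgeR, or DEGseq compute their p-values: it uses a plain normal approximation, assumes equal sequencing depth in both samples, and the counts and the dispersion value of 0.05 are made up for illustration. What it does show is how the same count difference looks under a Poisson variance (equal to the mean) and under a negative binomial variance (mean + dispersion * mean^2).

```python
import math

def count_variance(mean, dispersion=0.0):
    """Variance of a count with the given mean.

    dispersion = 0 gives the Poisson case (variance equals the mean);
    dispersion > 0 gives the negative binomial case.
    """
    return mean + dispersion * mean ** 2

def two_sided_p(count_a, count_b, dispersion=0.0):
    """Normal-approximation p-value for the difference of two counts."""
    pooled_mean = (count_a + count_b) / 2.0
    # Under the null, both counts share the pooled mean and are independent.
    variance_of_difference = 2.0 * count_variance(pooled_mean, dispersion)
    z = (count_a - count_b) / math.sqrt(variance_of_difference)
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical counts for one gene in two samples of equal depth.
p_poisson = two_sided_p(1000, 1200)
# Hypothetical dispersion standing in for biological variability.
p_negbin = two_sided_p(1000, 1200, dispersion=0.05)

print(f"Poisson null:           p = {p_poisson:.2g}")
print(f"Negative binomial null: p = {p_negbin:.2g}")
```

With these made-up numbers, the difference is called significant against the Poisson null but not once biological variability is allowed for, which is the pattern described in the post.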


                • #9
                  I recently used edgeR and it works really well. I also used TopHat, but then calculated the RPKMs with my own scripts, and edgeR did the DE analysis. It is easy to use and creates really nice scatterplots of the data.
                  Cheers,
                  Sergio
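
For readers who want to compute RPKM themselves, as mentioned above, the standard definition (reads per kilobase of transcript per million mapped reads) can be sketched as follows. This is not the poster's script; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

# Hypothetical example: 500 reads on a 2,000 bp transcript, 10 million mapped reads.
example = rpkm(500, 2000, 10_000_000)
print(example)
```

Note that edgeR and DESeq model read counts with a negative binomial distribution, so RPKM values like these are better suited to reporting and plotting than as the direct input to those tests.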


                  • #10
                    With the recently released Cufflinks 0.8.0, you can use the included program "cuffdiff" to test for differential gene and isoform expression, as well as differential splicing and differential promoter use within genes.


                    • #11
                      Thanks a lot for the answers
