cufflinks 1.2.0 version got me significantly different results than the old version

  • cufflinks 1.2.0 version got me significantly different results than the old version

    I don't know if anyone else has experienced a problem similar to mine. I am analyzing several RNA-seq samples with the tophat-->cuffdiff pipeline to find differentially expressed genes. I started the analysis two weeks ago and got some preliminary results for my treatment vs. control comparison (6 samples in total, 3 vs. 3) using cufflinks 1.1.3. Cuffdiff reported about 173 genes differentially expressed between the two conditions (from the gene_exp.diff file).

    When cufflinks 1.2.0 was released on Nov. 23rd, I reran the analysis and saw a HUGE difference in the cuffdiff results, using EXACTLY the same command and the same human reference annotation file (hg19). This time cuffdiff reported a total of 5359 differentially expressed genes (from the gene_exp.diff file).

    I am totally confused and do not know which version I should trust. Why is there such a big difference in the number of differentially expressed genes?
    Has anyone else experienced a similar problem?

  • #2
    You feel my pain

    We've had a similar issue with DESeq and edgeR here. What I do is look at the 'significant' genes and check that the expression values agree with what the program tells you.

    I suggest you contact the authors of cuffdiff and ask them what they've changed in the algorithm. In the release notes on 1.1.2 they mention several bug fixes, which might account for such a huge difference. Or it's a new bug :-S
    Last edited by chris; 11-30-2011, 08:41 AM.


    • #3
      One big change was the assignment of "FAIL" estimations. Depending on your settings, those would have been eliminated from your fold change list in the old version, but they now show up, as they should, in the new version. If you look at the previous version's output, almost 50% of the genes, or nearly 100% of genes with multiple transcripts, were listed as "FAIL".
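
One way to check how much of the difference comes from test status is to tally the status values in each version's gene_exp.diff. The following is only a sketch: it assumes a tab-separated file with `status` and `significant` columns (column names can differ between cufflinks releases, so check your own header), and the table below is made up for illustration.

```python
import csv
import io
from collections import Counter


def tally_status(handle):
    """Count rows per test status, and rows flagged significant per status.

    Assumes a tab-separated table with 'status' and 'significant' columns;
    these column names are an assumption about the cuffdiff output.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    by_status = Counter()
    significant = Counter()
    for row in reader:
        by_status[row["status"]] += 1
        if row["significant"] == "yes":
            significant[row["status"]] += 1
    return by_status, significant


# Made-up rows standing in for a real gene_exp.diff file.
EXAMPLE = (
    "test_id\tstatus\tp_value\tsignificant\n"
    "geneA\tOK\t0.001\tyes\n"
    "geneB\tFAIL\t1\tno\n"
    "geneC\tOK\t0.4\tno\n"
    "geneD\tFAIL\t1\tno\n"
)

by_status, significant = tally_status(io.StringIO(EXAMPLE))
print(dict(by_status))
print(dict(significant))
```

For a real comparison, replace the `io.StringIO(EXAMPLE)` handle with `open("gene_exp.diff", newline="")` for each version's output and compare the two tallies.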


      • #4
        cufflinks 1.2.1 has been released (11/30/2011). Try checking the number of differentially expressed genes with this version.


        • #5
          @slowsmile - can you comment on the p-values / fold changes, especially in light of @Jon_Keats' post? Do the stats for the original 173 genes (presumably the most obvious DE cases) remain roughly the same?
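
A rough way to answer that question is to load both versions' gene_exp.diff tables and compare the significant sets. This is a hedged sketch, not a cuffdiff feature: the `test_id`, `p_value` and `significant` column names are assumptions about the file header, and the two tables below are invented for the example.

```python
import csv
import io


def load_diff(handle):
    """Map test_id -> row dict for a tab-separated diff table."""
    return {row["test_id"]: row for row in csv.DictReader(handle, delimiter="\t")}


def compare_runs(old, new):
    """Split significant calls into kept, lost and gained between two runs."""
    old_sig = {g for g, r in old.items() if r["significant"] == "yes"}
    new_sig = {g for g, r in new.items() if r["significant"] == "yes"}
    return {
        "kept": sorted(old_sig & new_sig),
        "lost": sorted(old_sig - new_sig),
        "gained": sorted(new_sig - old_sig),
    }


HEADER = "test_id\tp_value\tsignificant\n"
# Invented values standing in for the old and new cuffdiff output.
OLD = HEADER + "geneA\t0.0001\tyes\ngeneB\t0.0004\tyes\ngeneC\t0.2\tno\n"
NEW = HEADER + "geneA\t0.0002\tyes\ngeneB\t0.3\tno\ngeneC\t0.001\tyes\n"

old_run = load_diff(io.StringIO(OLD))
new_run = load_diff(io.StringIO(NEW))
result = compare_runs(old_run, new_run)
print(result)

# Side-by-side p-values for the genes that were significant in the old run.
for gene in result["kept"] + result["lost"]:
    print(gene, old_run[gene]["p_value"], new_run[gene]["p_value"])
```

If most of the original calls land in "kept" with similar p-values, the extra genes in the new run are additions rather than a reshuffle of the old list.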


          • #6
            1.2.1 does something different at Cuffmerge

            Just to add something to this in case it's useful: I'm seeing significant attrition of genes at the Cuffmerge step in 1.2.1, relative to 1.1.0, in my analysis. I'm not sure why this is happening, so for now I run the 1.1.0 version of Cuffmerge before passing the result to Cuffdiff 1.2.1.


            • #7
              @pinin4fjords -

              I am experiencing the same thing. I have 6 cufflinks transcripts.gtf files (each ~110MB) that I am merging with cuffmerge. 1.1.0 gave me a merged file of 140MB and 1.2.1 one of ~70MB. The 1.2.1 merged.gtf is missing a number of known expressed genes (that I see in the original transcripts.gtf files), so something is funky. I have a help request in.


              • #8
                @dweebis -

                That's suspiciously similar to the reduction I saw: my files basically halved in size. I also have a support request in.
                Last edited by pinin4fjords; 12-07-2011, 05:35 AM. Reason: typo


                • #9
                  I'll add this in case it helps anybody figure out what is going on here: I ran Cufflinks 1.2.1 on a set of 4 samples from Arabidopsis. The curious thing happens when I run different versions of Cuffmerge on the transcripts.gtf files, which each have about 280,000 entries. While version 1.1.0 gives a merged.gtf with 240,000 entries, version 1.2.1 comes up with only 27,000 entries when I use exactly the same command line. So the difference really seems to be introduced downstream of Cufflinks.
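
Entry counts like these can be checked with a small script that counts GTF lines and collects the distinct values of one attribute. The sketch below makes several assumptions: standard nine-column GTF with quoted attribute values, a hypothetical `key` parameter (which attribute is comparable between two merged.gtf files depends on your data, since IDs may be reassigned on each run), and toy records with made-up IDs.

```python
import io
import re

# Matches attribute pairs of the form: key "value"
ATTR = re.compile(r'(\S+) "([^"]*)"')


def summarize_gtf(handle, key="transcript_id"):
    """Return (number of entries, set of distinct values of `key`)."""
    entries = 0
    values = set()
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        entries += 1
        attrs = dict(ATTR.findall(fields[8]))
        if key in attrs:
            values.add(attrs[key])
    return entries, values


def exon(chrom, start, end, gene, transcript):
    """Build one toy exon record (made-up data for the example)."""
    attrs = f'gene_id "{gene}"; transcript_id "{transcript}";'
    cols = [chrom, "toy", "exon", str(start), str(end), ".", "+", ".", attrs]
    return "\t".join(cols) + "\n"


OLD_GTF = exon("chr1", 100, 200, "G1", "T1") + exon("chr1", 300, 400, "G1", "T1") \
    + exon("chr1", 900, 1200, "G2", "T2")
NEW_GTF = exon("chr1", 100, 200, "G1", "T1") + exon("chr1", 300, 400, "G1", "T1")

old_entries, old_ids = summarize_gtf(io.StringIO(OLD_GTF))
new_entries, new_ids = summarize_gtf(io.StringIO(NEW_GTF))
missing = old_ids - new_ids
print(old_entries, new_entries, sorted(missing))
```

On real files, pass `open("merged.gtf")` handles from each Cuffmerge version instead of the `io.StringIO` objects.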


                  • #10
                    Just an update for anyone stumbling across this thread: Cuffmerge in Cufflinks 1.3.0 is no longer eating transcripts. It also gives me a few more significant results (I'm using Tophat 1.4).
