  • jparsons
    Member
    • Feb 2012
    • 62

    Checking Cuffdiff

    I am using an interesting dataset to "test" differential isoform expression programs.

    Unfortunately, I am not an expert in every (any?) program, so I could use some sanity checking.

    I have 3 separate tissues, ABC. I want to use (in this case) cuffdiff to identify isoforms which are uniquely expressed in A/B/C, as I can use other "ground truth" runs to verify these claims.

    I ran the program as follows, alternating A, B, and C:
    Code:
     cuffdiff -p 8 -c 10 <ucsc.gtf> A1,A2,A3 B1,B2,B3,C1,C2,C3 -o outdir
    I'm not using a cufflinks-derived gtf or (exclusively) tophat-mapped reads. I imagine I'm doing it all wrong. I have two main questions:

    1) Can I get away with not using the entire cufflinks pathway here? (If not, why doesn't the program complain?)
    2) Am I properly comparing the 3 tissues? Does A vs B,C return transcripts DE in only A, as I intend it to?
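For readers following along, the "alternating" runs described above can be sketched as three one-vs-rest cuffdiff invocations. This is only a sketch: it reuses the `-p`, `-c`, and `-o` options and the replicate names from the command in the post, while the function name and the per-tissue output directory names are hypothetical.

```python
def one_vs_rest_commands(samples, gtf="ucsc.gtf", threads=8, min_count=10):
    """Build one cuffdiff argument list per tissue: that tissue vs. all others pooled."""
    commands = {}
    for tissue, replicates in samples.items():
        # Pool the replicates of every other tissue into a single second condition.
        rest = [
            rep
            for other, other_reps in samples.items()
            if other != tissue
            for rep in other_reps
        ]
        commands[tissue] = [
            "cuffdiff", "-p", str(threads), "-c", str(min_count),
            "-o", f"outdir_{tissue}_vs_rest",  # hypothetical directory name
            gtf,
            ",".join(replicates),
            ",".join(rest),
        ]
    return commands


# Replicate names are the placeholders used in the post, not real file names.
samples = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2", "B3"],
    "C": ["C1", "C2", "C3"],
}

for tissue, argv in one_vs_rest_commands(samples).items():
    print(" ".join(argv))
```

Each generated command treats the pooled replicates as a single condition, which is exactly the design the second question asks about.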
  • rboettcher
    Member
    • Oct 2010
    • 71

    #2
    Hello jparsons,

    I used cufflinks and cuffdiff with GSNAP alignments and it worked fine, so you do not necessarily need to stick to TopHat, as long as the SAM/BAM files have all required columns.
    However, I used the cufflinks -> cuffmerge -> cuffdiff route to check my genes, since that is what the authors suggest (though it was not very successful for me).

    After following some discussions in this forum, see

    and


    I concluded that cufflinks/cuffdiff have a problem in their correction for variance. For my analysis, the bigger my sample groups were, the fewer genes were found significantly DE until none were left. Therefore I assume that pooling group B and C will result in a similar problem due to high variance between both groups.
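The concern about pooling can be illustrated with a toy calculation. This is a minimal sketch with made-up expression values for a single transcript, and it says nothing about how cuffdiff actually models variance; it only shows that merging two groups with different means inflates the within-group variance.

```python
from statistics import variance

# Hypothetical expression values (e.g. FPKM) for one transcript; not real data.
group_b = [10.0, 11.0, 9.0]
group_c = [30.0, 31.0, 29.0]

# Treating B and C as one condition simply concatenates their replicates.
pooled = group_b + group_c

var_b = variance(group_b)        # sample variance within B
var_c = variance(group_c)        # sample variance within C
var_pooled = variance(pooled)    # sample variance of the merged group

print(f"B: {var_b:.1f}  C: {var_c:.1f}  B+C pooled: {var_pooled:.1f}")
```

Here each tissue is tight on its own, but the pooled group looks very noisy because the difference between B and C is counted as replicate variability.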

    Besides that, your command looks fine, so please keep us posted on your progress.

    • jparsons
      Member
      • Feb 2012
      • 62

      #3
      Rboettcher,

      Thanks for the response. I eventually compared the output from tophat->cufflinks->cuffmerge->cuffdiff to that from only cuffdiff and found that they were (mostly) identical. I am content using cuffdiff without going through the entire pipeline.

      I got results for cuffdiff and finally managed to get RSEM to like me for long enough to spit out quantitations. When compared to the "truth" set (sadly only available on the gene level for now), the RSEM/cuffdiff lists are 'decent' individually, coming close to the expected ratio on average, but having numerous outliers. Taking the overlap set of genes called by both RSEM and cuffdiff makes for a much cleaner picture, with far less deviation from the ratio, and fewer false positives.
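Taking the overlap set is a plain set intersection. A minimal sketch, assuming each tool's significant calls have already been reduced to a list of gene identifiers (the function name and the gene IDs below are hypothetical):

```python
def consensus_calls(cuffdiff_genes, rsem_genes):
    """Split DE calls into genes called by both tools and by only one of them."""
    cuffdiff_set = set(cuffdiff_genes)
    rsem_set = set(rsem_genes)
    return {
        "both": cuffdiff_set & rsem_set,
        "cuffdiff_only": cuffdiff_set - rsem_set,
        "rsem_only": rsem_set - cuffdiff_set,
    }


# Hypothetical gene identifiers, for illustration only.
calls = consensus_calls(
    cuffdiff_genes=["geneA", "geneB", "geneC"],
    rsem_genes=["geneB", "geneC", "geneD"],
)
print(sorted(calls["both"]))
```

Keeping the tool-specific sets around makes it easy to check whether the outliers sit mostly among the genes called by only one program.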

      I'm still working on making metrics that make sense, so 'decent' and 'cleaner' are the best I can offer for now. I imagine I will develop permissive and restrictive "true positive" lists at each ratio and then generate ROCs for each algorithm I can successfully test.
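An ROC of this kind could be sketched as follows. The assumptions are mine: every tested gene gets a score where higher means a more confident call (for example 1 minus a q-value), the truth list is a set of gene identifiers, and tied scores are not handled specially. The function names and the data are hypothetical.

```python
def roc_points(scores, truth):
    """Return (false positive rate, true positive rate) points for a ranked gene list.

    scores: dict mapping gene -> score, higher = more confident DE call
    truth:  set of genes regarded as true positives
    """
    positives = sum(1 for gene in scores if gene in truth)
    negatives = len(scores) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one true positive and one true negative")

    points = [(0.0, 0.0)]
    tp = fp = 0
    # Walk down the ranking, lowering the threshold one gene at a time.
    for gene in sorted(scores, key=scores.get, reverse=True):
        if gene in truth:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores and truth set, for illustration only.
scores = {"g1": 0.9, "g2": 0.8, "g3": 0.4, "g4": 0.1}
restrictive_truth = {"g1", "g3"}

curve = roc_points(scores, restrictive_truth)
print(curve)
print(area_under_curve(curve))
```

Running the same scores against a permissive and a restrictive truth set gives two curves per algorithm, which is one way to see how sensitive the comparison is to the definition of a true positive.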

      I'm currently worried about algorithms making calls for downregulated genes or calling them as differentially expressed in cases where the assumption that "A>>B+C or A<<B+C" doesn't hold. I don't know how to handle that yet, and it may be the source of the outliers I mentioned before.

      Overall, I am actually impressed with cuffdiff's performance, given how much grief it gets here. Neither algorithm is even remotely perfect, and neither is obviously superior.
