Inconsistency in Cuffdiff results

  • Inconsistency in Cuffdiff results

    Hi all,

    I use cuffdiff to compare my RNA-Seq samples, and the results I get are inconsistent.

    For example, I have three samples: S1, S2, and S3. I first ran cuffdiff on one pair, S1 vs. S2. Then I ran cuffdiff on all three samples. Since cuffdiff does pairwise comparisons, it reports all pairs: S1 vs. S2, S1 vs. S3, and S2 vs. S3.

    The results I got for S1 vs. S2 from these two runs are different. I assumed they should be the same. Did I do something wrong, or does cuffdiff take more factors into account when there are more samples?

    Thanks,
    Xiaoyu

  • #2
    I have noticed this as well. I also noticed that in the new version of Cufflinks (2.0.2), cuffdiff produces a file with the individual RPKM values for replicates. I have 14 disease samples and 6 controls, so when I run cuffdiff I have two conditions with replicates (14 and 6, respectively). If I run the analysis as disease vs. control, I get different individual RPKM values than if I split the disease samples into "drug responder" and "drug non-responder" and re-run cuffdiff with three conditions (responder, non-responder, control). I would expect the individual RPKM values to be the same irrespective of the number of conditions.

    Or am I misunderstanding something?

    Thanks
    Helen

    • #3
      Replying to hlwright:
      I guess my problem is not exactly the same as yours, but it is similar. In your case, after you split the disease samples, each condition has a different number of replicates than in your first run. In my case, I have exactly the same samples and just add one more sample for the second run.

      • #4
        Xiaoyu

        Yes, I have a different number of replicates per condition when I run the analysis the second time, so I might expect the gene RPKM values in the genes.fpkm_tracking file (one RPKM per condition per gene) to be different. However, should I not expect the individual RPKM values for each sample (in the genes.read_group_tracking file) to be the same no matter how the analysis was run?

        Helen

        • #5
          Honestly, I don't know the answer. But if you are checking the cuffdiff results, I guess they might differ: your replicate groupings are different, so cuffdiff will normalize differently.

          • #6
            Can you share the command you used to run cuffdiff?

            • #7
              This is the command I used. Thanks

              Code:
              cuffdiff -p 8 -o dfout -L S1,S2,S3 merged.gtf ./S1/accepted_hits.bam ./S2_R1/accepted_hits.bam,./S2_R2/accepted_hits.bam ./S3_R1/accepted_hits.bam,./S3_R2_accepted_hits.bam
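
To quantify how the S1 vs. S2 results differ between a pair-only run and a three-sample run like the one above, a small script can compare the two `gene_exp.diff` files. This is a minimal sketch, assuming the usual tab-separated layout with `test_id`, `sample_1`, `sample_2`, `p_value`, and `significant` columns, and assuming the sample order follows the `-L` labels. The function names `load_pair` and `changed_calls` are made up for illustration.

```python
import csv


def load_pair(diff_path, sample_1, sample_2):
    """Read one pairwise comparison from a gene_exp.diff-style TSV file.

    Returns {test_id: (p_value, significant)} for rows matching the
    requested sample pair. Column names are assumed, not verified
    against any particular cuffdiff version.
    """
    rows = {}
    with open(diff_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            if row["sample_1"] == sample_1 and row["sample_2"] == sample_2:
                rows[row["test_id"]] = (float(row["p_value"]), row["significant"])
    return rows


def changed_calls(pair_run, full_run):
    """Return sorted test_ids whose 'significant' call differs between runs."""
    shared = pair_run.keys() & full_run.keys()
    return sorted(t for t in shared if pair_run[t][1] != full_run[t][1])
```

Usage would look like `changed_calls(load_pair("dfout_pair/gene_exp.diff", "S1", "S2"), load_pair("dfout/gene_exp.diff", "S1", "S2"))`, where `dfout_pair` is a hypothetical output directory for the pair-only run.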

              • #8
                Replying to potato84:
                Keep in mind that in the absence of replicates, cuffdiff uses the pooled conditions to derive its dispersion estimate. So your dispersion estimates may be very different when you run with only a pair of samples versus all three. That will inherently affect your estimates of significance when computing differences between pairs of samples, so you should not expect to get the same significance from those two analyses.
                Michael Black, Ph.D.
                ScitoVation LLC. RTP, N.C.
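
A toy illustration of the pooling effect described above, not cuffdiff's actual model: the sketch assumes a simple method-of-moments negative binomial dispersion (variance = mean + alpha * mean^2), estimated by pooling one normalized count per condition for a single gene. The counts and the names `pooled_dispersion` and `implied_variance` are invented for illustration.

```python
from statistics import mean, variance


def pooled_dispersion(counts):
    """Method-of-moments dispersion alpha from var = mu + alpha * mu**2.

    Conditions are treated as if they were replicates of each other,
    which is the assumption being illustrated. Negative estimates are
    clipped to zero.
    """
    values = [float(c) for c in counts]
    mu = mean(values)
    var = variance(values)  # sample variance (n - 1 denominator)
    return max((var - mu) / mu ** 2, 0.0)


def implied_variance(mu, alpha):
    """Variance the model assigns to a count with mean mu."""
    return mu + alpha * mu ** 2


# Hypothetical normalized counts for one gene in S1, S2, S3.
s1, s2, s3 = 100, 110, 400

pair = pooled_dispersion([s1, s2])      # pool only S1 and S2
trio = pooled_dispersion([s1, s2, s3])  # pool all three conditions

print("dispersion, pair run:", pair)
print("dispersion, three-sample run:", round(trio, 3))
print("variance assumed for S1, pair run:", implied_variance(s1, pair))
print("variance assumed for S1, three-sample run:", round(implied_variance(s1, trio), 1))
```

With these made-up numbers, pooling only S1 and S2 gives a dispersion of zero, while adding S3 raises it to roughly 0.7, so the same S1 vs. S2 difference is tested against a much larger assumed variance in the three-sample run.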

                • #9
                  Thank you for answering my questions, mbblack.

                  Should I then expect to get the same result for S2 vs. S3? I have replicates for both of those samples.
