  • interpreting cuffdiff output with and without replicates

    Hi SEQers

    I am calling on your expertise!

    I have an RNA Seq data set which is a growth time series.

    I have used Tophat->Cufflinks->Cuffdiff on these data.

    To look at differential gene expression between the 6 time points, I have used the cuffdiff output file gene_exp.diff.

    I have run cuffdiff three different ways
    1. with the -T (time series) option
    2. without the -T option
    3. just using two time points - comparing the first time point to each successive time point.

    Each of these runs produces a different number of significantly different genes between any two time points. As one might expect, the overall number of genes with significant differential expression is much lower in cases 1 and 2 than in case 3.

    I would like to understand the reason for this.

    Looking at the gene_exp.diff file output by cuffdiff, particularly column 10, the "test stat": this number differs between a .diff file that came from a comparison of two files (case 3) and one that came from all 6 time points (cases 1 and 2).
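One way to pin down which genes change between runs is to line up the test statistics from two gene_exp.diff files. The sketch below is a minimal illustration, not part of the original post: it assumes tab-delimited files with header columns named `test_id`, `sample_1`, `sample_2`, `status`, `test_stat`, and `significant` (column names and order vary between Cufflinks versions, so check your own header), and the helper names and toy rows are hypothetical.

```python
import csv
import io

# Assumed column names; real gene_exp.diff headers depend on the Cufflinks version.
COLUMNS = ["test_id", "sample_1", "sample_2", "status", "test_stat", "significant"]


def make_diff_text(rows):
    """Build tab-delimited text with a header line (used only for the toy example)."""
    lines = ["\t".join(COLUMNS)]
    lines += ["\t".join(str(field) for field in row) for row in rows]
    return "\n".join(lines) + "\n"


def load_test_stats(diff_text):
    """Map (test_id, sample_1, sample_2) -> (test_stat, is_significant) for rows with status OK."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    stats = {}
    for row in reader:
        if row["status"] != "OK":
            continue  # skip rows where no test was performed
        key = (row["test_id"], row["sample_1"], row["sample_2"])
        stats[key] = (float(row["test_stat"]), row["significant"] == "yes")
    return stats


def compare_runs(run_a, run_b):
    """Return (key, stat_a, stat_b) for comparisons present in both runs whose test_stat differs."""
    shared = sorted(set(run_a) & set(run_b))
    return [(k, run_a[k][0], run_b[k][0]) for k in shared if run_a[k][0] != run_b[k][0]]


# Toy rows (made-up values, for illustration only).
ALL_SIX = make_diff_text([
    ("geneA", "q1", "q2", "OK", 1.2, "no"),
    ("geneB", "q1", "q2", "OK", 3.5, "yes"),
])
PAIRWISE = make_diff_text([
    ("geneA", "q1", "q2", "OK", 2.9, "yes"),
    ("geneB", "q1", "q2", "OK", 3.5, "yes"),
])

DIFFERENCES = compare_runs(load_test_stats(ALL_SIX), load_test_stats(PAIRWISE))
for key, stat_all, stat_pair in DIFFERENCES:
    print(key, stat_all, stat_pair)
```

With real files, read each one with `open(path).read()` and pass the text to `load_test_stats`; the list that comes back shows exactly which gene and time-point pairs received a different test statistic in the two runs.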

    Reading the Cufflinks manual and associated information online, it looks like "test stat" is calculated based on the variance of the FPKMs of each replicate.

    I am assuming that these differences arise because when I use these 6 files (.bams) in cuffdiff with or without the -T option (case 1 or 2), cuffdiff treats these 6 files as though they are replicates. In case 3, where I have only used two bam files for cuffdiff, for some genes there are two replicates and for others there are no replicates.

    My question is how would you interpret the data from case 3? Should I disregard case 3 and only consider data from cases 1 and 2?

    Thanks for your insights,

    Cynthia


    From http://cufflinks.cbcb.umd.edu/howitworks#hdif:

    "Note that in order to calculate the test statistic T, we need to know the variance of the expression level in each condition. The variance needs to include the variability in the number of fragments generated by the transcript across replicates, and should also incorporate any uncertainty in the expression estimate itself."
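The quoted passage says T depends on the variance in each condition, so the same pair of expression values can yield a different "test stat" whenever the variance estimates change. The sketch below illustrates that dependence using a delta-method approximation for the variance of a log ratio. The formula and the function name are assumptions made for illustration, not cuffdiff's actual implementation, and the numbers are made up.

```python
import math


def log_ratio_test_stat(mean_a, var_a, mean_b, var_b):
    """Test statistic for the log expression ratio of condition b over condition a.

    Uses the delta-method approximation
        Var[log(X_b / X_a)] ~ Var[X_a] / E[X_a]^2 + Var[X_b] / E[X_b]^2
    (an assumed form, chosen for illustration).
    """
    log_ratio = math.log(mean_b / mean_a)
    var_log_ratio = var_a / mean_a ** 2 + var_b / mean_b ** 2
    return log_ratio / math.sqrt(var_log_ratio)


# Same expression estimates (10 vs 20), two different sets of variance estimates.
T_LOW_VARIANCE = log_ratio_test_stat(10.0, 4.0, 20.0, 16.0)
T_HIGH_VARIANCE = log_ratio_test_stat(10.0, 25.0, 20.0, 100.0)

print(T_LOW_VARIANCE, T_HIGH_VARIANCE)
```

The fold change is identical in both calls; only the variance inputs differ, and the statistic shrinks as they grow. If cuffdiff estimates variance differently when given six samples rather than two, that alone could shift "test stat" for the same pair of time points.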

  • #2
    Whoops, where I said .bam I meant .sam!



    • #3
      Actually, I think this assumption of mine is not correct:

      I am assuming that these differences arise because when I use these 6 files (.sam) in cuffdiff with or without the -T option (case 1 or 2), cuffdiff treats these 6 files as though they are replicates. In case 3, where I have only used two sam files for cuffdiff, for some genes there are two replicates and for others there are no replicates.

      These are not considered replicates - so why am I getting different "test stat" numbers in the gene_exp.diff files?



      • #4
        I use TopHat output files (.sam) in cuffdiff, always without the -T option.
