Hi SEQers
I am calling on your expertise!
I have an RNA-Seq data set that is a growth time series.
I have used Tophat->Cufflinks->Cuffdiff on these data.
To look at differential gene expression between 6 time points, I have used the cuffdiff output file gene_exp.diff.
I have run cuffdiff three different ways:
1. with the -T (time series) option
2. without the -T option
3. just using two time points - comparing the first time point to each successive time point.
Each of these runs produces a different number of significantly different genes between any two time points. As one might expect, the overall number of genes with significant differential expression is much lower in cases 1 and 2 than in case 3.
I would like to understand the reason for this.
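For anyone who wants to reproduce the comparison, here is a rough sketch of how the per-pair counts can be tallied from a gene_exp.diff file. It assumes the file is tab-delimited with header columns named sample_1, sample_2 and significant (with "yes"/"no" values); check the header of your own file, since column names and order may differ between Cufflinks versions. The function name and the small in-memory example are made up for illustration and are not real data.

```python
import csv
import io
from collections import Counter


def count_significant(handle):
    """Count rows flagged significant for each (sample_1, sample_2) pair.

    Assumes a tab-delimited file with 'sample_1', 'sample_2' and
    'significant' header columns (an assumption about the file layout).
    """
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["significant"] == "yes":
            counts[(row["sample_1"], row["sample_2"])] += 1
    return counts


# Tiny made-up table standing in for a real gene_exp.diff file.
example = (
    "test_id\tsample_1\tsample_2\tstatus\ttest_stat\tsignificant\n"
    "geneA\tq1\tq2\tOK\t2.9\tyes\n"
    "geneB\tq1\tq2\tOK\t0.4\tno\n"
    "geneA\tq1\tq3\tOK\t3.5\tyes\n"
    "geneB\tq1\tq3\tOK\t2.2\tyes\n"
)

counts = count_significant(io.StringIO(example))
for pair, n in sorted(counts.items()):
    print(pair, n)
```

With a real file, open it with `open("gene_exp.diff", newline="")` and pass the handle to `count_significant`; running this on the output of each of the three cuffdiff runs gives the counts side by side.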
Looking at the gene_exp.diff output from cuffdiff, particularly column 10 (the "test stat"), the value for a given gene differs between a .diff file that came from a comparison of two files (case 3) and one that came from all 6 time points (cases 1 and 2).
Reading the Cufflinks manual and associated information online, it looks like the "test stat" is calculated from the variance of the FPKMs across replicates.
I am assuming these differences arise because, when I give all 6 files (.bams) to cuffdiff with or without the -T option (case 1 or 2), cuffdiff treats these 6 files as though they are replicates. In case 3, where I have only used two bam files for cuffdiff, for some genes there are two replicates and for others there are no replicates.
My question is how would you interpret the data from case 3? Should I disregard case 3 and only consider data from cases 1 and 2?
Thanks for your insights,
Cynthia
from http://cufflinks.cbcb.umd.edu/howitworks#hdif
"Note that in order to calculate the test statistic T, we need to know the variance of the expression level in each condition. The variance needs to include the variability in the number of fragments generated by the transcript across replicates, and should also incorporate any uncertainty in the expression estimate itself."
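To make the quoted note concrete, here is a small sketch of why the variance estimate matters so much. It assumes the statistic is a log expression ratio divided by its standard deviation, with the variance of the log ratio approximated by the delta method; that is my reading of the description, not necessarily exactly what cuffdiff computes. The function name and the numbers are hypothetical.

```python
import math


def log_ratio_test_stat(mean_a, var_a, mean_b, var_b):
    """Log-ratio test statistic under a delta-method variance approximation.

    Assumption (not confirmed against the cuffdiff source):
        T = log(mean_b / mean_a) / sqrt(Var[log ratio])
        Var[log ratio] ~= var_a / mean_a**2 + var_b / mean_b**2
    """
    log_ratio = math.log(mean_b / mean_a)
    var_log_ratio = var_a / mean_a ** 2 + var_b / mean_b ** 2
    return log_ratio / math.sqrt(var_log_ratio)


# Same two-fold change in expression, two different variance estimates.
low_var = log_ratio_test_stat(10.0, 1.0, 20.0, 4.0)
high_var = log_ratio_test_stat(10.0, 25.0, 20.0, 100.0)

print(round(low_var, 2), round(high_var, 2))
```

Under these assumptions the same fold change yields a much smaller statistic when the estimated variance is larger, so any change in how the variance is estimated between runs would be expected to change both the "test stat" and the number of genes called significant.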