Hi, I have three biological samples (let's call them Sample_A, Sample_B and Sample_C) that I want to find DEGs between. First, I'm performing pairwise comparisons between samples with cuffdiff, using alignments generated by tophat.
The problem is that when I compare the FPKM values of genes in Sample_A (in the gene_exp.diff file), the values differ depending on whether Sample_A was compared against Sample_B or Sample_C.
How does this difference arise? I thought the FPKM value reported by cuffdiff would be static and would come directly from the mapping data in the bam/sam file used as input.
Here are just the header and the first gene from each gene_exp.diff file.
Sample_A vs Sample_B
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
ACYPI000001 ACYPI000001 acyp2eg0000191 chr1:2365250-2374181 q1 q2 OK 21.6621 25.2228 0.219556 0.389799 0.71015 0.991694 no
Sample_A vs Sample_C
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
ACYPI000001 ACYPI000001 acyp2eg0000191 chr1:2365250-2374181 q1 q2 OK 26.0189 587.227 4.49629 4.16814 0.0011 0.422675 no
So as you can see, when compared against Sample_B, the FPKM value of ACYPI000001 in Sample_A is 21.6621, and in the second comparison it's 26.0189.
Other genes in these datasets show even more pronounced differences, and I just don't understand why they arise.
Can someone please shed some light on what may be underlying this?
Thanks
EDIT: I'm using cufflinks version 2.1.1
cmd# cuffdiff -p 5 -o sample_A_vs_sample_B annotations.gff sample_A_accepted_hits.bam sample_B_accepted_hits.bam
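A minimal sketch of how the size of the discrepancy can be checked gene by gene, assuming gene_exp.diff is tab-delimited with the header shown above and that Sample_A is sample_1 (value_1) in both runs. The function names are hypothetical, and the two rows are the ones quoted in this post; for real data the text of each gene_exp.diff file would be passed in instead.

```python
import csv
import io

# Header and rows copied from the two gene_exp.diff excerpts in this post.
COLUMNS = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2", "status",
           "value_1", "value_2", "log2(fold_change)", "test_stat", "p_value",
           "q_value", "significant"]
ROW_A_VS_B = ["ACYPI000001", "ACYPI000001", "acyp2eg0000191", "chr1:2365250-2374181",
              "q1", "q2", "OK", "21.6621", "25.2228", "0.219556", "0.389799",
              "0.71015", "0.991694", "no"]
ROW_A_VS_C = ["ACYPI000001", "ACYPI000001", "acyp2eg0000191", "chr1:2365250-2374181",
              "q1", "q2", "OK", "26.0189", "587.227", "4.49629", "4.16814",
              "0.0011", "0.422675", "no"]


def as_diff_text(rows):
    """Build tab-delimited text in the gene_exp.diff layout from a list of rows."""
    return "\n".join("\t".join(fields) for fields in [COLUMNS] + rows) + "\n"


def read_value_1(diff_text):
    """Map gene_id -> value_1 (the FPKM reported for sample_1)."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    return {row["gene_id"]: float(row["value_1"]) for row in reader}


def compare_sample_1(diff_text_x, diff_text_y):
    """Return {gene_id: (fpkm_x, fpkm_y, fpkm_y / fpkm_x)} for genes in both runs."""
    x = read_value_1(diff_text_x)
    y = read_value_1(diff_text_y)
    result = {}
    for gene_id in sorted(x.keys() & y.keys()):
        # A zero FPKM in the first run has no meaningful ratio.
        ratio = y[gene_id] / x[gene_id] if x[gene_id] > 0 else float("nan")
        result[gene_id] = (x[gene_id], y[gene_id], ratio)
    return result


a_vs_b = as_diff_text([ROW_A_VS_B])
a_vs_c = as_diff_text([ROW_A_VS_C])
for gene_id, (fpkm_ab, fpkm_ac, ratio) in compare_sample_1(a_vs_b, a_vs_c).items():
    print(f"{gene_id}\t{fpkm_ab}\t{fpkm_ac}\tratio={ratio:.3f}")
```

For ACYPI000001 the Sample_A value in the second run is roughly 1.2 times the value in the first run. Running the same comparison over every gene would show whether that ratio is about constant across genes or varies from gene to gene.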