I ran into a strange problem while using cuffdiff 2 to analyze my RNA-seq data.
I used tophat 1.4.1 to map my RNA-seq reads to the pig genome (Sscrofa10.2), then ran the DE analysis with cuffdiff (both v1.3 and v2.0.2).
The command I used was as follows (the same for both versions of cuffdiff):
cuffdiff --no-update-check -v -o diff_out_mask_upperN -M rRNA.gtf --library-type fr-secondstrand -N -u -b Sus_scrofa.Sscrofa10.2.68.dna.toplevel.fa -p 16 -L ConditionA,ConditionB Sus_scrofa.Sscrofa10.2.68.gtf ConditionA.bam ConditionB.bam
Strangely, the output from cuffdiff v2.0.2 showed an expression of zero for all transcripts (only the first few lines of gene_exp.diff are shown):
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
ENSSSCG00000000001 ENSSSCG00000000001 - chr5:588810-596477 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000002 ENSSSCG00000000002 GTSE1 chr5:544685-564101 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000003 ENSSSCG00000000003 TTC38 chr5:520893-543410 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000004 ENSSSCG00000000004 PKDREJ chr5:509043-516263 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000005 ENSSSCG00000000005 C22ORF40 chr5:496539-504125 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000006 ENSSSCG00000000006 PPARA chr5:424928-488440 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000007 ENSSSCG00000000007 TRMU chr5:569580-583238 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000010 ENSSSCG00000000010 FBLN1 chr5:1071719-1158793 Infa Remo NOTEST 0 0 0 0 1 1 no
ENSSSCG00000000011 ENSSSCG00000000011 - chr5:1324339-1373617 Infa Remo NOTEST 0 0 0 0 1 1 no
But the output from cuffdiff v1.3 showed that most transcripts were expressed:
ENSSSCG00000000001 ENSSSCG00000000001 - chr5:588810-596477 Infa Remo NOTEST 0.515866 0.815506 0.660699 -0.560886 0.574875 1 no
ENSSSCG00000000002 ENSSSCG00000000002 GTSE1 chr5:544685-564101 Infa Remo NOTEST 2.95675 3.76521 0.348715 -0.75018 0.453146 1 no
ENSSSCG00000000003 ENSSSCG00000000003 TTC38 chr5:520893-543410 Infa Remo NOTEST 2.32987 2.75627 0.242466 -0.429257 0.667736 1 no
ENSSSCG00000000004 ENSSSCG00000000004 PKDREJ chr5:509043-516263 Infa Remo NOTEST 0.422011 0.45496 0.108459 -0.150356 0.880484 1 no
ENSSSCG00000000005 ENSSSCG00000000005 C22ORF40 chr5:496539-504125 Infa Remo OK 3.54952 4.91868 0.470645 -0.744261 0.456718 0.882984 no
ENSSSCG00000000006 ENSSSCG00000000006 PPARA chr5:424928-488440 Infa Remo OK 11.487 20.1291 0.809283 -2.13796 0.03252 0.415541 no
ENSSSCG00000000007 ENSSSCG00000000007 TRMU chr5:569580-583238 Infa Remo NOTEST 2.27422 1.42395 -0.675472 1.22796 0.219461 1 no
ENSSSCG00000000010 ENSSSCG00000000010 FBLN1 chr5:1071719-1158793 Infa Remo OK 41.1482 36.0558 -0.190598 0.593366 0.552936 0.906035 no
ENSSSCG00000000011 ENSSSCG00000000011 - chr5:1324339-1373617 Infa Remo NOTEST 0 0 0 0 1 1 no
When I removed the -b/--frag-bias-correct option from the cuffdiff v2.0.2 run,
cuffdiff --no-update-check -v -o merged_anno_bias_correct_no_scaffold_mask_upperN -M rRNA.gtf --library-type fr-secondstrand -N -u -p 16 -L ConditionA,ConditionB Sus_scrofa.Sscrofa10.2.68.gtf ConditionA.bam ConditionB.bam
the results looked much more normal:
ENSSSCG00000000001 ENSSSCG00000000001 - chr5:588810-596477 Infa Remo NOTEST 0.0163158 0.046151 1.50009 -1.16742 0.24304 1 no
ENSSSCG00000000002 ENSSSCG00000000002 GTSE1 chr5:544685-564101 Infa Remo OK 0.0846721 0.181485 1.09989 -1.68917 0.0911874 0.474875 no
ENSSSCG00000000003 ENSSSCG00000000003 TTC38 chr5:520893-543410 Infa Remo OK 0.0609228 0.119559 0.972666 -1.21118 0.225826 0.599345 no
ENSSSCG00000000004 ENSSSCG00000000004 PKDREJ chr5:509043-516263 Infa Remo NOTEST 0.0112047 0.0208992 0.89934 -0.936853 0.348834 1 no
ENSSSCG00000000005 ENSSSCG00000000005 C22ORF40 chr5:496539-504125 Infa Remo OK 0.131638 0.293968 1.15908 -1.36223 0.173124 0.554138 no
ENSSSCG00000000006 ENSSSCG00000000006 PPARA chr5:424928-488440 Infa Remo OK 0.365693 1.11547 1.60895 -3.03111 0.00243654 0.298428 no
ENSSSCG00000000007 ENSSSCG00000000007 TRMU chr5:569580-583238 Infa Remo NOTEST 0.0657846 0.0688074 0.0648129 -0.0842925 0.932824 1 no
ENSSSCG00000000010 ENSSSCG00000000010 FBLN1 chr5:1071719-1158793 Infa Remo OK 1.23995 1.8773 0.598383 -1.33535 0.181762 0.562518 no
ENSSSCG00000000011 ENSSSCG00000000011 - chr5:1324339-1373617 Infa Remo NOTEST 0 0 0 0 1 1 no
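The excerpts above only cover the first few rows of each file. A rough way to check whether the zeros hold across a whole gene_exp.diff is sketched below. It assumes the file is tab-delimited with the header shown earlier; the function name count_zero_rows and the two inline sample rows (copied from the excerpts above) are only for illustration.

```python
import csv
import io


def count_zero_rows(handle):
    """Return (total_rows, rows where value_1 and value_2 are both zero).

    Assumes a tab-delimited table with 'value_1' and 'value_2' columns,
    as in the gene_exp.diff header shown above.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    total = 0
    zero = 0
    for row in reader:
        total += 1
        if float(row["value_1"]) == 0.0 and float(row["value_2"]) == 0.0:
            zero += 1
    return total, zero


# Small inline demo: one all-zero row and one expressed row.
demo = (
    "test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\t"
    "value_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n"
    "ENSSSCG00000000001\tENSSSCG00000000001\t-\tchr5:588810-596477\tInfa\tRemo\t"
    "NOTEST\t0\t0\t0\t0\t1\t1\tno\n"
    "ENSSSCG00000000006\tENSSSCG00000000006\tPPARA\tchr5:424928-488440\tInfa\tRemo\t"
    "OK\t11.487\t20.1291\t0.809283\t-2.13796\t0.03252\t0.415541\tno\n"
)
total, zero = count_zero_rows(io.StringIO(demo))
print(f"{zero} of {total} rows have value_1 == value_2 == 0")
```

For a real run, the same function could be called on an opened file, e.g. `open("diff_out_mask_upperN/gene_exp.diff", newline="")`, using the output directory from the first command.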
What I don't understand is why cuffdiff v2.0.2 reports zero expression for all transcripts when it is given the same genome sequence that I used with cuffdiff v1.3.
Any comments would be appreciated. Thanks.