Hi there,
I have a question about Cufflinks that probably needs to be answered by the authors of the software, but I thought I would post it on Seqanswers, as other people may have insights or find the answers useful.
I have processed Illumina 36-base single-read data for a diseased and a healthy sample from the same human tissue with TopHat and Cufflinks. I am now examining the output from the Cufflinks 'cuffdiff' tool, and I see 2 genes identified as having significant differential promoter use between the diseased and normal samples (in the 0_1_promoter.diff file).
My questions are:
1. These are the only 2 genes flagged as 'OK' for test status. However, other genes in the list (an excerpt is pasted below) have more significant p-values but are not flagged as 'OK', and are therefore not called significant. As value_1 and value_2 are 0 for every row in this output file, it is hard to see where those significance scores come from. Could someone tell me why those seemingly significant genes aren't flagged as OK?
2. How does Cuffdiff identify an alternative promoter from the data, as distinct from an alternative TSS within the same promoter? That is, you can have more than one TSS in a single promoter, so how does Cufflinks identify the existence of a separate promoter (is it based purely on the distance between the TSSs)?
3. I checked the 2 genes showing as significant for alternative promoter usage (ARAF, ATP6V1B2) on Ensembl. ARAF is identified as having more than one transcript, and you can see that the isoforms have different first exons. ATP6V1B2 is identified as having only a single transcript on Ensembl, so how do I confirm whether this is a novel promoter identified by TopHat or an erroneous result?
test_id gene locus status value_1 value_2 sqrt(JS) test_stat p_value significant
XLOC_272790-[chrX:47428949-47432477] ARAF chrX:47428949-47432477 OK 0 0 0.57 0.57 2.80E-006 yes
XLOC_032928-[chr8:20069595-20079203] ATP6V1B2 chr8:20069595-20079203 OK 0 0 0.3 0.3 1.31E-005 yes
XLOC_294585-[chr16:30368420-30371172] TBC1D10B chr16:30368420-30371172 NOTEST 0 0 0.28 0.28 5.79E-006 no
XLOC_294440-[chr16:1573517-1574900] IFT140 chr16:1573517-1574900 NOTEST 0 0 0.08 0.08 0.3 no
XLOC_294559-[chr16:27500927-27509149] GTF3C1 chr16:27500927-27509149 NOTEST 0 0 0.18 0.18 0.03 no
XLOC_000690-[chr11:47188015-47190085] ARFGAP2 chr11:47188015-47190085 NOTEST 0 0 0.06 0.06 0.25 no
XLOC_294278-[chr16:67165182-67166842] C16orf70 chr16:67165182-67166842 NOTEST 0 0 0.53 0.53 0.01 no
XLOC_164029-[chr12:39724560-39726544] KIF21A chr12:39724560-39726544 NOTEST 0 0 0.2 0.2 0.15 no
XLOC_294424-[chr16:735253-736528] WDR24 chr16:735253-736528 NOTEST 0 0 0.21 0.21 0.02 no
XLOC_294415-[chr16:322128-325610] RGS11 chr16:322128-325610 NOTEST 0 0 0.09 0.09 0 no
XLOC_338010-[chr19:39421595-39423782] MRPS12 chr19:39421595-39423782 NOTEST 0 0 0.02 0.02 0.47 no
XLOC_294757-[chr16:85008028-85012881] ZDHHC7 chr16:85008028-85012881 NOTEST 0 0 0.3 0.3 0.05 no
XLOC_317374-[chr9:35841478-35846385] C9orf127 chr9:35841478-35846385 NOTEST 0 0 0.22 0.22 0.15 no
XLOC_317330-[chr9:32430425-32434696] ACO1 chr9:32430425-32434696 NOTEST 0 0 0.03 0.03 0.44 no
XLOC_187340-[chr22:21088597-21097180] PI4KA chr22:21088597-21097180 NOTEST 0 0 0.2 0.2 7.39E-005 no
XLOC_294696-[chr16:69404385-69419651] TERF2 chr16:69404385-69419651 NOTEST 0 0 0.19 0.19 0.07 no
XLOC_364256-[chr5:110459528-110462543] WDR36 chr5:110459528-110462543 NOTEST 0 0 0.64 0.64 2.12E-005 no
XLOC_294705-[chr16:70543831-70557439] COG4 chr16:70543831-70557439 NOTEST 0 0 0.19 0.19 0.15 no
XLOC_294703-[chr16:70292875-70302294] AARS chr16:70292875-70302294 NOTEST 0 0 0.48 0.48 0 no
XLOC_283427-[chr13:36900711-36909962] SPG20 chr13:36900711-36909962 NOTEST 0 0 0.06 0.06 0.16 no
XLOC_137558-[chr17:61877643-61886361] DDX42 chr17:61877643-61886361 NOTEST 0 0 0.19 0.19 2.51E-008 no
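In case it helps anyone reproduce the check behind question 1, here is a minimal sketch of how I pulled out the rows that have a low p-value but were not tested. It assumes the file is whitespace-delimited with the ten columns shown above (with the second value column named value_2); the function names `parse_diff` and `untested_low_p` and the 0.05 cutoff are my own choices for illustration, not anything from Cufflinks itself.

```python
# A few rows copied from the excerpt above, used as sample input.
EXCERPT = """\
test_id gene locus status value_1 value_2 sqrt(JS) test_stat p_value significant
XLOC_272790-[chrX:47428949-47432477] ARAF chrX:47428949-47432477 OK 0 0 0.57 0.57 2.80E-006 yes
XLOC_294585-[chr16:30368420-30371172] TBC1D10B chr16:30368420-30371172 NOTEST 0 0 0.28 0.28 5.79E-006 no
XLOC_294440-[chr16:1573517-1574900] IFT140 chr16:1573517-1574900 NOTEST 0 0 0.08 0.08 0.3 no
XLOC_137558-[chr17:61877643-61886361] DDX42 chr17:61877643-61886361 NOTEST 0 0 0.19 0.19 2.51E-008 no
""".splitlines()


def parse_diff(lines):
    """Turn whitespace-delimited lines (header first) into a list of dicts."""
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(header, fields)))
    return rows


def untested_low_p(rows, alpha=0.05):
    """Return (gene, p_value) for rows whose status is not OK but p_value < alpha,
    sorted from smallest to largest p-value."""
    hits = [
        (row["gene"], float(row["p_value"]))
        for row in rows
        if row["status"] != "OK" and float(row["p_value"]) < alpha
    ]
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for gene, p_value in untested_low_p(parse_diff(EXCERPT)):
        print(gene, p_value)
```

To run it on the real file, the lines can be read with `open("0_1_promoter.diff").read().splitlines()` and passed to `parse_diff` in place of `EXCERPT`.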
Cheers, thanks for your help.