I have the same problem that others have posted about before: Cufflinks works well without a reference GTF file, but with a reference it reports FPKM values of 0. This does not happen for all genes, only for some of them. I am really confused.
Here are my input files:
htt.sam (output from TopHat; for the purpose of this example it contains only the reads located in the HTT gene region, >3000 reads. See the screenshot of these aligned reads in UCSC)
htt.gtf (GTF for the HTT gene, extracted from the Ensembl GTF file. I think the format is the same as the GTF format linked from the Cufflinks website)
Here is my code (my Cufflinks version: 0.9.3.Linux_x86_64):
Code:
cufflinks -G htt.gtf htt.sam
You can test it easily. Here is the output on screen:
Code:
$ /home/dongx/bin/cufflinks-0.9.3.Linux_x86_64/cufflinks -G htt.gtf htt.sam
/home/dongx/bin/cufflinks-0.9.3.Linux_x86_64/cufflinks: /usr/lib64/libz.so.1: no version information available (required by /home/dongx/bin/cufflinks-0.9.3.Linux_x86_64/cufflinks)
[bam_header_read] EOF marker is absent.
File htt.sam doesn't appear to be a valid BAM file, trying SAM...
[17:33:52] Inspecting reads and determining fragment length distribution.
> Processed 1 loci. [*************************] 100%
> Map Properties:
> Total Map Mass: 3542.44
> Read Type: 40bp single-end
> Fragment Length Distribution: Gaussian (default)
> Estimated Mean: 204.01
> Estimated Std Dev: 74.81
[17:33:52] Estimating transcript abundances.
> Processed 1 loci. [*************************] 100%
Here is the content of the output genes.expr file:
Code:
$ cat genes.expr
gene_id bundle_id chr left right FPKM FPKM_conf_lo FPKM_conf_hi status
ENSG00000197386 3 chr4 3076406 3245676 0 0 0 OK
Here is the content of the output transcripts.expr file:
Code:
[dongx@hpcc01 ~]$ cat transcripts.expr
trans_id bundle_id chr left right FPKM FMI frac FPKM_conf_lo FPKM_conf_hi coverage length effective_length status
ENST00000355072 3 chr4 3076406 3245676 0 0 0 0 0 0 13531 13531 OK
ENST00000506137 3 chr4 3117054 3123439 0 0 0 0 0 0 784 784 OK
ENST00000512909 3 chr4 3123125 3125193 0 0 0 0 0 0 613 613 OK
ENST00000510626 3 chr4 3130064 3245667 0 0 0 0 0 0 14491 14491 OK
ENST00000509618 3 chr4 3162087 3174954 0 0 0 0 0 0 432 432 OK
ENST00000513639 3 chr4 3180072 3182400 0 0 0 0 0 0 261 261 OK
ENST00000513326 3 chr4 3180072 3182514 0 0 0 0 0 0 375 375 OK
ENST00000509043 3 chr4 3180072 3182521 0 0 0 0 0 0 382 382 OK
ENST00000502820 3 chr4 3204731 3208306 0 0 0 0 0 0 290 290 OK
ENST00000509751 3 chr4 3213637 3214782 0 0 0 0 0 0 725 725 OK
ENST00000512068 3 chr4 3230370 3231649 0 0 0 0 0 0 403 403 OK
ENST00000513806 3 chr4 3230438 3237123 0 0 0 0 0 0 436 436 OK
ENST00000508321 3 chr4 3237149 3237906 0 0 0 0 0 0 388 388 OK
It seems that Cufflinks has read the GTF file correctly, and the reads are obviously mapped to the gene. But why is the FPKM 0?
Does anyone have a clue about this? You can test the above code on your machine. Please let me know if you get different results. THANKS!!
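For anyone reproducing this, a quick way to double-check the two observations above (the GTF and the SAM use the same chromosome names, and reads really do fall inside the gene interval) is sketched below. It is only a rough sanity check under stated assumptions: the function names are made up for illustration, both files are assumed to be plain tab-delimited text, and a read is counted as inside the region when its start position lies within the interval, which ignores read length and splicing. It does not by itself explain the 0 FPKM.

```shell
#!/usr/bin/env bash
# Hypothetical helper names, not part of Cufflinks or TopHat.
# Assumes tab-delimited GTF and SAM text files.

# Unique sequence names from column 1 of a GTF (comment lines skipped).
gtf_seqnames() {
  grep -v '^#' "$1" | cut -f1 | sort -u
}

# Unique reference names (RNAME, column 3) of SAM records.
# Header lines start with '@' and are skipped.
sam_seqnames() {
  grep -v '^@' "$1" | cut -f3 | sort -u
}

# Count SAM records on chromosome $2 whose start (POS, column 4)
# lies in [$3, $4]. Only the start position is compared, so this
# is an approximate count, not an overlap test.
count_reads_in_region() {
  awk -F'\t' -v chrom="$2" -v left="$3" -v right="$4" \
    '!/^@/ && $3 == chrom && $4 >= left && $4 <= right { n++ } END { print n + 0 }' "$1"
}

# Example usage (coordinates taken from genes.expr above):
#   comm -23 <(gtf_seqnames htt.gtf) <(sam_seqnames htt.sam)   # GTF names absent from the SAM
#   count_reads_in_region htt.sam chr4 3076406 3245676
```

If the `comm` line prints anything, the GTF contains sequence names that never appear in the alignments (for example `4` versus `chr4`), which would be worth ruling out before looking further.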