I have been running tophat/cufflinks/cuffdiff on fungal and human RNA-seq data. Some of the FPKM values seemed high, so I looked at the alignment files (accepted_hits.bam) to count the reads hitting selected genes. I cannot reproduce anything near the values in the cuffdiff output. For example, one cufflinks locus (XLOC) had a reported FPKM of 421, yet zero reads mapped to the corresponding genomic region.
A related issue is that some of the reported loci span genomic regions well beyond the borders of transcripts defined in the supplied .gtf file. However, inspection of the alignment file reveals no reads that support the extension of the transcript.
Am I missing something here? Specifically, does cufflinks use information other than that contained in the accepted_hits.bam file to calculate FPKM and define its (XLOC) loci?
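For reference, the sanity check I am doing by hand is roughly the sketch below. It assumes the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments), which is not necessarily what cufflinks computes; as far as I understand, cufflinks estimates abundances with a statistical model rather than a raw count. The function name and the numbers are made up for illustration.

```python
def naive_fpkm(fragments_in_region, transcript_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments.

    This is an assumption about what a "reasonable" value looks like,
    not a reimplementation of the cufflinks estimator.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million fragments)
    return fragments_in_region * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers: 500 fragments on a 2 kb transcript, 10 million mapped fragments
print(naive_fpkm(500, 2000, 10_000_000))

# A region with zero fragments gives zero under this definition,
# whatever the transcript length or library size
print(naive_fpkm(0, 1500, 20_000_000))
```

By this naive definition a locus with no mapped reads cannot have a non-zero FPKM, which is why the reported value of 421 puzzles me.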