I tried to use cuffcompare to annotate my CUFF ids against other annotations (lncRNAs, pseudogenes, etc.). In doing so, I found that cuffcompare was giving me a transcripts.gtf file with FPKMs between 0 and 1. Looking at the SAM file that was generated, it appears that each CUFF-id transcript in the GTF file is being translated back into one giant read in the SAM file (hence the 0-1 FPKM values).
The obvious, and maybe easiest, step here is to append my new GTF file to the GTF annotation I used previously and re-run cufflinks. However, having to return to step 0 for every tweak seems like a poor solution. What I see as a viable alternative is to output the IDs of the reads that were used to support a given transcript. That way, I can make an 'unmapped' BAM file and use it in subsequent analysis (correcting, of course, for the now artificially smaller library size in the FPKM calculation).
Is there an existing method to do this, or am I going to have to hack at the source code?
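To make the idea concrete, here is a rough Python sketch of what I have in mind. It is not how Cufflinks assigns reads internally: it uses plain exon overlap as a stand-in for "supports a transcript", works on SAM text lines rather than a BAM file, and all function names (`parse_gtf_exons`, `aligned_blocks`, `split_reads`, `rescale_fpkm`) are my own, hypothetical ones. The leftover reads are what I am calling the 'unmapped' set, and the last function is the library-size correction, assuming FPKM scales as 1 / (total mapped reads).

```python
import re
from collections import defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_gtf_exons(gtf_lines):
    """Collect exon intervals (1-based, inclusive) per transcript_id."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("transcript_id"):
                tid = attr.split('"')[1]
                exons[tid].append((fields[0], int(fields[3]), int(fields[4])))
    return exons


def aligned_blocks(pos, cigar):
    """Split an alignment into reference blocks, breaking at introns (N)."""
    blocks, start, cur = [], pos, pos
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=XD":          # these operations consume the reference
            cur += n
        elif op == "N":           # skipped region: close the current block
            if cur > start:
                blocks.append((start, cur - 1))
            cur += n
            start = cur
        # I, S, H and P do not consume the reference
    if cur > start:
        blocks.append((start, cur - 1))
    return blocks


def split_reads(sam_lines, exons_by_transcript):
    """Return ({transcript_id: read IDs overlapping its exons}, leftover read IDs)."""
    support = defaultdict(set)
    leftover = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        f = line.rstrip("\n").split("\t")
        qname, flag, chrom, pos, cigar = f[0], int(f[1]), f[2], int(f[3]), f[5]
        hit = False
        if not (flag & 4) and cigar != "*":   # skip reads flagged as unmapped
            blocks = aligned_blocks(pos, cigar)
            for tid, exons in exons_by_transcript.items():
                if any(c == chrom and s <= b_end and b_start <= e
                       for c, s, e in exons
                       for b_start, b_end in blocks):
                    support[tid].add(qname)
                    hit = True
        if not hit:
            leftover.append(qname)
    return support, leftover


def rescale_fpkm(fpkm_subset, n_subset, n_total):
    """Convert an FPKM computed on a read subset back to the full library size."""
    return fpkm_subset * n_subset / n_total
```

The leftover IDs could then be used to filter the original BAM down to the reads that no existing transcript accounts for, and any FPKM computed on that subset would be passed through `rescale_fpkm` with the original mapped-read count as `n_total`.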