Hi there,
I am wondering how to interpret the second column of combined.gtf. It is supposed to be the source, i.e. the program that generated the feature. Looking at my combined.gtf file, I see several sources: "protein_coding", "pseudogene", "processed_transcript", Cufflinks, and more. I understand the first three, but how does Cufflinks relate to them? Does a source of Cufflinks mean the transcript is novel? When I looked at a few transcripts that have Cufflinks as the source, though, they were known genes, and some of them are protein-coding genes. Why is their source not "protein_coding"?
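In case it helps anyone reproduce what I am seeing, here is a minimal sketch of how the source values can be tallied. It assumes the file is tab-separated with the source in column 2, as in standard GTF. The function name `count_sources` and the example records are hypothetical and made up for illustration; they are not lines from my actual file.

```python
from collections import Counter

def count_sources(lines):
    """Tally the values found in column 2 (source) of GTF lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts

# Hypothetical records for illustration only (not from a real combined.gtf).
example = [
    "# a comment line",
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "XLOC_000001";',
    'chr1\tprotein_coding\texon\t300\t400\t.\t+\t.\tgene_id "XLOC_000002";',
    'chr1\tCufflinks\texon\t500\t600\t.\t-\t.\tgene_id "XLOC_000003";',
    'chr2\tpseudogene\texon\t700\t800\t.\t-\t.\tgene_id "XLOC_000004";',
]

source_counts = count_sources(example)
for source, n in source_counts.most_common():
    print(source, n)
```

For a real file, the same function can be given an open file handle, e.g. `with open("combined.gtf") as fh: count_sources(fh)`.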
Thank you for your help.