Hi all,
I am very new to all things RNA-seq, so please bear with me if these questions are really basic.
I am trying to compare two things for differential expression.
The pipeline I am using is: TopHat -> Cuffdiff
(with the newest versions of each, TopHat 2.0.4 and Cufflinks 2.0.2)
I am skipping running Cufflinks separately before Cuffdiff, because I'm not really interested in new gene/transcript discovery.
The problem is, when I try to run Cuffdiff, it quits with an error saying the reference annotation contains duplicate GFF IDs:
Code:
You are using Cufflinks v2.0.2, which is the most recent release.
[16:22:49] Loading reference annotation.
Error: duplicate GFF ID 'FBtr0100868' encountered!
The reference annotation I am using is the gff downloaded from Flybase: dmel-all-r5.46.gff.
However, when I searched this gff file, I didn't see duplicate lines containing this id, FBtr0100868.
Just to experiment, though, I tried removing the lines containing the offending GFF ID from the gff file and running Cuffdiff again to see if that would fix the problem, but it just gave the same error with a different GFF ID.
I repeated this several times, removing each newly reported ID, but every time it came back with the same error and yet another GFF ID.
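Rather than chasing the IDs one at a time, a check along the following lines could list every ID that appears on more than one line in a single pass. This is only a rough sketch: it assumes GFF3-style `ID=...` attributes in the ninth tab-separated column, the function name `find_duplicate_ids` is made up for illustration, and it may not match whatever rule Cufflinks itself uses to decide that an ID is a duplicate.

```python
from collections import defaultdict


def find_duplicate_ids(lines):
    """Return {id: [(line_number, feature_type), ...]} for IDs seen on more than one line.

    Assumes GFF3-style lines: nine tab-separated columns, with attributes
    such as ID=...;Parent=... in the last column.
    """
    seen = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        feature_type = fields[2]
        for attribute in fields[8].split(";"):
            key, _, value = attribute.strip().partition("=")
            if key == "ID" and value:
                seen[value].append((line_number, feature_type))
    # Keep only the IDs that were declared on more than one line.
    return {gff_id: hits for gff_id, hits in seen.items() if len(hits) > 1}


# Usage sketch (substitute the path to your own annotation file):
# with open("dmel-all-r5.46.gff") as handle:
#     for gff_id, hits in find_duplicate_ids(handle).items():
#         print(gff_id, hits)
```

Reporting the feature type next to each line number should show whether the repeats are on the same kind of feature (for example two mRNA lines) or on different kinds that happen to share an ID, which a plain text search for the ID can easily miss.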
Has anyone else encountered this error using the gff file from Flybase, or anywhere else for that matter? I don't know if I'm doing the right thing by removing the "bad" IDs from the reference annotation either, especially since there seem to be an endless number of them. Is there any other way I should fix the reference annotation? Or would it be easier to just run Cufflinks and use its output gtf, instead of trying to fix the Flybase gff?
Any help would be very much appreciated!