I am very much a newbie in this field, but as part of my PhD project I am starting to analyze data I have generated from 14 CHO DUKX strains. Right now I am just setting up the system and have so far found the CHO K1 genome and GFF3 files.
But when I check the annotation file with the Cufflinks tool gffread, it outputs:
csrk@davinci:~/genome> gffread -E CHOK1.gff3 -o- | more
Error: duplicate GFF ID 'rna1362' encountered!
Error: duplicate GFF ID 'rna2220' encountered!
Error: duplicate GFF ID 'rna2366' encountered!
Error: duplicate GFF ID 'rna3249' encountered!
Error: duplicate GFF ID 'rna7611' encountered!
Error: duplicate GFF ID 'rna7612' encountered!
Error: duplicate GFF ID 'rna7613' encountered!
Error: duplicate GFF ID 'rna7615' encountered!
Error: duplicate GFF ID 'rna7626' encountered!
Error: duplicate GFF ID 'rna9416' encountered!
Error: duplicate GFF ID 'rna11285' encountered!
Error: duplicate GFF ID 'rna11287' encountered!
Error: duplicate GFF ID 'rna15111' encountered!
Error: duplicate GFF ID 'rna18225' encountered!
Error: duplicate GFF ID 'rna18229' encountered!
Error: duplicate GFF ID 'rna18230' encountered!
Error: duplicate GFF ID 'rna18861' encountered!
Error: duplicate GFF ID 'rna18862' encountered!
Error: duplicate GFF ID 'rna19473' encountered!
Error: duplicate GFF ID 'rna19903' encountered!
Error: duplicate GFF ID 'rna20504' encountered!
When I search the file for one of these RNA IDs, I get more than 500 lines containing more than 10 different gene names.
What needs to be changed in the GFF3 file before Cufflinks (or e.g. HTSeq) will be able to use it?
The output of the search for rna1362 is attached.
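In case it helps anyone reproduce the problem, here is a minimal Python sketch of how the repeated IDs could be listed together with the sequences and parents they appear under. It assumes the usual 9-column, tab-separated GFF3 layout with `ID=` and `Parent=` in the last column. The function names `parse_attributes` and `find_suspect_ids` are just ones I made up, and I have not checked that this matches the test gffread itself applies.

```python
from collections import defaultdict


def parse_attributes(field):
    """Turn a GFF3 column-9 string such as 'ID=rna1;Parent=gene1' into a dict."""
    attrs = {}
    for part in field.strip().split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def find_suspect_ids(lines):
    """Return {ID: sorted list of (seqid, Parent)} for IDs seen in more than one context.

    Assumption: an ID repeated with the same seqid and Parent (e.g. a
    multi-line CDS) is fine, so only IDs with differing contexts are reported.
    """
    contexts = defaultdict(set)
    for line in lines:
        # Skip directives, comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a regular feature line
        attrs = parse_attributes(cols[8])
        if "ID" in attrs:
            contexts[attrs["ID"]].add((cols[0], attrs.get("Parent", "")))
    return {fid: sorted(ctx) for fid, ctx in contexts.items() if len(ctx) > 1}
```

To run it on the real file, one could pass an open file handle, e.g. `find_suspect_ids(open("CHOK1.gff3"))`, and print the returned dictionary to see which sequences and parent genes each repeated ID is attached to.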
Merry Christmas