I have just used cufflinks and cuffcompare for the first time. When I ran cuffcompare with the -r option and the Ensembl human genome GTF file, I got zero matches between my sequences and Ensembl. I noticed that my data uses a 'chr' prefix for the chromosome names, so I edited the Ensembl GTF file to match, but I still got no matches.
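For anyone who wants to check the chromosome naming directly, here is a minimal sketch that compares the values in column 1 (seqname) of two GTF files. The helper names (seqnames, compare_seqnames) and the toy records are made up for illustration, and it assumes plain tab-delimited GTF with optional '#' comment lines. The chrMT/chrM rows only show how a blanket prefix edit can still leave a name unmatched; I am not claiming that is what happened with my data.

```python
import io

def seqnames(handle):
    """Collect the distinct values of column 1 (seqname) from a GTF stream."""
    names = set()
    for line in handle:
        # Skip blank lines and '#' comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names

def compare_seqnames(ref_handle, query_handle):
    """Report which chromosome names are shared and which are unique to each file."""
    ref = seqnames(ref_handle)
    query = seqnames(query_handle)
    return {
        "shared": ref & query,
        "ref_only": ref - query,
        "query_only": query - ref,
    }

# Toy data (hypothetical records, not real annotation).
ref_gtf = io.StringIO(
    'chr1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
    'chrMT\tensembl\texon\t5\t50\t.\t+\t.\tgene_id "G2"; transcript_id "T2";\n'
)
query_gtf = io.StringIO(
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "C1"; transcript_id "C1.1";\n'
    'chrM\tCufflinks\texon\t5\t50\t.\t+\t.\tgene_id "C2"; transcript_id "C2.1";\n'
)

result = compare_seqnames(ref_gtf, query_gtf)
for key in ("shared", "ref_only", "query_only"):
    print(key, sorted(result[key]))
```

On real files the two StringIO objects would be replaced with open file handles for the edited reference GTF and the cufflinks GTF output. An empty "shared" set would mean the names still do not line up.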
I also noticed that the cufflinks output differs from what it is supposed to be; perhaps this is the cause of the lack of reference matches. Here are the discrepancies I noticed:
genes.expr:
The header has 8 columns, but the output only has 6. The last column is a real number, and I'm guessing it's the RPKM value, but the last three column headers are bundle_fraction, density, and RPKM, so it could be any of those three.
transcripts.expr:
The header has 13 columns, but the output has 14. The last column contains an integer value. The other columns look like they contain the correct data type based on the column name, so I'm ignoring that 14th column.
For the cuffcompare output, the transcripts.refmap files are empty except for the column header, due to the lack of matches. The transcripts.tmap file has 10 columns in the header, but 12 columns in the data. In this case, I can tell that the missing column headers are conf_low and conf_hi, which should be between RPKM and cov.
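The header/data mismatches described above can be counted with a short script. This is only a sketch: it assumes the files are tab-delimited with a single header line, and the function name column_report and the toy table are made up for illustration.

```python
import io
from collections import Counter

def column_report(handle, sep="\t"):
    """Return (number of header fields, {data row width: number of rows})."""
    header = handle.readline().rstrip("\n").split(sep)
    widths = Counter()
    for line in handle:
        if line.strip():  # ignore blank lines
            widths[len(line.rstrip("\n").split(sep))] += 1
    return len(header), dict(widths)

# Toy table (hypothetical): 3 header fields, but 4 fields in each data row.
toy = io.StringIO(
    "id\tlength\tRPKM\n"
    "t1\t1200\t3.5\t7\n"
    "t2\t800\t1.2\t4\n"
)

header_width, row_widths = column_report(toy)
print("header fields:", header_width)
print("data row widths:", row_widths)
```

Running the same function on open handles for genes.expr, transcripts.expr and transcripts.tmap would show whether every data row has the same extra or missing number of fields, or only some of them.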
The version of the software I'm using is:
cufflinks-0.7.0.OSX_x86_64
Thanks for your help.
Gene