Hello,
I'm trying to run cuffdiff with a reference consisting of the results of a de novo assembly. However, until now, all my attempts have produced only empty CDS files.
From what I have understood so far, the problem seems to be a missing p_id attribute in my merged files.
To solve this, most people seem to recommend using the -s parameter to provide the reference when running cuffmerge/cuffcompare. However, this has not worked for me so far.
It seems both commands are unable to find the reference, even though I explicitly specified the path. The exact command log is:
cuffmerge -s /m/scratch/herc/NGS/Guignardia/abyss_assembly_results/k51/SSPACE/cufflinks_reference/standard_output.final.scaffolds.fa assemblies
[Tue Jan 10 11:12:01 2012] Beginning transcriptome assembly merge
-------------------------------------------
[Tue Jan 10 11:12:01 2012] Preparing output location ./merged_asm/
Warning: no reference GTF provided!
[Tue Jan 10 11:12:02 2012] Converting GTF files to SAM
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[11:12:02] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[11:12:03] Loading reference annotation.
[Tue Jan 10 11:12:04 2012] Assembling transcripts
cufflinks: /lib64/libz.so.1: no version information available (required by cufflinks)
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v1.3.0 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_fileZTsqZ3 doesn't appear to be a valid BAM file, trying SAM...
[11:12:04] Inspecting reads and determining fragment length distribution.
Processed 8877 loci.
> Map Properties:
> Total Map Mass: 26001.00
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
[11:12:05] Assembling transcripts and estimating abundances.
Processed 8877 loci.
[Tue Jan 10 11:12:54 2012] Comparing against reference file None
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v1.3.0 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
[Tue Jan 10 11:13:00 2012] Comparing against reference file None
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v1.3.0 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
That is the output for cuffmerge. The resulting file does not have the p_id attribute, so I tried using cuffcompare to deal with this:
cuffcompare -s /m/scratch/herc/NGS/Guignardia/abyss_assembly_results/k51/SSPACE/cufflinks_reference/standard_output.final.scaffolds.fa -V merged.gtf merged.gtf
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v1.3.0 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Prefix for output files: cuffcmp
Processing qfile #1: merged.gtf
Loading transcripts from merged.gtf..
Processing qfile #2: merged.gtf
Loading transcripts from merged.gtf..
Tracking transcripts across 2 query files..
Cleaning up..
Done.
However, there is still no p_id attribute. Could the problem be that my reference file is an assembly result containing a huge number of contigs?
EDIT: I just read that in order to produce p_ids, I need a reference annotation that includes CDS records. Where could I get such a reference?
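In case it helps anyone reproduce this, here is a rough sketch of how the merged GTF could be checked for CDS records and p_id attributes. The function name summarize_gtf is just something made up for illustration, and it assumes a standard tab-separated GTF with the attributes in the ninth column; it is not part of Cufflinks.

```python
import re
import sys
from collections import Counter


def summarize_gtf(lines):
    """Count feature types and records carrying a p_id attribute.

    Hypothetical helper: assumes tab-separated GTF records with the
    attribute string in column 9.
    """
    features = Counter()
    with_p_id = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        features[fields[2]] += 1
        if re.search(r'\bp_id "[^"]*"', fields[8]):
            with_p_id += 1
    return features, with_p_id


if __name__ == "__main__" and len(sys.argv) > 1:
    # Path to the GTF is taken from the command line.
    with open(sys.argv[1]) as handle:
        features, with_p_id = summarize_gtf(handle)
    for feature, count in sorted(features.items()):
        print(feature, count)
    print("records with p_id:", with_p_id)
```

Run on the merged file, a count of zero for both CDS features and p_id records would match what I am seeing.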
Thank you for your answers.