Hello all,
Before running the TopHat/Cufflinks workflow on my own data, I am trying it with the Drosophila melanogaster RNA-seq data (as in "Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks"). Everything seemed to be working until I got to the Cuffmerge step. Running the command below produced the following output, including repeated "couldn't find fasta record" warnings.
cuffmerge -g genes.gtf -s genome.fa -p 8 assemblies.txt
[Thu Jun 13 11:10:38 2013] Beginning transcriptome assembly merge
-------------------------------------------
[Thu Jun 13 11:10:38 2013] Preparing output location ./merged_asm/
[Thu Jun 13 11:10:40 2013] Converting GTF files to SAM
[11:10:41] Loading reference annotation.
[11:10:41] Loading reference annotation.
[11:10:42] Loading reference annotation.
[11:10:43] Loading reference annotation.
[11:10:44] Loading reference annotation.
[11:10:45] Loading reference annotation.
[Thu Jun 13 11:10:46 2013] Quantitating transcripts
You are using Cufflinks v2.1.1, which is the most recent release.
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g genes.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 8 ./merged_asm/tmp/mergeSam_fileaFtxQb
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_fileaFtxQb doesn't appear to be a valid BAM file, trying SAM...
[11:10:46] Loading reference annotation.
[11:10:49] Inspecting reads and determining fragment length distribution.
Processed 11337 loci.
> Map Properties:
> Normalized Map Mass: 69085.00
> Raw Map Mass: 69085.00
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
[11:10:50] Assembling transcripts and estimating abundances.
Processed 11337 loci.
[Thu Jun 13 11:11:54 2013] Comparing against reference file genes.gtf
You are using Cufflinks v2.1.1, which is the most recent release.
No fasta index found for genome.fa. Rebuilding, please wait..
Fasta index rebuilt.
Warning: couldn't find fasta record for '2LHet'!
Warning: couldn't find fasta record for '2RHet'!
Warning: couldn't find fasta record for '3LHet'!
Warning: couldn't find fasta record for '3RHet'!
Warning: couldn't find fasta record for 'U'!
Warning: couldn't find fasta record for 'XHet'!
Warning: couldn't find fasta record for 'YHet'!
Warning: couldn't find fasta record for 'dmel_mitochondrion_genome'!
[Thu Jun 13 11:12:07 2013] Comparing against reference file genes.gtf
You are using Cufflinks v2.1.1, which is the most recent release.
Warning: couldn't find fasta record for '2LHet'!
Warning: couldn't find fasta record for '2RHet'!
Warning: couldn't find fasta record for '3LHet'!
Warning: couldn't find fasta record for '3RHet'!
Warning: couldn't find fasta record for 'U'!
Warning: couldn't find fasta record for 'XHet'!
Warning: couldn't find fasta record for 'YHet'!
Warning: couldn't find fasta record for 'dmel_mitochondrion_genome'!
I checked genome.fa with head, and the FASTA file itself does not seem to be the problem:
jmwhitha@jmwhitha-OptiPlex-755:~/my_rnaseq_exp$ head genome.fa
>2L
CGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGGGAGAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTTTGATTTTTTGGCAACCCAAAATGGTGGCGGATGAACGAGATGATAATATATTCAAGTTGCCGCTAATCAGAAATAAATTCATTGCAACGTTAAATACAGCACAATATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGAGAGAGAGCAGCGGAGATATTTAGATTGCCTATTAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTCTATATAATGACTGCCTCTCATTCTGTCTTATTTTACCGCAAACCCAAATCGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGGGAGAAATATGAT
This genome.fa came from http://cufflinks.cbcb.umd.edu/igenomes.html (Drosophila_melanogaster_Ensembl_BDGP5.25.tar.gz).
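Since head only shows the first record, a fuller check would be to compare every sequence name used in genes.gtf against the record names in genome.fa. Below is a minimal Python sketch of that comparison. It assumes a standard tab-separated GTF (sequence name in column 1) and FASTA headers where the record name is the first word after ">". The function names are made up for illustration and are not part of Cufflinks.

```python
import sys


def fasta_names(path):
    """Return the set of record names (first word after '>') in a FASTA file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def gtf_seqnames(path):
    """Return the set of sequence names found in column 1 of a GTF file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            # Skip comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.split("\t", 1)[0])
    return names


def missing_from_fasta(gtf_path, fasta_path):
    """Return the sorted GTF sequence names that have no FASTA record."""
    return sorted(gtf_seqnames(gtf_path) - fasta_names(fasta_path))


if __name__ == "__main__":
    # Hypothetical usage: python check_names.py genes.gtf genome.fa
    if len(sys.argv) == 3:
        for name in missing_from_fasta(sys.argv[1], sys.argv[2]):
            print(name)
```

If this prints the same names as the warnings above (2LHet, 2RHet, and so on), the annotation refers to sequences that are simply not present in genome.fa.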
So what's the problem?
Thank you and God bless,
Jason