I have some SAM files from TopHat and I want to assemble them into transcripts. Should I merge all the SAM files before running Cufflinks, or run Cuffmerge after running Cufflinks on each SAM file separately?
-
Header problems... cuffmerge
Hi all,
I'm getting a header error when using Cuffmerge, and I was wondering if anyone has been dealing with this recently (I'm using cufflinks v1.1.0/Sept 2011). I'm getting the following error:
[15:38:37] Inspecting reads and determining fragment length distribution.
Error: this SAM file doesn't appear to be correctly sorted!
current hit is at Contig_100_consensus_sequence:0, last one was at contig09241:34
Cufflinks requires that if your file has SQ records in
the SAM header that they appear in the same order as the chromosomes names
in the alignments.
If there are no SQ records in the header, or if the header is missing,
the alignments must be sorted lexicographically by chromosome
name and by position.
Is this still the same problem in the script? Or could this error message come about if there is anything wrong with the data file? I've tried the fixes described here and it still runs with the same error.
I'm generating my .sam files with bowtie2 and running them through cufflinks with no problems, and this is occurring for both 454 and Illumina generated data. Oh, and I don't have a reference GTF file.
Thanks!
-Alice
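The ordering rule in the error message can be checked before handing a file to Cufflinks. What follows is only a minimal sketch, not Cufflinks' own check: the function name `check_sam_order` is made up for illustration, it assumes tab-separated SAM text with all header lines at the top, and it assumes that "lexicographically" means plain byte-wise string comparison.

```python
def check_sam_order(lines):
    """Return None if the alignments look correctly ordered, else a message.

    With @SQ records in the header, reference names must appear in the
    same order as the @SQ lines. Without them, names are compared as
    plain strings. Within one reference, positions must not decrease.
    """
    sq_rank = {}   # reference name -> index of its @SQ line
    last = None    # (sort key, position, name) of the previous alignment
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line; only @SQ records matter here.
            if line.startswith("@SQ"):
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        sq_rank.setdefault(field[3:], len(sq_rank))
            continue
        cols = line.split("\t")
        name, pos = cols[2], int(cols[3])
        if name == "*":
            continue  # unmapped read, no reference to compare
        if sq_rank:
            if name not in sq_rank:
                return "reference %s has no @SQ record" % name
            key = sq_rank[name]
        else:
            key = name
        if last is not None and (key, pos) < (last[0], last[1]):
            return "current hit is at %s:%d, last one was at %s:%d" % (
                name, pos, last[2], last[1])
        last = (key, pos, name)
    return None
```

To try it on a real file, something like `check_sam_order(open("accepted_hits.sam"))` would do, where the file name is just a placeholder.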
-
Hi, aliceb
Yes, I have run into this issue, too. What I did was sort the merged BAM file (or SAM file, I don't remember) generated by cuffmerge and then run cufflinks. But that is not a good solution, since we may have to sort the file every time we run cuffmerge. So I wonder if there is a better way?
-
Hmmm, it seems like some people are having trouble getting a cuffmerge file to run in later analyses, while others are having trouble getting cuffmerge to execute at all.
I'm getting this header problem while running cuffmerge, so there is no output file to work with. Is anyone else getting stuck here?
Cheers,
Alice
-
Cuffmerge problems...
I'm stuck here too...
My reference file is a .gff not a .gtf. Is that a problem?
I ran the tophat pipeline until here without errors...but in cuffmerge I got this:
cuffmerge -g Triha.gff -s Triha.fa -p 12 assemblies.txt
[Tue Oct 2 09:32:48 2012] Beginning transcriptome assembly merge
-------------------------------------------
[Tue Oct 2 09:32:48 2012] Preparing output location ./merged_asm/
[Tue Oct 2 09:32:49 2012] Converting GTF files to SAM
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:49] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:49] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:49] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:50] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:50] Loading reference annotation.
gtf_to_sam: /lib64/libz.so.1: no version information available (required by gtf_to_sam)
[09:32:51] Loading reference annotation.
[Tue Oct 2 09:32:51 2012] Quantitating transcripts
cufflinks: /lib64/libz.so.1: no version information available (required by cufflinks)
You are using Cufflinks v2.0.2, which is the most recent release.
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g Triha.gff -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 12 ./merged_asm/tmp/mergeSam_fileO6g28c
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_fileO6g28c doesn't appear to be a valid BAM file, trying SAM...
[09:32:52] Loading reference annotation.
[FAILED]
Error: could not execute cufflinks
Traceback (most recent call last):
File "/usr/local/bin/cuffmerge", line 576, in ?
sys.exit(main())
File "/usr/local/bin/cuffmerge", line 558, in main
cufflinks(output_dir, merged_sam_filename, params.min_isoform_frac, params.ref_gtf)
File "/usr/local/bin/cuffmerge", line 198, in cufflinks
exit(1)
TypeError: 'str' object is not callable
Any suggestion??
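One note on the traceback itself: the `in ?` frames suggest an older Python interpreter, and in Python 2.4 and earlier the built-in `exit` was a plain string rather than a callable, which would explain `TypeError: 'str' object is not callable` at `exit(1)`. If that is the case here, the TypeError is only a side effect; the real failure is the `[FAILED]` from cufflinks while loading the reference annotation. A minimal sketch of the usual alternative, `sys.exit`, is below. The helper name `run_or_exit` is hypothetical and is not taken from the cuffmerge script.

```python
import subprocess
import sys

def run_or_exit(cmd):
    """Run cmd (a list of arguments); stop with exit status 1 if it fails."""
    ret = subprocess.call(cmd)
    if ret != 0:
        sys.stderr.write("Error: could not execute %s\n" % cmd[0])
        # sys.exit is always callable, unlike the old built-in exit string.
        sys.exit(1)
    return ret
```

This only makes the script exit cleanly; it does not make cufflinks accept the annotation file.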
-
Cuffmerge returns duplicated @SQ <nul><nul> / blankspaces in mergesam
Using cufflinks on transcript.gtf files from over 90 samples I received exactly the same error as described in the initial post.
I found that in the temporary directory for cuffmerge there is a mergesam file, and in my case there was a problem with one of the @SQ. I had a duplicated @SQ header for one of my chromosomes...
displayed as:
@SQ SN:C07 LN: 38762999
...
@SQ SN:C07 LN: 50454407
in nedit:
@SQ SN:<nul><nul>...<nul><nul>C07 LN: 38762999
...
@SQ SN:C07 LN: 50454407
I worked around this by removing one of my 90 transcript.gtf files from the input at a time. Once it was solved by removing my fifth sample, and now cuffmerge insists that the first sample be removed.
I cannot find any errors in the cufflinks gtf files or in the Bowtie references which were used as input, and to me this problem shows no consistency except that it always occurs with C07... Maybe this is some kind of bug in cuffmerge?
cuffmerge -s ../../../out/bwtIndex/bwtRef2.fa -g ../../../out/bwtIndex/bwtRef2.gff -o . -p 4 ../../../out/cuffmerge/2/assembly-manifest.txt 1> ../log/cuffmerge_2cuffmerge.log 2> ../err/cuffmerge_2cuffmerge.err
cufflinks v1.3.0
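For anyone who wants to check their own mergesam file for the same symptoms, here is a rough sketch that scans the header for NUL bytes and repeated `SN:` names. The function name `find_sq_problems` is invented for this example, and it assumes a tab-separated SAM header read as bytes.

```python
def find_sq_problems(lines):
    """Scan SAM header lines (bytes) and report suspicious @SQ records.

    Returns a list of (line number, description) tuples for @SQ lines
    that contain NUL bytes or repeat an SN: name seen earlier.
    """
    problems = []
    seen = {}  # reference name -> line number of its first @SQ record
    for lineno, raw in enumerate(lines, 1):
        if not raw.startswith(b"@"):
            break  # header ends at the first alignment line
        if not raw.startswith(b"@SQ"):
            continue
        if b"\x00" in raw:
            problems.append((lineno, "NUL byte in @SQ line"))
        for field in raw.rstrip(b"\r\n").split(b"\t")[1:]:
            if field.startswith(b"SN:"):
                # Strip NULs so that padded and clean names compare equal.
                name = field[3:].replace(b"\x00", b"").decode("ascii", "replace")
                if name in seen:
                    problems.append(
                        (lineno, "duplicate @SQ for %s (first on line %d)"
                         % (name, seen[name])))
                else:
                    seen[name] = lineno
    return problems
```

Open the file in binary mode so the NUL bytes survive, for example `find_sq_problems(open(path, "rb"))`, where `path` points at the mergeSam file in cuffmerge's tmp directory.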
-
I have the exact same problem as the OP.
I'm using the GTF downloaded from the ensembl FTP to merge with 6 transcripts.gtf files produced by cufflinks.
I tried both solutions:
1) LC_ALL=C; export LC_ALL
2) changing the code in the cuffmerge script
neither worked, I'm getting the exact same error. Any other ideas out there?
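For context on why the LC_ALL=C fix is suggested at all: in the C locale, strings sort by byte value, so every uppercase letter comes before every lowercase one. The two contig names from the original error message show the difference. This is only an illustration; the case-insensitive key below is a rough stand-in for a locale-aware sort, and real locale collation differs in its details.

```python
names = ["contig09241", "Contig_100_consensus_sequence"]

# Python compares str values by code point, which matches byte order
# (the C locale) for ASCII names: 'C' (0x43) sorts before 'c' (0x63).
byte_order = sorted(names)

# A case-insensitive key stands in for a locale-aware sort:
# after "contig", '0' (0x30) sorts before '_' (0x5f).
case_insensitive = sorted(names, key=str.lower)

print(byte_order)        # ['Contig_100_consensus_sequence', 'contig09241']
print(case_insensitive)  # ['contig09241', 'Contig_100_consensus_sequence']
```

The second ordering is the one reported in the error (contig09241 first, then Contig_100_consensus_sequence), which is consistent with a file that was sorted without byte-wise comparison.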