
**jdjax** · 08-07-2011, 05:00 AM

Hello --

When using cufflinks I receive the same error. I aligned my reads using Bowtie, converted my SAM output to BAM and then sorted the BAM file using SAMtools.

I would appreciate it if someone could help. Thanks.

cufflinks: /usr/lib64/libz.so.1: no version information available (required by cufflinks)
You are using Cufflinks v1.0.3, which is the most recent release.
Warning: BAM header too large
File trinity_n_trial_inflorescence.sorted.bam doesn't appear to be a valid BAM file, trying SAM...

**DZhang** · 08-07-2011, 06:50 AM

mlox,

The general work flow is tophat, cufflinks, cuffcompare, and cuffdiff. If you follow the manuals of tophat and cufflinks, it should give you decent results.

Can you tell me how you came up with your current work flow?

**jdjax** · 08-07-2011, 07:09 AM

We currently do not have TopHat on our server and no one in our research group has any experience with TopHat.

I do not have a reference genome, so after the assembly was done I used Bowtie to align the reads from my various tissue samples to the Bowtie index I made from the assembly. Bowtie's output is an unsorted SAM file, so using SAMtools I first convert the file to a BAM file and then sort it. When I then use the sorted BAM file as input for cufflinks, I get this error.

I have tried using SAMtools reheader and that also did not work. Any other suggestions would be helpful.
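For anyone hitting the same warning, it can help to see how big the BAM header actually is before trying to fix it. The following is only a minimal sketch, assuming the standard BAM layout (the magic bytes `BAM\1`, a 4-byte header text length, the header text, then a 4-byte reference count). The function name `read_bam_header` is made up for illustration, and the sketch says nothing about what size cufflinks itself accepts.

```python
import gzip
import struct


def read_bam_header(path):
    """Return (header_text_length, reference_count) for a BAM file.

    Hypothetical helper for illustration; not part of SAMtools or Cufflinks.
    """
    # BAM is BGZF-compressed, and BGZF blocks are ordinary gzip members,
    # so the standard gzip module can decompress the stream.
    with gzip.open(path, "rb") as handle:
        magic = handle.read(4)
        if magic != b"BAM\x01":
            raise ValueError("not a BAM file: bad magic %r" % magic)
        # l_text: length of the plain-text SAM header, little-endian int32.
        (l_text,) = struct.unpack("<i", handle.read(4))
        handle.read(l_text)  # skip over the header text itself
        # n_ref: number of reference sequences, little-endian int32.
        (n_ref,) = struct.unpack("<i", handle.read(4))
    return l_text, n_ref
```

With a transcriptome assembly as the reference, the second number is the number of contigs, which can be very large compared with a genome that has a few dozen chromosomes.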

**DZhang** · 08-07-2011, 07:49 AM

Hi Jdjax,

My experience indicates it is in general less challenging to use the work flow recommended by the author(s). (I am aware that cufflinks supports bam files generated by programs other than tophat, but in your case it complains that your bam file is not valid.)

In your case, the nice part about tophat is twofold: 1) you can download the binary to your home directory and use it directly; 2) tophat uses bowtie to align, so it can re-use your index files. You may pursue fixing the header complaint or try tophat, whichever can achieve your objectives.

**jdjax** · 08-07-2011, 08:40 AM

Thanks for your input DZhang. Do you know of any other dependencies besides Bowtie that are required for TopHat?

**DZhang** · 08-07-2011, 10:51 AM

Not that I am aware of. I believe you will get the results faster if you go with tophat.

**mlox** · 08-08-2011, 07:26 AM

Hi DZhang,
As I don't have a reference genome file, I used a transcriptome assembly for mapping, so I thought there was no need for cufflinks and cuffmerge. I just generated a GTF file on my own, as all my reads came from spliced exons.
I guess the error message results from the large number of transcripts I mapped to. I also tried bwa for mapping and got a similar error.

**DZhang** · 08-08-2011, 08:11 AM

Hi mlox,

In your case, I strongly recommend using a count-based method. (If possible, I would also recommend mapping the reads to a genome, not a transcriptome.) My pick is to use HT-seq to obtain the read counts and use DESeq to identify differentially expressed genes.
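To make the count-based idea concrete, here is a rough sketch of what "read counts" means when the reference is a transcriptome: tally the mapped reads per reference name in the SAM records. This is not HT-seq and does not reproduce its counting rules; the function name `count_reads_per_reference` and the choice to skip secondary alignments are assumptions for illustration only.

```python
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Count mapped primary alignments per reference name (SAM column 3).

    Hypothetical helper; a simplification, not the HT-seq algorithm.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header records
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        reference = fields[2]
        if flag & 0x4 or reference == "*":
            continue  # read is unmapped
        if flag & 0x100:
            continue  # secondary alignment, skipped to avoid double counting
        counts[reference] += 1
    return counts
```

The resulting per-transcript counts, one column per sample, are the kind of table a count-based tool such as DESeq takes as input.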

**jdjax** · 08-10-2011, 05:57 AM

DZhang,

I installed TopHat and tried using the accepted_hits.bam output from TopHat in cufflinks. But I received the same error: BAM header too large.

Do you have any other suggestions on what I can do?

Thanks.

**DZhang** · 08-10-2011, 06:46 AM

jdjax,

Did you sort the sam? The bam file produced by Tophat should be used as is. Please also post your cufflinks command.

**jdjax** · 08-10-2011, 09:46 AM

DZhang,

I did not sort the sam. I am just testing these programs out, so I did not use any options for tophat or cufflinks. Tophat made a file accepted_hits.bam. I used that file as input for cufflinks.

My cufflinks command was just: cufflinks accepted_hits.bam

I also want to be more descriptive about the errors I am receiving, in the hope of figuring out this problem. This is what the error stated:

cufflinks: /usr/lib64/libz.so.1 : no version information available
Warning: BAM header too large
File accepted_hits does not appear to be a valid BAM file, trying SAM
Inspecting reads and determining fragment length distribution.
SAM error on line 2880: CIGAR op has zero length
SAM error on line 3240: CIGAR op has zero length
SAM error on line 3464: CIGAR op has zero length
SAM error on line 5063: CIGAR op has zero length
SAM error on line 30750: CIGAR op has zero length
SAM error on line 51722: CIGAR op has zero length

This continues with increasing line numbers until it reaches the end of the file.
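Note that these CIGAR errors appear only after cufflinks gave up on reading the file as BAM and fell back to SAM parsing, so they may be a symptom of the first problem and not real defects in the alignments. One way to check is to convert the BAM to SAM text and scan the CIGAR column directly. A minimal sketch, with the hypothetical function name `zero_length_cigar_lines`:

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def zero_length_cigar_lines(sam_lines):
    """Return (line_number, cigar) pairs whose CIGAR has a zero-length op.

    Line numbers are 1-based and include header lines, so they match the
    line numbers of the SAM file itself. Hypothetical helper for illustration.
    """
    bad = []
    for number, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):
            continue  # header record, no CIGAR column
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        cigar = fields[5]
        if cigar == "*":
            continue  # CIGAR not available for this read
        if any(int(length) == 0 for length, _op in CIGAR_OP.findall(cigar)):
            bad.append((number, cigar))
    return bad
```

If this returns an empty list for the converted SAM, the alignments themselves are probably fine and the header problem is the one to chase.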
I have also checked /usr/lib64/libz.so.1 and it is in /usr/lib64

libz.so.1 -> libz.so.1.2.3

is what is present on the server.

Again thanks for your input. I appreciate any help. =)

**DZhang** · 08-10-2011, 10:48 AM

Hi jdjax,

1) Can you provide some background about your project? Type of reads, type of reference sequence, etc.
2) Tophat requires one mandatory parameter besides the read file(s). See below: -r/--mate-inner-dist <int> This is the expected (mean) inner distance between mate pairs. For example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs.

How did you set -r ?
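The arithmetic behind the manual's example is just the fragment length minus both read lengths. A tiny sketch, assuming both mates have the same length (the function name `mate_inner_distance` is made up here):

```python
def mate_inner_distance(fragment_length, read_length):
    """Expected inner distance between mates of a paired-end fragment.

    Assumes both mates have the same read length. A negative result means
    the two reads overlap. Hypothetical helper for illustration.
    """
    if fragment_length <= 0 or read_length <= 0:
        raise ValueError("lengths must be positive")
    return fragment_length - 2 * read_length


# The manual's example: 300bp fragments with 50bp at each end.
print(mate_inner_distance(300, 50))
```

For the manual's example this gives 300 - 2 * 50 = 200, the value to pass to -r.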

**DZhang** · 08-10-2011, 10:59 AM

Hi jdjax,

You should check the header information of your bam file. One way to do it is to convert bam to sam using samtools, then check the top portion of the sam files. (e.g., using 'more your.sam'). Let us know what you see in the header.
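With many thousands of contigs, paging through the header with 'more' gets tedious, so a summary can be easier to post. Here is a small sketch that counts each header record type and the total header size from the SAM text (`samtools view -H your.bam` prints only the header). The function name `summarize_sam_header` is an assumption for illustration.

```python
def summarize_sam_header(sam_lines):
    """Return (record_counts, header_bytes) for the header of a SAM file.

    record_counts maps each header tag such as "@SQ" to how often it occurs.
    Reading stops at the first non-header line. Hypothetical helper.
    """
    record_counts = {}
    header_bytes = 0
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header records always come before the alignments
        tag = line[:3]
        record_counts[tag] = record_counts.get(tag, 0) + 1
        header_bytes += len(line.encode("utf-8"))
    return record_counts, header_bytes
```

Usage would be something like `summarize_sam_header(open("your.sam"))`; the "@SQ" count is the number of reference sequences declared in the header.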

**jdjax** · 08-10-2011, 11:04 AM

DZhang,

These are 50 to 200 bp single reads, and the reference sequence I am using is the FASTA file of contigs I got from the Trinity assembly. This is a de novo project; I do not have a full reference genome. That is why I wanted to just use Bowtie: I did not think TopHat was necessary without a full genome.

The option -r is only required for paired end runs.


BAM header too large using cuffdiff
