  • rossh
    Junior Member
    • Nov 2011
    • 2

    Cufflinks, BAM header problem solved... for the moment

    Hi All,

    We were getting this error when running cufflinks on a BAM file:

    [ross@bioinfo tophat_test]$ cufflinks accepted_hits.bam
    You are using Cufflinks v1.1.0, which is the most recent release.
    Warning: BAM header too large
    File accepted_hits.bam doesn't appear to be a valid BAM file, trying SAM...
    [14:16:27] Inspecting reads and determining fragment length distribution.
    SAM error on line 1678: CIGAR op has zero length
    SAM error on line 2017: CIGAR op has zero length
    SAM error on line 2025: CIGAR op has zero length
    ...............
    ..........
    ....

    I have edited the hits.cpp file (line 590) in the source files from:

    static const unsigned MAX_HEADER_LEN = 4 * 1024 * 1024; // 4 MB
    to
    static const unsigned MAX_HEADER_LEN = 7 * 1024 * 1024; // 7 MB

    and ran "make" again. It seems to have fixed the problem.

    I would appreciate it if anyone has a link to the BAM file format specification, since this edit may cause a problem later on. I had a bit of a hunt around, but couldn't find it.

    Thanks

    Ross
  • BAMseek
    Senior Member
    • Apr 2011
    • 124

    #2
    A description of the SAM/BAM file format can be found here



    The BAM details start at section 3.

    Having a BAM header larger than 4 MB seems a bit odd, but I guess not impossible, especially if you have a lot of reference sequences. You could do a "samtools view -H", where -H means output header only, to see if the header is indeed larger than 4 MB.

    Justin
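The header-size check suggested above can also be done without samtools. This is a minimal sketch, assuming the 4 MB limit applies to the length of the plain-text header (the `l_text` field of the BAM format) and that "too large" means strictly larger than the limit. The function names are hypothetical, and the constant is copied from the `hits.cpp` line quoted in the first post.

```python
import gzip
import struct

# 4 MB, as in the MAX_HEADER_LEN line quoted from hits.cpp in this thread.
MAX_HEADER_LEN = 4 * 1024 * 1024


def bam_header_text_length(path):
    """Return l_text, the byte length of the plain-text header in a BAM file."""
    # BAM is BGZF-compressed; BGZF blocks are ordinary gzip members,
    # so the standard gzip module can decompress the stream.
    with gzip.open(path, "rb") as handle:
        magic = handle.read(4)
        if magic != b"BAM\x01":
            raise ValueError("not a BAM file (bad magic): %r" % magic)
        raw = handle.read(4)
        if len(raw) != 4:
            raise ValueError("truncated BAM header")
        (l_text,) = struct.unpack("<I", raw)  # little-endian 32-bit length
    return l_text


def header_exceeds_limit(path, limit=MAX_HEADER_LEN):
    """True if the header text is larger than the limit (an assumed comparison)."""
    return bam_header_text_length(path) > limit
```

Calling `header_exceeds_limit("accepted_hits.bam")` would then indicate whether raising the limit and recompiling is likely to be relevant for that file.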


    • rossh
      Junior Member
      • Nov 2011
      • 2

      #3
      Thanks Justin,
      I will take a look.

      Ross


      • BAMseek
        Senior Member
        • Apr 2011
        • 124

        #4
        I was able to get the same type of warning messages as you (Cufflinks complains that the BAM file does not appear to be in the correct format and there are warnings about invalid or 0 length cigar operations).

        When I do a samtools reheader, the warning messages go away and Cufflinks seems to operate just fine (I just pull the header off the BAM file and give it back to itself). I know there is some redundancy in how BAM files store sequence names and lengths, so maybe there is something going on there.

        Justin


        • tboothby
          Member
          • May 2011
          • 56

          #5
          Hi,
          I am having a similar problem. I run tophat/bowtie on my reads to generate an accepted_hits.bam file.

          When I try to run that file through cufflinks I get the following:
          .
          .
          .
          SAM error on line 25557292: CIGAR op has zero length
          SAM error on line 25596829: CIGAR op has zero length
          SAM error on line 25604145: invalid CIGAR operation
          SAM error on line 25612881: CIGAR op has zero length
          SAM error on line 25618288: CIGAR op has zero length
          > Processed 0 loci. [*************************] 100%


          I am sorry, but I don't really follow how you (above) fixed this problem. It seems that having a lot of references can cause this problem (I have a lot! I am using a reference transcriptome to make other RNAseq data). Could somebody please explain in a bit more detail how they fixed their problem?

          Cheers,
          T
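One way to tell whether messages like these reflect genuinely bad records or a mis-parsed file is to check the CIGAR column of a SAM export directly. This is a rough sketch, not Cufflinks' own validation: the function names are hypothetical, the operation letters are the standard SAM set (`MIDNSHP=X`), and the returned strings only mirror the wording of the log above.

```python
import re

# One CIGAR element: a count followed by a standard SAM operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# A whole CIGAR string: one or more such elements and nothing else.
CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP=X])+")


def cigar_problem(cigar):
    """Return None if the CIGAR looks fine, else a short description."""
    if cigar == "*":  # "*" means the CIGAR is unavailable, which is legal
        return None
    if not CIGAR_FULL.fullmatch(cigar):
        return "invalid CIGAR operation"
    if any(int(count) == 0 for count, _ in CIGAR_OP.findall(cigar)):
        return "CIGAR op has zero length"
    return None


def scan_sam(lines):
    """Yield (line_number, problem) for alignment lines with a suspect CIGAR."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):  # header line, no CIGAR column
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:  # CIGAR is the sixth column
            continue
        problem = cigar_problem(fields[5])
        if problem is not None:
            yield number, problem
```

If `scan_sam(open("accepted_hits.sam"))` yields nothing while Cufflinks still reports CIGAR errors on the BAM file, the alignments themselves are probably fine and the problem lies in how the BAM file is being read.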


          • BAMseek
            Senior Member
            • Apr 2011
            • 124

            #6
            This seemed to work for me, although I am not quite sure why . . .

            If A.bam is the problem file, then

            samtools view -H A.bam > header.sam
            samtools reheader header.sam A.bam > B.bam

            Now try running the Cufflinks analysis on B.bam (alternatively, you could create a SAM file and run Cufflinks on that). Both ways seemed to remove those warnings, at least in my case.

            Justin
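For anyone scripting this, the two samtools commands above can be wrapped as follows. This is only a sketch: it assumes `samtools` is on the PATH, the helper names are hypothetical, and the arguments are exactly those quoted in the post, with the shell redirections replaced by file handles.

```python
import subprocess


def reheader_commands(src_bam, header_sam):
    """Return the two samtools argument lists used in the post above."""
    dump_header = ["samtools", "view", "-H", src_bam]  # header only
    rehead = ["samtools", "reheader", header_sam, src_bam]
    return dump_header, rehead


def reheader(src_bam, dst_bam, header_sam="header.sam"):
    """Write the header of src_bam to header_sam, then re-attach it into dst_bam."""
    dump_header, rehead = reheader_commands(src_bam, header_sam)
    # Both commands write to stdout, so redirect stdout to the target files.
    with open(header_sam, "wb") as out:
        subprocess.run(dump_header, stdout=out, check=True)
    with open(dst_bam, "wb") as out:
        subprocess.run(rehead, stdout=out, check=True)
```

`reheader("A.bam", "B.bam")` would then produce the B.bam to pass to Cufflinks; `check=True` makes a failing samtools call raise instead of silently leaving an empty output file.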


            • BAMseek
              Senior Member
              • Apr 2011
              • 124

              #7
              Also, to add to that - my header is less than 4 MB. Cufflinks still has a max header length of 4 MB, so if your header is larger than 4 MB, then you might need to do as rossh suggested and increase the max length and recompile the code. If this is the case, then you should get the warning "Warning: BAM header too large".

              You might be able to instead run the analysis on a SAM file, as it does not appear that there is a maximum header length for a SAM file.


              • tboothby
                Member
                • May 2011
                • 56

                #8
                @BAM

                Thanks, I converted the .bam to .sam and am running that now. So far I have not encountered the problem.

                Cheers,
                T


                • BAMseek
                  Senior Member
                  • Apr 2011
                  • 124

                  #9
                  @tboothby - glad that did the trick.

                  Here is what I think is going on . . .

                  For me, the BAM file had sequence and length information but not a physical header section (there are potentially two places where BAM files store sequence and length info - a somewhat less attractive feature of the format), so Cufflinks was not getting the sequence and length information. For you, I'm guessing that the header was longer than 4 MB, which the Cufflinks BAM parser can't handle. In either case, Cufflinks was not able to get the full header information and parse the BAM file correctly. Looks like operating on the SAM file solves both of those problems.

                  Justin
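The two places mentioned above can be compared directly. This is a sketch under the assumption that the file follows the standard BAM layout (magic, `l_text`, header text, `n_ref`, then a name and length per reference); the function name is hypothetical. It returns the `@SQ` entries found in the text header alongside the binary reference list so that any mismatch is visible.

```python
import gzip
import struct


def read_bam_references(path):
    """Return (text_refs, binary_refs), each a list of (name, length) tuples."""
    with gzip.open(path, "rb") as handle:  # BGZF blocks are gzip members
        if handle.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file (bad magic)")
        (l_text,) = struct.unpack("<i", handle.read(4))
        text = handle.read(l_text).decode("ascii", errors="replace")
        (n_ref,) = struct.unpack("<i", handle.read(4))
        binary_refs = []
        for _ in range(n_ref):
            (l_name,) = struct.unpack("<i", handle.read(4))
            # Names are stored NUL-terminated; strip the terminator.
            name = handle.read(l_name).rstrip(b"\x00").decode("ascii")
            (l_ref,) = struct.unpack("<i", handle.read(4))
            binary_refs.append((name, l_ref))

    text_refs = []
    for line in text.rstrip("\x00").splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each tag looks like KEY:VALUE, separated by tabs.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in tags and "LN" in tags:
            text_refs.append((tags["SN"], int(tags["LN"])))
    return text_refs, binary_refs
```

An empty `text_refs` next to a populated `binary_refs` would correspond to the first case described above, a file with sequence information but no physical header section.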


                  • AsoBioInfo
                    Member
                    • Dec 2011
                    • 37

                    #10
                    CIGAR op has zero length

                    Edited: "PROBLEM SOLVED"

                    Hello all,

                    I am encountering the same error, "CIGAR op has zero length", when I run the following command:

                    ./cufflinks accepted_hits.bam

                    The error is displayed on several lines. I also converted the BAM file to a SAM file and tried to run cufflinks on that SAM file, but it displays an error message:

                    "AS attribute not supported"

                    I also tried to change the header as mentioned in the thread, but encountered errors. I also ran cufflinks on the sorted .bam file, but with no success.

                    When cufflinks is run on the test_data, it produces .expr and .gtf files, indicating that cufflinks is working (but only for the test data).

                    I ran cufflinks on Galaxy (web server) and it ran successfully on accepted_hits.bam, but the command line gives more flexibility and options, which is why I am more interested in it.

                    Hopefully the problem will be sorted

                    Thanks!
                    Last edited by AsoBioInfo; 05-07-2012, 05:36 AM. Reason: "Problem Solved"
