HTseq error

  • HTseq error

    Hi,

    Does anyone know how to solve these CIGAR-type errors in HTSeq? It seems it is unusable until this is solved.

    python -m HTSeq.scripts.count in_sorted.bam ref.gff -f bam -s no -t CDS -m union -i locus_tag

    Error occured when reading beginning of SAM/BAM file.
    Unknown CIGAR code '=' encountered.
    [Exception type: ValueError, raised in _HTSeq.pyx:1163]

    Or better yet, is there another program to get raw counts from BAM and GFF files? Don't say 'bedtools', because that doesn't work either: it endlessly gives '0' for all the counts.

    Thanks,

    S.

  • #2
    featureCounts is much faster. I'm a little surprised that htseq-count was never updated to support the = and X cigar operators.


    • #3
      Which package is featureCounts from? I need to script this, preferably in Python, to handle a large number of BAM files, so it's hard to interact with R (I can't script in R). I guess I will have to learn R!

      s.


      • #4
        It seems your BAM file is encoded with SAM v1.4, where the CIGAR string distinguishes between matches and mismatches. You can use reformat.sh from the BBMap suite to convert from SAM 1.4 to SAM 1.3. See https://www.biostars.org/p/182156/#182160 and http://seqanswers.com/forums/showthread.php?t=46174
        Last edited by Michael.Ante; 04-12-2016, 12:28 AM. Reason: typo
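To illustrate what that conversion means at the CIGAR level, here is a minimal Python sketch. It is not how reformat.sh is implemented; it only assumes that a SAM 1.3-style string is obtained by rewriting `=` (sequence match) and `X` (sequence mismatch) as `M` and merging adjacent `M` runs. The function name `cigar_to_sam13` is hypothetical.

```python
import re

# One CIGAR element: a run length followed by an operator from the SAM spec.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_to_sam13(cigar: str) -> str:
    """Rewrite '=' and 'X' as 'M' and merge neighbouring 'M' runs."""
    if cigar == "*":  # '*' means the CIGAR is unavailable; leave it alone
        return cigar

    ops = _CIGAR_OP.findall(cigar)
    # Reject strings that the pattern could not consume completely.
    if "".join(length + op for length, op in ops) != cigar:
        raise ValueError(f"Malformed CIGAR string: {cigar!r}")

    merged = []  # list of [length, operator] pairs
    for length, op in ops:
        if op in "=X":
            op = "M"
        if merged and merged[-1][1] == op == "M":
            merged[-1][0] += int(length)  # extend the previous M run
        else:
            merged.append([int(length), op])

    return "".join(f"{length}{op}" for length, op in merged)


if __name__ == "__main__":
    print(cigar_to_sam13("5S20=1X29="))
```

Other operators (I, D, N, S, H, P) pass through unchanged, so the aligned length of the read is preserved. For real BAM files a dedicated tool such as the one suggested above is the safer route, since it also handles the rest of the record.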


        • #5
          featureCounts is part of the subRead package, which is written in either java or C++ (I'd have to look). We script this in python as well. Since it's a regular executable it's not a problem to do.
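As a rough sketch of what scripting this from Python could look like, the snippet below builds a featureCounts command line and runs it with `subprocess`. It assumes the `featureCounts` executable is on the PATH and maps the htseq-count options from the first post onto the featureCounts flags `-t` (feature type), `-g` (attribute used as ID) and `-s 0` (unstranded); there is no direct equivalent of `-m union`, so counts may differ slightly. The helper names and the output file name are hypothetical, and the flags are worth checking against the help text of the installed version.

```python
import subprocess


def build_featurecounts_cmd(bam_files, annotation, output,
                            feature_type="CDS", attribute="locus_tag",
                            strand=0):
    """Return the featureCounts argument list for one or more BAM files."""
    cmd = [
        "featureCounts",
        "-a", str(annotation),   # annotation file (GTF/GFF)
        "-o", str(output),       # table of raw counts
        "-t", feature_type,      # feature type to count over
        "-g", attribute,         # attribute that groups features
        "-s", str(strand),       # 0 = unstranded
    ]
    cmd.extend(str(bam) for bam in bam_files)
    return cmd


def run_featurecounts(bam_files, annotation, output, **kwargs):
    """Run featureCounts and raise CalledProcessError if it fails."""
    cmd = build_featurecounts_cmd(bam_files, annotation, output, **kwargs)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # File names taken from the first post; counts.txt is a made-up output name.
    print(" ".join(build_featurecounts_cmd(["in_sorted.bam"], "ref.gff",
                                           "counts.txt")))
```

Passing the arguments as a list avoids shell quoting problems when looping over many BAM files, and all files can be given in a single call so that the counts end up in one table.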


          • #6
            Originally posted by dpryan:
            featureCounts is part of the subRead package, which is written in either java or C++ (I'd have to look). We script this in python as well. Since it's a regular executable it's not a problem to do.
            It is written in C (not C++ or java).


            • #7
              featureCounts works great. Thanks for the tip.


              • #8
                Originally posted by shi:
                It is written in C (not C++ or java).
                @Wei: Any update on adding support for SAM v.1.4 tags to featureCounts?


                • #9
                  Originally posted by shi:
                  It is written in C (not C++ or java).
                  I knew I should have looked this up... :P
