
  • HTseq error

    Hi,

    Does anyone know how to solve these CIGAR type errors in HTSeq? It seems unusable until this is solved.

    python -m HTSeq.scripts.count in_sorted.bam ref.gff -f bam -s no -t CDS -m union -i locus_tag

    Error occured when reading beginning of SAM/BAM file.
    Unknown CIGAR code '=' encountered.
    [Exception type: ValueError, raised in _HTSeq.pyx:1163]

    Or better yet, is there another program to get raw counts from BAM and GFF files? Don't say 'bedtools', because that doesn't work either: it endlessly gives '0' for all the counts.

    Thanks,

    S.

  • #2
    featureCounts is much faster. I'm a little surprised that htseq-count was never updated to support the = and X cigar operators.



    • #3
      Which package is featureCounts from? I need to script, preferably in Python, to handle a large number of BAM files, so it's hard to interact with R (can't script in R). I guess I will have to learn R!

      s.



      • #4
        It seems your BAM file is encoded with SAM v1.4, where the CIGAR string distinguishes between matches and mismatches. You can use reformat.sh from the BBMap suite to convert from SAM 1.4 to SAM 1.3. See https://www.biostars.org/p/182156/#182160 and http://seqanswers.com/forums/showthread.php?t=46174
        Last edited by Michael.Ante; 04-12-2016, 12:28 AM. Reason: typo
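To illustrate what the SAM 1.4 to SAM 1.3 conversion amounts to for a single CIGAR string, here is a minimal sketch in Python. It assumes the only change needed is to rewrite `=` (match) and `X` (mismatch) as `M` and merge neighbouring runs. The helper name `cigar_14_to_13` is hypothetical and not part of HTSeq or BBMap; for whole files, reformat.sh as suggested above is the practical route.

```python
import re

# One CIGAR operation: a length followed by an operator code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_14_to_13(cigar):
    """Rewrite '=' and 'X' as 'M' and merge adjacent runs of the same operator."""
    if cigar == "*":  # "*" means the CIGAR is unavailable; leave it alone
        return cigar
    ops = _CIGAR_OP.findall(cigar)
    # Every character must belong to some operation, otherwise the input is malformed.
    if sum(len(length) + 1 for length, _ in ops) != len(cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    merged = []
    for length, op in ops:
        if op in "=X":
            op = "M"  # SAM 1.3 style: match and mismatch are both 'M'
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)  # extend the previous run
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)
```

For example, `cigar_14_to_13("10=1X5=")` gives `"16M"`, which is the form htseq-count expects.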



        • #5
          featureCounts is part of the subRead package, which is written in either java or C++ (I'd have to look). We script this in python as well. Since it's a regular executable it's not a problem to do.
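As a rough sketch of scripting featureCounts from Python, the snippet below builds the command line and hands it to `subprocess`. It assumes `featureCounts` is on your `PATH`; the function names and the defaults (taken from the htseq-count call in the first post) are only illustrative, so check the flags against the Subread documentation for your version.

```python
import subprocess


def build_featurecounts_cmd(bam_files, annotation, output,
                            feature_type="CDS", attribute="locus_tag",
                            strand=0):
    """Assemble a featureCounts argument list for one or more BAM files."""
    return [
        "featureCounts",
        "-a", annotation,    # annotation file (GTF/GFF)
        "-o", output,        # output count table
        "-t", feature_type,  # feature type to count (htseq-count: -t)
        "-g", attribute,     # attribute used to group features (htseq-count: -i)
        "-s", str(strand),   # 0 = unstranded (htseq-count: -s no)
        *bam_files,
    ]


def run_featurecounts(bam_files, annotation, output, **kwargs):
    """Run featureCounts and raise CalledProcessError if it exits non-zero."""
    cmd = build_featurecounts_cmd(bam_files, annotation, output, **kwargs)
    subprocess.run(cmd, check=True)
    return output


# Example call (file names are placeholders):
# run_featurecounts(["in_sorted.bam"], "ref.gff", "counts.txt")
```

Passing the arguments as a list avoids shell quoting problems with odd file names, and several BAM files can go into a single call so the counts end up in one table.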



          • #6
            Originally posted by dpryan
            featureCounts is part of the subRead package, which is written in either java or C++ (I'd have to look). We script this in python as well. Since it's a regular executable it's not a problem to do.
            It is written in C (not C++ or java).



            • #7
              featureCounts works great. Thanks for the tip.



              • #8
                Originally posted by shi
                It is written in C (not C++ or java).
                @Wei: Any update on adding support for SAM v.1.4 tags to featureCounts?



                • #9
                  Originally posted by shi
                  It is written in C (not C++ or java).
                  I knew I should have looked this up... :P



                  • #10
                    Originally posted by GenoMax
                    @Wei: Any update on adding support for SAM v.1.4 tags to featureCounts?
                    Hi @GenoMax, could you be more specific on what tags you'd like featureCounts to support?
