  • HTseq error

    Hi,

    Anyone know how to solve these CIGAR-type errors in HTSeq? It seems unusable until this is solved.

    python -m HTSeq.scripts.count in_sorted.bam ref.gff -f bam -s no -t CDS -m union -i locus_tag

    Error occured when reading beginning of SAM/BAM file.
    Unknown CIGAR code '=' encountered.
    [Exception type: ValueError, raised in _HTSeq.pyx:1163]

    Or better yet, is there another program to get raw counts from BAM and GFF files? Don't say 'bedtools', because that doesn't work either: it endlessly gives '0' for all the counts.

    Thanks,

    S.

  • #2
    featureCounts is much faster. I'm a little surprised that htseq-count was never updated to support the = and X cigar operators.



    • #3
      Which package is featureCounts from? I need to script this, preferably in Python, to handle a large number of BAM files, so it's hard to interact with R (I can't script in R). I guess I will have to learn R!

      s.



      • #4
        It seems your BAM file is encoded with SAM v1.4, where the CIGAR string distinguishes between matches (=) and mismatches (X) instead of using M for both. You can use reformat.sh from the BBMap suite to convert from SAM 1.4 to SAM 1.3. See https://www.biostars.org/p/182156/#182160 and http://seqanswers.com/forums/showthread.php?t=46174
        Last edited by Michael.Ante; 04-12-2016, 12:28 AM. Reason: typo
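To illustrate what such a conversion does to each alignment, here is a minimal Python sketch that rewrites a single SAM 1.4-style CIGAR string into the older form by turning `=` and `X` into `M` and merging adjacent runs. The function name `collapse_cigar` is hypothetical and not part of BBMap or HTSeq; this only shows the string transformation, not how reformat.sh is implemented, and it does not touch the rest of the SAM record.

```python
import re

# One CIGAR element: a length followed by a single operator code.
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def collapse_cigar(cigar: str) -> str:
    """Rewrite '=' and 'X' operators as 'M' and merge adjacent identical runs.

    Hypothetical helper for illustration only.
    """
    if cigar == "*":  # '*' means no CIGAR is available; leave it alone
        return cigar

    merged = []  # list of [length, operator]
    pos = 0
    for match in _CIGAR_ELEMENT.finditer(cigar):
        if match.start() != pos:
            raise ValueError(f"Malformed CIGAR string: {cigar!r}")
        pos = match.end()
        length, op = int(match.group(1)), match.group(2)
        if op in "=X":
            op = "M"  # SAM 1.3 style: match or mismatch, not distinguished
        if merged and merged[-1][1] == op:
            merged[-1][0] += length  # extend the previous run
        else:
            merged.append([length, op])
    if pos != len(cigar):
        raise ValueError(f"Malformed CIGAR string: {cigar!r}")

    return "".join(f"{length}{op}" for length, op in merged)


print(collapse_cigar("10=1X5="))
```

For whole BAM files, a dedicated tool such as the reformat.sh mentioned above is the safer route; the sketch is only meant to show why the '=' in the error message disappears after conversion.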



        • #5
          featureCounts is part of the subRead package, which is written in either java or C++ (I'd have to look). We script this in python as well. Since it's a regular executable it's not a problem to do.
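As a rough sketch of scripting it from Python, the snippet below builds a featureCounts command line and runs it once per BAM file with `subprocess`. The helper names `featurecounts_cmd` and `count_all`, the output file naming, and the default values are assumptions for illustration. The options mirror the htseq-count call from the first post (`-t CDS`, `-g locus_tag`, `-s 0` for unstranded); whether a particular GFF file is accepted as-is depends on the featureCounts version, so check the Subread documentation before relying on this.

```python
import subprocess
from pathlib import Path


def featurecounts_cmd(bam, annotation, out, feature_type="CDS",
                      attribute="locus_tag", strand=0, threads=1):
    """Build the argument list for one featureCounts run (hypothetical helper)."""
    return [
        "featureCounts",
        "-a", str(annotation),   # annotation file
        "-o", str(out),          # output count table
        "-t", feature_type,      # feature type to count
        "-g", attribute,         # attribute used to group features
        "-s", str(strand),       # 0 = unstranded
        "-T", str(threads),      # number of threads
        str(bam),
    ]


def count_all(bam_files, annotation, out_dir):
    """Run featureCounts on each BAM file; assumes it is on the PATH."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for bam in bam_files:
        out = out_dir / (Path(bam).stem + ".counts.txt")
        # check=True raises CalledProcessError if featureCounts fails
        subprocess.run(featurecounts_cmd(bam, annotation, out), check=True)


# Example: count_all(["in_sorted.bam"], "ref.gff", "counts")
```

featureCounts also accepts several BAM files in a single invocation, which produces one combined table; the per-file loop here is just the simplest form to adapt.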



          • #6
            Originally posted by dpryan
            featureCounts is part of the subRead package, which is written in either java or C++ (I'd have to look). We script this in python as well. Since it's a regular executable it's not a problem to do.
            It is written in C (not C++ or java).



            • #7
              featureCounts works great. Thanks for the tip.



              • #8
                Originally posted by shi
                It is written in C (not C++ or java).
                @Wei: Any update on adding support for SAM v.1.4 tags to featureCounts?



                • #9
                  Originally posted by shi
                  It is written in C (not C++ or java).
                  I knew I should have looked this up... :P



                  • #10
                    Originally posted by GenoMax
                    @Wei: Any update on adding support for SAM v.1.4 tags to featureCounts?
                    Hi @GenoMax, could you be more specific on what tags you'd like featureCounts to support?
