Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • shi
    Wei Shi
    • Feb 2010
    • 236

    #61
    What Bruce suggested is the current way of getting assignment stats from using featureCounts. But apparently it is not user-friendly. It is on our to-do list to output assignment stats (with no need to turn on -R option) when read summarization is done, but it will take a week or two to implement it. We will make a new release in a day or so, but it is not going to be included in that release.

    Best,
    Wei

    Comment

    • bw.
      Member
      • Mar 2012
      • 21

      #62
      Thank you Wei. I switched to using featureCounts.
      Its especially great that it now takes multiple .bam files.

      Whatever statistics can be provided in future versions will be helpful.

      Comment

      • shi
        Wei Shi
        • Feb 2010
        • 236

        #63
        We have just released Subread 1.4.2. The featureCounts program included in this release outputs an assignment summary file (*.summary) along with the read count file.

        BTW, our featureCounts paper was just published on Bioinformatics. Here is the link to it:



        Wei

        Comment

        • AntonioMaceo
          Junior Member
          • Dec 2013
          • 1

          #64
          SAM/BAM parse error

          Hi Wei,
          I may have found a bug. It seems that the last line of my BAM file header is being parsed as the fist read, since it is quite long this causes the program to exit (line 697 in core.c). When I remove the last @PG line from the header and reheader the bam file your code runs to completion. Please see the attached header.
          cheers,
          Aaron
          Attached Files

          Comment

          • chibouki
            Junior Member
            • Dec 2011
            • 3

            #65
            Hello,
            I think I may have a problem with my gtf file, which look like this:
            ##gff-version 3
            ##source-version geneious 6.1.6
            seq_ref Geneious gene 1 109 . . . Name=exon1;created by=User

            I received this error message:
            Warning: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
            The specified gene identifier attribute is 'exon'
            The attributes included in your GTF annotation are 'Name=exon1;created by=User;modified by=User'

            I hope you could help me

            Comment

            • shi
              Wei Shi
              • Feb 2010
              • 236

              #66
              Originally posted by AntonioMaceo View Post
              Hi Wei,
              I may have found a bug. It seems that the last line of my BAM file header is being parsed as the fist read, since it is quite long this causes the program to exit (line 697 in core.c). When I remove the last @PG line from the header and reheader the bam file your code runs to completion. Please see the attached header.
              cheers,
              Aaron
              Dear Aaron,

              Thanks for reporting this. You are correct that there was a bug in featureCounts in dealing with long header lines. We have fixed this and released a patched version of Subread package (1.4.3-p1).

              Best wishes,
              Wei

              Comment

              • shi
                Wei Shi
                • Feb 2010
                • 236

                #67
                Originally posted by chibouki View Post
                Hello,
                I think I may have a problem with my gtf file, which look like this:
                ##gff-version 3
                ##source-version geneious 6.1.6
                seq_ref Geneious gene 1 109 . . . Name=exon1;created by=User

                I received this error message:
                Warning: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
                The specified gene identifier attribute is 'exon'
                The attributes included in your GTF annotation are 'Name=exon1;created by=User;modified by=User'

                I hope you could help me

                Your GTF file has an incorrect format. Have a look at this page for GTF format:

                Comment

                • chibouki
                  Junior Member
                  • Dec 2011
                  • 3

                  #68
                  Finally, we solve the problem by cutting the last column of sam file.

                  But we are disappointing, it seems that -d and -D options are only for pair-end?

                  Comment

                  • shi
                    Wei Shi
                    • Feb 2010
                    • 236

                    #69
                    I don't understand how you solved the problem by changing the sam file? The problem is with your GTF annotation file, not your sam file.

                    Comment

                    • chibouki
                      Junior Member
                      • Dec 2011
                      • 3

                      #70
                      I know, I stilled receive an error message about the column nine of the gtf (I may have changed it since I posted here for the first time...) but it worked online after cutting the sam

                      Comment

                      • shi
                        Wei Shi
                        • Feb 2010
                        • 236

                        #71
                        Do not change your sam file, otherwise you may get unexpected results. Just change the 9th column of your gtf from for example

                        Name=exon1;created by=User;modified by=User

                        to

                        gene_id "exon1" (space delimited)

                        Comment

                        • sindrle
                          Senior Member
                          • Aug 2013
                          • 266

                          #72
                          Hi!
                          Im trying your package, using UCSC hg19 genes.GTF.

                          Im getting this error:

                          || Load annotation file genes.gtf ... ||
                          || Number of features is 0 ||
                          || WARNING no features were loaded in format GTF. ||
                          || annotation format can be specified using '-F'.

                          This is my code:

                          featureCounts(files=BAMs,file., annot.ext=gtf,isGTFAnnotationFile=TRUE,useMetaFeatures=TRUE,GTF.featureType=featureGENE,GTF.attrType=attributeGENE,nthreads=8, reportReads=TRUE)

                          "BAMs" are a string character with the file names
                          "gtf" is the gene.gtf
                          "attributeGENE" is "gene_id"

                          EDIT:
                          Had to delete this: GTF.attrType=attributeGENE
                          guess it was unnecessary
                          Last edited by sindrle; 01-31-2014, 05:53 PM.

                          Comment

                          • sindrle
                            Senior Member
                            • Aug 2013
                            • 266

                            #73
                            Here is my comparison so far:

                            HS 68,2 56,9 57,2 64,7 59,7 63,8 60,9
                            FC 62.5 56.8 57.1 64.6 59.5 63.6 60.7

                            HTseq was run with "intersection strict", so maybe thats why it counts a little bit more reads I guess.

                            I like that featureCounts is implemented in R, its easy to run
                            On my Macbook the running time is about the same as for HTseq, but maybe its because I had the option "reportReads = TRUE".

                            I also like that it reports gene length and may handle multiple input files, it also names the output according to the input automatically.

                            What will make or break it is how easily its implemented with the DEseq2 workflow.. HTSeq is very easy to use.
                            Last edited by sindrle; 01-31-2014, 06:55 PM.

                            Comment

                            • yangliao
                              Member
                              • May 2013
                              • 13

                              #74
                              Dear Sindrle,

                              Could you please provide a couple of lines in "UCSC hg19 genes.GTF"? There are some GTF or GFF files not in the format that is currently supported by featureCounts.

                              Yang

                              Originally posted by sindrle View Post
                              Hi!
                              Im trying your package, using UCSC hg19 genes.GTF.

                              Im getting this error:

                              || Load annotation file genes.gtf ... ||
                              || Number of features is 0 ||
                              || WARNING no features were loaded in format GTF. ||
                              || annotation format can be specified using '-F'.

                              This is my code:

                              featureCounts(files=BAMs,file., annot.ext=gtf,isGTFAnnotationFile=TRUE,useMetaFeatures=TRUE,GTF.featureType=featureGENE,GTF.attrType=attributeGENE,nthreads=8, reportReads=TRUE)

                              "BAMs" are a string character with the file names
                              "gtf" is the gene.gtf
                              "attributeGENE" is "gene_id"

                              EDIT:
                              Had to delete this: GTF.attrType=attributeGENE
                              guess it was unnecessary

                              Comment

                              • sindrle
                                Senior Member
                                • Aug 2013
                                • 266

                                #75
                                Im a bit disappointed, after finishing read summary after 5.5 hours, R just crashed.. R version 3.0.2 Rsubread version 1.12.6.

                                Maybe I can read the "reported output", the .featureCount files thats 2x the size of the BAMs (!)

                                Heres the GTF:

                                chr1 unknown CDS 69091 70005 . + 0 gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown exon 69091 70008 . + . gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown start_codon 69091 69093 . + . gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown stop_codon 70006 70008 . + . gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown exon 134773 139696 . - . gene_id "LOC729737"; gene_name "LOC729737"; transcript_id "NR_039983"; tss_id "TSS18541";
                                chr1 unknown exon 139790 139847 . - . gene_id "LOC729737"; gene_name "LOC729737"; transcript_id "NR_039983"; tss_id "TSS18541";

                                Comment

                                Latest Articles

                                Collapse

                                • SEQadmin2
                                  From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                                  by SEQadmin2


                                  Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                                  The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                                  ...
                                  Yesterday, 10:05 AM
                                • SEQadmin2
                                  Single-Cell Sequencing at an Inflection Point: Early Impacts of New Platforms and Emerging Trends
                                  by SEQadmin2


                                  With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.


                                  Introduction

                                  Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...
                                  05-22-2026, 06:42 AM
                                • SEQadmin2
                                  Environmental Genomics in the Age of NGS: From Microbes to Conservation Strategies
                                  by SEQadmin2

                                  Studying ecosystems means dealing with complex, multi-species communities that are hard to observe at scale. This complexity, however, hides many important questions to be answered, from how biogeochemical cycles work and how climate change can affect species distribution to how conservation strategies can work best.


                                  Genomics, particularly since the expansion of NGS, has transformed ecosystem ecology. By sequencing environmental DNA, we can now assess biodiversity without direct...
                                  05-06-2026, 09:04 AM

                                ad_right_rmr

                                Collapse

                                News

                                Collapse

                                Topics Statistics Last Post
                                Started by SEQadmin2, Yesterday, 12:03 PM
                                0 responses
                                19 views
                                0 reactions
                                Last Post SEQadmin2  
                                Started by SEQadmin2, Yesterday, 11:40 AM
                                0 responses
                                14 views
                                0 reactions
                                Last Post SEQadmin2  
                                Started by SEQadmin2, 05-28-2026, 11:40 AM
                                0 responses
                                29 views
                                0 reactions
                                Last Post SEQadmin2  
                                Started by SEQadmin2, 05-26-2026, 10:12 AM
                                0 responses
                                31 views
                                0 reactions
                                Last Post SEQadmin2  
                                Working...