Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • shi
    Wei Shi
    • Feb 2010
    • 236

    #61
    What Bruce suggested is the current way of getting assignment stats from using featureCounts. But apparently it is not user-friendly. It is on our to-do list to output assignment stats (with no need to turn on -R option) when read summarization is done, but it will take a week or two to implement it. We will make a new release in a day or so, but it is not going to be included in that release.

    Best,
    Wei

    Comment

    • bw.
      Member
      • Mar 2012
      • 21

      #62
      Thank you Wei. I switched to using featureCounts.
      Its especially great that it now takes multiple .bam files.

      Whatever statistics can be provided in future versions will be helpful.

      Comment

      • shi
        Wei Shi
        • Feb 2010
        • 236

        #63
        We have just released Subread 1.4.2. The featureCounts program included in this release outputs an assignment summary file (*.summary) along with the read count file.

        BTW, our featureCounts paper was just published on Bioinformatics. Here is the link to it:



        Wei

        Comment

        • AntonioMaceo
          Junior Member
          • Dec 2013
          • 1

          #64
          SAM/BAM parse error

          Hi Wei,
          I may have found a bug. It seems that the last line of my BAM file header is being parsed as the fist read, since it is quite long this causes the program to exit (line 697 in core.c). When I remove the last @PG line from the header and reheader the bam file your code runs to completion. Please see the attached header.
          cheers,
          Aaron
          Attached Files

          Comment

          • chibouki
            Junior Member
            • Dec 2011
            • 3

            #65
            Hello,
            I think I may have a problem with my gtf file, which look like this:
            ##gff-version 3
            ##source-version geneious 6.1.6
            seq_ref Geneious gene 1 109 . . . Name=exon1;created by=User

            I received this error message:
            Warning: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
            The specified gene identifier attribute is 'exon'
            The attributes included in your GTF annotation are 'Name=exon1;created by=User;modified by=User'

            I hope you could help me

            Comment

            • shi
              Wei Shi
              • Feb 2010
              • 236

              #66
              Originally posted by AntonioMaceo View Post
              Hi Wei,
              I may have found a bug. It seems that the last line of my BAM file header is being parsed as the fist read, since it is quite long this causes the program to exit (line 697 in core.c). When I remove the last @PG line from the header and reheader the bam file your code runs to completion. Please see the attached header.
              cheers,
              Aaron
              Dear Aaron,

              Thanks for reporting this. You are correct that there was a bug in featureCounts in dealing with long header lines. We have fixed this and released a patched version of Subread package (1.4.3-p1).

              Best wishes,
              Wei

              Comment

              • shi
                Wei Shi
                • Feb 2010
                • 236

                #67
                Originally posted by chibouki View Post
                Hello,
                I think I may have a problem with my gtf file, which look like this:
                ##gff-version 3
                ##source-version geneious 6.1.6
                seq_ref Geneious gene 1 109 . . . Name=exon1;created by=User

                I received this error message:
                Warning: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
                The specified gene identifier attribute is 'exon'
                The attributes included in your GTF annotation are 'Name=exon1;created by=User;modified by=User'

                I hope you could help me

                Your GTF file has an incorrect format. Have a look at this page for GTF format:

                Comment

                • chibouki
                  Junior Member
                  • Dec 2011
                  • 3

                  #68
                  Finally, we solve the problem by cutting the last column of sam file.

                  But we are disappointing, it seems that -d and -D options are only for pair-end?

                  Comment

                  • shi
                    Wei Shi
                    • Feb 2010
                    • 236

                    #69
                    I don't understand how you solved the problem by changing the sam file? The problem is with your GTF annotation file, not your sam file.

                    Comment

                    • chibouki
                      Junior Member
                      • Dec 2011
                      • 3

                      #70
                      I know, I stilled receive an error message about the column nine of the gtf (I may have changed it since I posted here for the first time...) but it worked online after cutting the sam

                      Comment

                      • shi
                        Wei Shi
                        • Feb 2010
                        • 236

                        #71
                        Do not change your sam file, otherwise you may get unexpected results. Just change the 9th column of your gtf from for example

                        Name=exon1;created by=User;modified by=User

                        to

                        gene_id "exon1" (space delimited)

                        Comment

                        • sindrle
                          Senior Member
                          • Aug 2013
                          • 266

                          #72
                          Hi!
                          Im trying your package, using UCSC hg19 genes.GTF.

                          Im getting this error:

                          || Load annotation file genes.gtf ... ||
                          || Number of features is 0 ||
                          || WARNING no features were loaded in format GTF. ||
                          || annotation format can be specified using '-F'.

                          This is my code:

                          featureCounts(files=BAMs,file., annot.ext=gtf,isGTFAnnotationFile=TRUE,useMetaFeatures=TRUE,GTF.featureType=featureGENE,GTF.attrType=attributeGENE,nthreads=8, reportReads=TRUE)

                          "BAMs" are a string character with the file names
                          "gtf" is the gene.gtf
                          "attributeGENE" is "gene_id"

                          EDIT:
                          Had to delete this: GTF.attrType=attributeGENE
                          guess it was unnecessary
                          Last edited by sindrle; 01-31-2014, 05:53 PM.

                          Comment

                          • sindrle
                            Senior Member
                            • Aug 2013
                            • 266

                            #73
                            Here is my comparison so far:

                            HS 68,2 56,9 57,2 64,7 59,7 63,8 60,9
                            FC 62.5 56.8 57.1 64.6 59.5 63.6 60.7

                            HTseq was run with "intersection strict", so maybe thats why it counts a little bit more reads I guess.

                            I like that featureCounts is implemented in R, its easy to run
                            On my Macbook the running time is about the same as for HTseq, but maybe its because I had the option "reportReads = TRUE".

                            I also like that it reports gene length and may handle multiple input files, it also names the output according to the input automatically.

                            What will make or break it is how easily its implemented with the DEseq2 workflow.. HTSeq is very easy to use.
                            Last edited by sindrle; 01-31-2014, 06:55 PM.

                            Comment

                            • yangliao
                              Member
                              • May 2013
                              • 13

                              #74
                              Dear Sindrle,

                              Could you please provide a couple of lines in "UCSC hg19 genes.GTF"? There are some GTF or GFF files not in the format that is currently supported by featureCounts.

                              Yang

                              Originally posted by sindrle View Post
                              Hi!
                              Im trying your package, using UCSC hg19 genes.GTF.

                              Im getting this error:

                              || Load annotation file genes.gtf ... ||
                              || Number of features is 0 ||
                              || WARNING no features were loaded in format GTF. ||
                              || annotation format can be specified using '-F'.

                              This is my code:

                              featureCounts(files=BAMs,file., annot.ext=gtf,isGTFAnnotationFile=TRUE,useMetaFeatures=TRUE,GTF.featureType=featureGENE,GTF.attrType=attributeGENE,nthreads=8, reportReads=TRUE)

                              "BAMs" are a string character with the file names
                              "gtf" is the gene.gtf
                              "attributeGENE" is "gene_id"

                              EDIT:
                              Had to delete this: GTF.attrType=attributeGENE
                              guess it was unnecessary

                              Comment

                              • sindrle
                                Senior Member
                                • Aug 2013
                                • 266

                                #75
                                Im a bit disappointed, after finishing read summary after 5.5 hours, R just crashed.. R version 3.0.2 Rsubread version 1.12.6.

                                Maybe I can read the "reported output", the .featureCount files thats 2x the size of the BAMs (!)

                                Heres the GTF:

                                chr1 unknown CDS 69091 70005 . + 0 gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown exon 69091 70008 . + . gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown start_codon 69091 69093 . + . gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown stop_codon 70006 70008 . + . gene_id "OR4F5"; gene_name "OR4F5"; p_id "P1230"; transcript_id "NM_001005484"; tss_id "TSS14428";
                                chr1 unknown exon 134773 139696 . - . gene_id "LOC729737"; gene_name "LOC729737"; transcript_id "NR_039983"; tss_id "TSS18541";
                                chr1 unknown exon 139790 139847 . - . gene_id "LOC729737"; gene_name "LOC729737"; transcript_id "NR_039983"; tss_id "TSS18541";

                                Comment

                                Latest Articles

                                Collapse

                                ad_right_rmr

                                Collapse

                                News

                                Collapse

                                Topics Statistics Last Post
                                Started by SEQadmin2, 06-05-2026, 10:09 AM
                                0 responses
                                14 views
                                0 reactions
                                Last Post SEQadmin2  
                                Started by SEQadmin2, 06-04-2026, 08:59 AM
                                0 responses
                                24 views
                                0 reactions
                                Last Post SEQadmin2  
                                Started by SEQadmin2, 06-02-2026, 12:03 PM
                                0 responses
                                28 views
                                0 reactions
                                Last Post SEQadmin2  
                                Started by SEQadmin2, 06-02-2026, 11:40 AM
                                0 responses
                                22 views
                                0 reactions
                                Last Post SEQadmin2  
                                Working...