Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • garlic
    Junior Member
    • Oct 2009
    • 4

    Difficulty using .gtf file in featureCounts read summarization

    Hello,

    I am doing differential expression analysis on an insect bacterial endosymbiont. I have Illumina HiSeq reads that are paired-end and strand-specific.

    I have mapped my reads to a preliminary genome assembly for the endosymbiont with bbmap, and am trying to summarize the results with featureCounts. I annotated the preliminary genome with RAST, and am having difficulty using the exported .gtf file from RAST in featureCounts.

    This is a sample annotation from RAST:

    tig00000001_pilon FIG CDS 734 1288 . + 2 ID=fig|707884.3.peg.1;Name=Mobile element protein

    I am interested in summarizing at the MetaFeature level, since there is really no difference between genes and exons in this case. I used the below command (knowing that the Name field probably wasn't what featureCounts was looking for, since it doesn't represent a unique identifier).

    featureCounts(files=c("file1.bam","file2.bam","file3.bam","file4.bam","file5.bam","file6.bam","file7.bam","file8.bam"),strandSpecific=2,isPairedEnd=TRUE,requireBothEndsMapped=TRUE,isGTFAnnotationFile=TRUE,GTF.attrType="ID",GTF.featureType="Name",useMetaFeatures=TRUE,annot.ext="/path/to/file/file.gtf")

    I got this error:

    || WARNING no features were loaded in format GTF. ||
    || annotation format can be specified using '-F'. ||
    Failed to open the annotation file /path/to/file/file.gtf, or its format is incorrect, or it contains no 'exon' features.
    Error in file(file, "rt") : cannot open the connection
    In addition: Warning message:
    In file(file, "rt") :
    cannot open file './.Rsubread_featureCounts_pid40183': No such file or directory



    I then tried to edit the .gtf file to comply a bit better with the format specifications that I saw described (giving a unique identifier to 'exon', etc.):

    tig00000001_pilon FIG CDS 734 1288 . + 2 gene_id 707884.3.peg.1; exon 707884.3.peg.1.1

    I ran the below command, but I got the exact same error:

    featureCounts(files=c("file1.bam","file2.bam","file3.bam","file4.bam","file5.bam","file6.bam","file7.bam","file8.bam"),strandSpecific=2,isPairedEnd=TRUE,requireBothEndsMapped=TRUE,isGTFAnnotationFile=TRUE,GTF.attrType="gene_id",GTF.featureType="exon",useMetaFeatures=TRUE,annot.ext="/path/to/file/file.gtf")

    Is there something fundamentally wrong that I'm doing here?

    Thanks!
    Ryan
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    At a minimum chromosome identifiers have to perfectly match between your bam files and the GTF files. Check that first.

    Comment

    • garlic
      Junior Member
      • Oct 2009
      • 4

      #3
      Hi GenoMax.

      Yes, the chromosome/contig names are identical between the bam and gtf files. Does featureCounts strictly require another line for each entry that corresponds to a feature, like "exon"?

      Comment

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        I don't think so. Your GTF file is likely not in the correct format. Check that as well.

        Comment

        Latest Articles

        Collapse

        ad_right_rmr

        Collapse

        News

        Collapse

        Topics Statistics Last Post
        Started by SEQadmin2, 06-05-2026, 10:09 AM
        0 responses
        12 views
        0 reactions
        Last Post SEQadmin2  
        Started by SEQadmin2, 06-04-2026, 08:59 AM
        0 responses
        24 views
        0 reactions
        Last Post SEQadmin2  
        Started by SEQadmin2, 06-02-2026, 12:03 PM
        0 responses
        28 views
        0 reactions
        Last Post SEQadmin2  
        Started by SEQadmin2, 06-02-2026, 11:40 AM
        0 responses
        22 views
        0 reactions
        Last Post SEQadmin2  
        Working...