Subread low percentage of assigned fragments in ENCODE data

  • Subread low percentage of assigned fragments in ENCODE data

    Hi there guys!

    I am using Subread v1.5.0-p1 to get tables of counts for RNAseq data I got from ENCODE (
    What I am trying to get are counts for genes (using reads assigned to exons) since I want to ultimately turn the numbers into FPKM or TPM.
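As a rough sketch of that last step (not part of featureCounts itself), FPKM and TPM can be computed from a gene-level count table plus gene lengths. The featureCounts output table includes a `Length` column that can serve as the length here; the function names and the toy numbers below are made up for illustration.

```python
def fpkm(counts, lengths_bp):
    """FPKM = fragments * 1e9 / (gene length in bp * total assigned fragments)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assigned fragments")
    return {g: c * 1e9 / (lengths_bp[g] * total) for g, c in counts.items()}


def tpm(counts, lengths_bp):
    """TPM: length-normalise first, then scale so all values sum to 1e6."""
    rate = {g: c / lengths_bp[g] for g, c in counts.items()}
    rate_sum = sum(rate.values())
    if rate_sum == 0:
        raise ValueError("no assigned fragments")
    return {g: r * 1e6 / rate_sum for g, r in rate.items()}


# Toy input (hypothetical genes, counts and lengths)
counts = {"geneA": 100, "geneB": 300}
lengths_bp = {"geneA": 1000, "geneB": 3000}

print(fpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

Note that both measures use the assigned fragments as the denominator, so a low assignment rate distorts them as well as the raw counts.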

    However, only around 13% of fragments are successfully assigned:

    || Total fragments : 72502756 ||
    || Successfully assigned fragments : 9324091 (12.9%) ||
    || Running time : 2.20 minutes ||

    This doesn't sound right to me. I am used to getting 50-75% of reads assigned in RNAseq using HTseq + DESeq2. I don't expect ENCODE data to be of poor quality, so I think it's something on my side.

    I tried all the combinations of fr-firststrand, fr-secondstrand, rf-firststrand, rf-secondstrand etc. None of them gives me more than about 13%.
    I am using the GENCODE M3 annotation GTF, since the data was originally aligned to mm9.
    This is a paired-end library from an Illumina GAIIx sequencer.

    Any tips?

  • #2
    Could you show the content of the ".summary" file which was part of featureCounts output?
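The ".summary" file is a small tab-separated table with a header row followed by one row per assignment status, so it is easy to turn into percentages and see where the unassigned fragments went. Below is a minimal sketch that assumes a single-sample file. The `Assigned` count is the one from the post above; the `Unassigned_NoFeatures` count and the BAM name are invented for the example, and the exact set of status names depends on the Subread version.

```python
def parse_summary(text):
    """Parse featureCounts .summary text into {status: count} (first sample only)."""
    rows = [ln.split("\t") for ln in text.splitlines() if ln.strip()]
    body = rows[1:]  # rows[0] is the header: "Status", then the BAM name
    return {row[0]: int(row[1]) for row in body}


def breakdown(counts):
    """Return (status, count, percent) tuples for non-zero rows, largest first."""
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(s, n, 100.0 * n / total) for s, n in ranked if n > 0]


# Illustrative content only: the Unassigned_NoFeatures value is made up
EXAMPLE = (
    "Status\tsample.bam\n"
    "Assigned\t9324091\n"
    "Unassigned_Ambiguity\t0\n"
    "Unassigned_NoFeatures\t63178665\n"
)

for status, n, pct in breakdown(parse_summary(EXAMPLE)):
    print(f"{status}\t{n}\t{pct:.1f}%")
```

A large share of fragments in a row such as `Unassigned_NoFeatures` would suggest that the alignments do not overlap the annotated features at all, which usually points at the annotation rather than the strandedness setting.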


    • #3
      I found the problem: I was using the wrong GTF annotation. I should have used GENCODE mouse M1, since vM3 is built on mm10, so its coordinates don't match data aligned to mm9.