  • dan
    wiki wiki
    • Jul 2008
    • 194

    FASTQ alignment metrics (RNA-Seq)?

    Hello,

    How do people judge the quality of a FASTQ (short read) alignment? In particular I'm interested in evaluating RNA-Seq alignments, typically (but not exclusively) from Illumina instruments.

    What comes to mind is:
    * Fraction of reads mapped
    * Fraction of reads mapped uniquely
    * Fraction of 'good' pairs (right orientation, right distance)

    and for RNA-Seq specifically
    * Fraction of reads mapping within a gene

    Anything based on read mapping quality?

    What other metrics can we think of?
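
A rough sketch of how the fractions listed above could be computed straight from SAM records by decoding the FLAG field. The FLAG bit values come from the SAM specification; the function name and the `min_unique_mapq` threshold are assumptions made for illustration, and using MAPQ as a stand-in for "uniquely mapped" is only a proxy, since MAPQ scales differ between aligners.

```python
from collections import Counter

# SAM FLAG bits, as defined in the SAM specification.
PAIRED, PROPER_PAIR, UNMAPPED = 0x1, 0x2, 0x4
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def alignment_fractions(sam_lines, min_unique_mapq=10):
    """Summarise primary records from an iterable of SAM text lines.

    min_unique_mapq is an illustrative threshold, not a standard value:
    adjust it for the aligner that produced the file.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        counts["reads"] += 1
        if flag & UNMAPPED:
            continue
        counts["mapped"] += 1
        if mapq >= min_unique_mapq:
            counts["unique"] += 1
        if flag & PAIRED and flag & PROPER_PAIR:
            counts["proper_pair"] += 1
    total = counts["reads"] or 1  # avoid division by zero on empty input
    return {key: counts[key] / total for key in ("mapped", "unique", "proper_pair")}
```

For a BAM file, the same function could be fed the text output of `samtools view -h`. The fraction of reads within genes needs an annotation file and is not covered here.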
    Homepage: Dan Bolser
    MetaBase the database of biological databases.
  • annaprotasio
    Junior Member
    • Feb 2008
    • 6

    #2
    hi Dan,

    Have a look at "samtools flagstat"

    The output will look something like this, and I think it contains all the info you requested.

    Code:
    7276199 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 duplicates
    7276199 + 0 mapped (100.00%:-nan%)
    7276199 + 0 paired in sequencing
    3787000 + 0 read1
    3489199 + 0 read2
    6195536 + 0 properly paired (85.15%:-nan%)
    6795026 + 0 with itself and mate mapped
    481173 + 0 singletons (6.61%:-nan%)
    480036 + 0 with mate mapped to a different chr
    480036 + 0 with mate mapped to a different chr (mapQ>=5)
    good luck
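
If you want these numbers as fractions in a script, the flagstat text above can be parsed in a few lines. This is only a sketch: `parse_flagstat` and `count_for` are hypothetical helper names, and the exact wording of the descriptions differs between samtools versions, so the prefixes used for lookup are assumptions based on the output shown here.

```python
import re

# Matches lines such as "6195536 + 0 properly paired (85.15%:-nan%)".
FLAGSTAT_LINE = re.compile(r"^\s*(\d+) \+ (\d+) (.+)$")


def parse_flagstat(text):
    """Map each flagstat description to its QC-passed read count."""
    counts = {}
    for line in text.splitlines():
        match = FLAGSTAT_LINE.match(line)
        if match:
            passed, _failed, description = match.groups()
            counts[description.strip()] = int(passed)
    return counts


def count_for(counts, prefix):
    """Return the first count whose description starts with `prefix`."""
    for description, value in counts.items():
        if description.startswith(prefix):
            return value
    raise KeyError(prefix)


example = """\
7276199 + 0 in total (QC-passed reads + QC-failed reads)
7276199 + 0 mapped (100.00%:-nan%)
6195536 + 0 properly paired (85.15%:-nan%)
481173 + 0 singletons (6.61%:-nan%)
"""
counts = parse_flagstat(example)
total = count_for(counts, "in total")
for label in ("mapped (", "properly paired", "singletons"):
    print(label, round(100 * count_for(counts, label) / total, 2))
```

Note that flagstat does not report uniquely mapped reads or reads within genes, so those two metrics need another tool.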


    • GenoMax
      Senior Member
      • Feb 2008
      • 7142

      #3
      Also take a look at RSeQC: http://rseqc.sourceforge.net/

      Most aligners also report their own alignment statistics, e.g. BBMap, TopHat, and probably STAR as well.


      • maxsalm
        Member
        • Feb 2015
        • 18

        #4
        FastQC may also be of general use: http://www.bioinformatics.babraham.a...ojects/fastqc/


        • dan
          wiki wiki
          • Jul 2008
          • 194

          #5
          Originally posted by maxsalm
          I agree it's useful, but it's not what I want here.

          • jwfoley
            Senior Member
            • Jun 2009
            • 183

            #6
            How about proportion of duplicate fragments? This will depend on whether you've done single- or paired-end reads, though, since with single RNA-seq reads you do expect a certain amount of duplication by chance (with paired reads it's a much smaller chance).
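
A minimal sketch of that duplicate proportion, assuming each fragment has already been reduced to a coordinate key. The key layout is my assumption for illustration: `(chrom, start, strand)` for single-end reads and `(chrom, start, end, strand)` for paired-end fragments. The extra coordinate in the paired key is what makes chance collisions rarer.

```python
from collections import Counter


def duplicate_fraction(fragments):
    """Fraction of fragments that repeat an already-seen coordinate key.

    Each item must be hashable, e.g. (chrom, start, strand) for
    single-end reads or (chrom, start, end, strand) for paired-end.
    """
    seen = Counter(fragments)
    total = sum(seen.values())
    if total == 0:
        return 0.0
    # Every key contributes one "original"; the rest are duplicates.
    return (total - len(seen)) / total


# Same 5' start but different fragment ends: duplicates as single reads,
# distinct fragments as pairs.
single = [("chr1", 100, "+"), ("chr1", 100, "+"), ("chr1", 200, "+")]
paired = [("chr1", 100, 350, "+"), ("chr1", 100, 420, "+"), ("chr1", 200, 400, "+")]
print(duplicate_fraction(single), duplicate_fraction(paired))
```

Dedicated duplicate-marking tools apply more careful rules (soft clipping, mate orientation), so treat this as a rough estimate only.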


            • bjackson
              Junior Member
              • May 2015
              • 6

              #7
              I mostly work with single-end reads, but for alignment quality I look primarily at
              1) pct of reads mapped
              2) pct of reads uniquely mapped

              It sounds like you are also asking about post-alignment QC in general, so I would add
              3) read duplication (i.e. how many reads align to the identical location) - most locations should be covered by only one or a few reads.
              4) read biotype distribution (most reads should map to protein-coding regions)
              5) cumulative pct measures - I sort genes by count or FPKM and plot the number of genes against the cumulative percentage of reads. That shows whether you are sinking a lot of reads into a few very abundant transcripts, in which case you might need more depth to see less abundant ones.
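
The cumulative percentage measure in point 5 can be sketched as below, assuming you already have a per-gene read count table as a dict. The function name and the toy gene names are made up for illustration.

```python
def cumulative_read_share(gene_counts):
    """Return (rank, cumulative_pct) pairs, genes sorted from most to least counted."""
    ordered = sorted(gene_counts.values(), reverse=True)
    total = sum(ordered)
    if total == 0:
        return []
    curve, running = [], 0
    for rank, count in enumerate(ordered, start=1):
        running += count
        curve.append((rank, 100.0 * running / total))
    return curve


# Toy counts: one gene takes half of all reads.
toy_counts = {"geneA": 50, "geneB": 30, "geneC": 20}
for rank, pct in cumulative_read_share(toy_counts):
    print(rank, pct)
```

Plotting rank against cumulative percentage gives the curve described above. A curve that climbs steeply over the first few genes suggests that a handful of abundant transcripts is absorbing most of the depth.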
