  • How to count the number of mapped paired-end and single-end RNA-seq reads

    Does anyone know how to count the number of mapped paired-end and single-end RNA-seq reads using BAM files?
    It seems samtools idxstats does not give exactly the mapped-read information. Any suggestions will be appreciated!

  • #2
    Try using samtools flagstat.


    • #3
      That gives the number of mapped loci, not the number of mapped reads.


      • #4
        It generates a summary of reads based on the SAM FLAG in column 2 of the BAM file:

        4255310402 + 0 in total (QC-passed reads + QC-failed reads)
        0 + 0 duplicates
        4252238423 + 0 mapped (99.93%:nan%)
        4255310402 + 0 paired in sequencing
        2102851470 + 0 read1
        2152458932 + 0 read2
        362406042 + 0 properly paired (8.52%:nan%)
        4217472878 + 0 with itself and mate mapped
        34765545 + 0 singletons (0.82%:nan%)
        3616654841 + 0 with mate mapped to a different chr
        3273787 + 0 with mate mapped to a different chr (mapQ>=5)
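
The percentages in this output can be re-derived from the raw counts, which makes it clearer what each one is a fraction of. The following is a minimal Python sketch, not part of samtools: `parse_flagstat` and `percent` are hypothetical helper names, and the parsing assumes the `N + M label` flagstat layout shown above.

```python
import re

# flagstat output copied from the post above
FLAGSTAT = """\
4255310402 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
4252238423 + 0 mapped (99.93%:nan%)
4255310402 + 0 paired in sequencing
2102851470 + 0 read1
2152458932 + 0 read2
362406042 + 0 properly paired (8.52%:nan%)
4217472878 + 0 with itself and mate mapped
34765545 + 0 singletons (0.82%:nan%)
3616654841 + 0 with mate mapped to a different chr
3273787 + 0 with mate mapped to a different chr (mapQ>=5)
"""


def parse_flagstat(text):
    """Return {label: (qc_passed, qc_failed)} for each flagstat line."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+) \+ (\d+) (.+)", line)
        if not m:
            continue
        # Drop a trailing "(..%:..%)" field; labels such as "(mapQ>=5)" are kept.
        label = re.sub(r"\s*\([^()]*%[^()]*\)\s*$", "", m.group(3)).strip()
        counts[label] = (int(m.group(1)), int(m.group(2)))
    return counts


def percent(counts, label, total_prefix="in total"):
    """Percentage of QC-passed records carrying `label`, relative to the total."""
    total = next(v[0] for k, v in counts.items() if k.startswith(total_prefix))
    return 100.0 * counts[label][0] / total
```

With the counts above, `percent(parse_flagstat(FLAGSTAT), "mapped")` comes to about 99.93, matching the figure flagstat prints. The read1 and read2 counts sum to the total but are not equal to each other, which would be consistent with the records being alignments rather than distinct reads.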


        • #5
          99.93% mapping? I don't think that means 99.93% of your reads are mapped. 100% mapping is not possible, or at least too good to be true.


          • #6
            Yes, 99.93% read mapping, although it doesn't include the quality of the mapping. You'll have to look that up in the BAM file independently.
            Last edited by rdeborja; 01-05-2013, 03:42 PM.
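
One way to look at mapping quality separately is to tally the MAPQ field (column 5 of each SAM record) over the mapped records. The sketch below works on SAM-format text lines, for example the output of `samtools view`; `mapq_histogram` is a hypothetical helper name, and how a given aligner assigns MAPQ values is not covered here.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Tally MAPQ (column 5) over mapped records in SAM-format text lines."""
    hist = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # 0x4 = segment unmapped, MAPQ is not meaningful
            continue
        hist[int(fields[4])] += 1
    return hist
```

Passing `sys.stdin` as `sam_lines` would let this read records piped from `samtools view`, assuming the script is run that way.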


            • #7
              If you look at any published studies (2010-12), you will typically see 80-90%, not ~100%. What does that tell us? TopHat always reports 100%. Something is wrong, isn't it?


              • #8
                Originally posted by repinementer
                If you look at any published studies (2010-12), you will typically see 80-90%, not ~100%. What does that tell us? TopHat always reports 100%. Something is wrong, isn't it?
                There's nothing wrong there. If I remember correctly, TopHat produces BAM files containing only the mapped reads (accepted_hits.bam). The unmapped reads are written to a separate file, I think. That's why the BAM files show 100% mapped reads (in fact it should be 100%, not ~99%).
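
The difference between the two percentages comes down to the denominator. As a small sketch (the function name `mapping_rate` and the numbers in the usage note are made up for illustration, not taken from this thread), the rate reported in publications would be computed against the reads that went into the aligner, not against the records in a BAM file that holds only mapped reads:

```python
def mapping_rate(mapped_reads, input_reads):
    """Percentage of input reads that were mapped."""
    if input_reads <= 0:
        raise ValueError("input_reads must be positive")
    return 100.0 * mapped_reads / input_reads
```

For example, with 850 mapped reads out of 1000 input reads, `mapping_rate(850, 1000)` gives 85.0, whereas dividing by the 850 records present in a mapped-only BAM file gives 100.0.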

                Dario


                • #9
                  I find it helpful to use bam_stat.py from RSeQC or Picard's CollectAlignmentSummaryMetrics to get the number of reads that mapped one or more times (which you don't get from flagstat).
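
To illustrate the distinction between alignment records and reads, here is a minimal Python sketch that counts each read once no matter how many loci it maps to. It is not how RSeQC or Picard compute their metrics; `count_mapped_reads` is a hypothetical helper that works on SAM-format text lines (for example from `samtools view`) and relies only on the standard SAM FLAG bits 0x4 (unmapped), 0x40 (first in pair) and 0x80 (last in pair).

```python
def count_mapped_reads(sam_lines):
    """Count distinct mapped reads, split into read1, read2 and single-end."""
    seen = set()
    for line in sam_lines:
        # Skip blank lines and header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & 0x4:  # this segment is unmapped
            continue
        if flag & 0x40:
            mate = "read1"
        elif flag & 0x80:
            mate = "read2"
        else:
            mate = "single_end"
        # A read mapped to several loci shares one (name, mate) key.
        seen.add((qname, mate))
    counts = {"read1": 0, "read2": 0, "single_end": 0}
    for _, mate in seen:
        counts[mate] += 1
    counts["total"] = len(seen)
    return counts
```

The set of read names is held in memory, so for a file with billions of records this approach would likely need to be replaced by something that streams, such as counting only primary alignments if the aligner marks secondary ones.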
