  • Error in bowtie's sam output?

    Hi,

    I'm not sure whether it's a bug in bowtie's SAM output, but all my mapping qualities come out as 255, while the bit flag only takes the values 0, 4, or 16. I ran bowtie in colorspace.

    Has anyone else encountered a similar issue?

    Here's a sample of the SAM output with the quirky values.

    1279_33_430_F3 4 * 0 0 * * 0 0 TGGTGACCTGGGCCCTGAGGANCGCGTTGATCTACCTCCGCTCAATTGT IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII XM:i:0
    1279_33_409_F3 16 chr6 49186545 255 50M * 0 0 TATTCGTACTGAAAATCAAGATCAAGCGAGCTTTTGCCCTTCTGCTCCAC Iqqqqqqqqqqqqqqqqqqqqqqqqqq!Iq!!qqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:50 NM:i:0 CM:i:2
    1279_34_783_F3 0 chr16 11144009 255 50M * 0 0 TATGTGCTTGGCTGAGGAGCCAATGGGGCGAAGCTACCATCTGTGGGATT Iqqqqqqqqqqqqqqqqqqqq!Iqqqqqqqqqqqqqqqqqqqqqqq!!qI XA:i:2 MD:Z:50 NM:i:0 CM:i:2
    1279_40_121_F3 0 chr17 39983712 255 50M * 0 0 ACGGGGAATCAGGGTTCGATTCCGGAGAGGGAGCCTGAGAAACGGCTACC Iqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq!!qqq!!qqqqI XA:i:2 MD:Z:50 NM:i:0 CM:i:2
    1279_41_567_F3 16 chr6 49186546 255 50M * 0 0 ATTCGTACTGAAAATCAAGATCAAGCGAGCTTTTGCCCTTCTGCTCCACG IqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqI XA:i:0 MD:Z:50 NM:i:0 CM:i:0


    -burt

  • #2
    A related post, also unanswered so far: http://seqanswers.com/forums/showthread.php?t=10624


    • #3
      The MAPQ field takes pairing into account if the read is paired. If that calculation is difficult, 255 is written instead, indicating that the mapping quality is not available. I assume your query sequences came with quality scores (FASTQ files). Maybe your query sequences were not paired? It is very likely that this is not an error and that you need not worry about it, but do confirm this.
      Last edited by Joker!sAce; 04-11-2011, 03:45 AM.


      • #4
        Hi, thanks for the reply. My input data is a set of single-end reads. I'm pretty sure it's a problem with the SAM output in colorspace.

        Just something to add to the problem description. If I look only at reads that are mappable, all of them are reported as 50M, i.e. perfectly matching. This is very unusual, as it's highly unlikely for all reads to map perfectly to the reference genome, especially when we are investigating the transcriptome landscape of the cell.


        • #5
          Hi

          I encountered the same issue as burt. My input data is single-end reads sequenced on the Illumina HiSeq 2000 platform. I wonder whether reads with a MAPQ value of 255 are filtered out before FPKM values are calculated in downstream analyses (for instance, cufflinks/cuffdiff)?

          Any reply will be appreciated.

          feixue1039


          • #6
            Originally posted by burt
            If I look only at reads that are mappable, all of them are reported as 50M, i.e. perfectly matching.
            That's not what the CIGAR string means. 50M only means there are no indels and no hard or soft clipping at the ends. Other fields later in the SAM line (the optional tags, such as NM and MD) indicate what discrepancies exist between the read and the reference.
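To make the point above concrete, here is a minimal sketch that splits one of the records posted at the top of this thread into its fields and optional tags. It assumes the SAM specification's tag meanings (NM: edit distance to the reference, MD: mismatch string, CM: edit distance in color space); the helper name `parse_sam_line` is hypothetical, and the record is split on whitespace only because the forum collapsed the original tabs.

```python
def parse_sam_line(line):
    """Split one SAM alignment line into core fields and optional tags."""
    # Real SAM files are tab-separated; whitespace splitting is an assumption
    # made here because the forum paste lost the tabs.
    fields = line.split()
    record = {
        "qname": fields[0],
        "flag": int(fields[1]),
        "rname": fields[2],
        "pos": int(fields[3]),
        "mapq": int(fields[4]),
        "cigar": fields[5],
    }
    tags = {}
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    for item in fields[11:]:
        name, typ, value = item.split(":", 2)
        tags[name] = int(value) if typ == "i" else value
    record["tags"] = tags
    return record


# Second record from the first post in this thread.
line = (
    "1279_33_409_F3 16 chr6 49186545 255 50M * 0 0 "
    "TATTCGTACTGAAAATCAAGATCAAGCGAGCTTTTGCCCTTCTGCTCCAC "
    "Iqqqqqqqqqqqqqqqqqqqqqqqqqq!Iq!!qqqqqqqqqqqqqqqqqI "
    "XA:i:2 MD:Z:50 NM:i:0 CM:i:2"
)
rec = parse_sam_line(line)
print(rec["cigar"], rec["tags"])
```

For this record the CIGAR is 50M, NM is 0 and MD is 50, while CM is 2: the mismatches are reported in the color-space tag rather than in the CIGAR string.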


            • #7
              Hi,

              I am seeing the same thing with my single stranded (non-paired) RNA-seq alignments.

              Each read either has a mapping quality of 255 and a bit flag of 16, or is unmapped and has a bit flag of 4.

              Is this common to all non-paired RNA-seq data? I'm new to RNA-seq.

              Thanks in advance


              • #8
                Hi again,

                I should mention that I used the latest version of bowtie for this alignment.

                I've been told by a colleague that bowtie doesn't calculate mapping quality values, so it prints 255 in the MAPQ field of the SAM file if the read aligns and 0 otherwise.
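A minimal sketch of how a script might interpret the flag and MAPQ values seen in this thread. It assumes only what the SAM specification defines: bit 0x4 marks an unmapped read, bit 0x10 marks a read aligned to the reverse strand, and a MAPQ of 255 means the mapping quality is not available. The function name `describe` is hypothetical.

```python
UNMAPPED = 0x4   # SAM flag bit: segment unmapped
REVERSE = 0x10   # SAM flag bit: read aligned to the reverse strand


def describe(flag, mapq):
    """Summarise a (FLAG, MAPQ) pair for a single-end read."""
    if flag & UNMAPPED:
        return "unmapped"
    strand = "-" if flag & REVERSE else "+"
    # 255 is the SAM convention for "mapping quality not available".
    quality = "MAPQ unavailable" if mapq == 255 else f"MAPQ {mapq}"
    return f"mapped, {strand} strand, {quality}"


# The three combinations reported in this thread.
for flag, mapq in [(4, 0), (16, 255), (0, 255)]:
    print(flag, mapq, describe(flag, mapq))
```

Under these assumptions, flags 0 and 16 differ only in strand, so a constant 255 on mapped single-end reads is consistent with the aligner simply not computing a quality.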


                • #9
                  Hi,

                  Do you know what 100M means in bowtie's SAM output?
                  I ran bowtie with single-end reads.
                  My reads are 100 bases long.

                  Thanks.


                  • #10
                    Hi.

                    Thanks for the answer about 255 in bowtie's SAM output.


                    • #11
                      Originally posted by edge
                      Hi,

                      Do you know what 100M means in bowtie's SAM output?
                      I ran bowtie with single-end reads.
                      My reads are 100 bases long.

                      Thanks.
                      It's generally better to start a new thread than to resurrect a really old one. Nevertheless, the "100M" is the CIGAR string, which is defined in the SAM specification. It means 100 aligned positions, each of which may be a match or a mismatch (practically, this just means that there were no indels or clipping). I recommend familiarising yourself with the SAM format if you're going to do much with sequencing data.
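As an illustration of reading a CIGAR string, here is a small sketch that splits it into (length, operation) pairs using the operation letters listed in the SAM specification (M, I, D, N, S, H, P, =, X). The function names `parse_cigar` and `has_indel_or_clip` are hypothetical, and the second example string is made up for contrast.

```python
import re

# One CIGAR element: a run length followed by a single operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Return a list of (length, op) pairs; '*' means no CIGAR is available."""
    if cigar == "*":
        return []
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings containing anything the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def has_indel_or_clip(cigar):
    """True if the alignment contains insertions, deletions, or clipping."""
    return any(op in "IDSH" for _, op in parse_cigar(cigar))


print(parse_cigar("100M"))           # a single run of 100 aligned positions
print(has_indel_or_clip("100M"))
print(parse_cigar("3S45M2I50M"))     # made-up example with clipping and an insertion
```

Note that M only says the bases are aligned to the reference; whether they agree with it has to be read from tags such as NM or MD.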


                      • #12
                        Thanks, dpryan.

                        I will have a look at the document that you shared.
                        Really appreciate it.
