  • edge
    replied
    Thanks, dpryan.

    I will have a look at the document that you shared.
    I really appreciate it.


  • dpryan
    replied
    Originally posted by edge
    Hi,

    Do you know what 100M means in the bowtie SAM output?
    I ran bowtie in single-read mode.
    My input reads are 100 bases long.

    Thanks.

    It's generally better to start a new thread rather than to resurrect a really old one. Nevertheless, the "100M" is part of the CIGAR string, which is defined in the SAM specification. It means 100 alignment matches: 100 read bases aligned to 100 reference bases, each of which may be a sequence match or a mismatch (practically, this just means that there were no indels or clipping). I recommend familiarising yourself with the SAM format if you're going to do much with sequencing data.
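
To make the CIGAR explanation concrete, here is a minimal Python sketch that splits a CIGAR string into its operations and reports how many read and reference bases it covers. The function names (`parse_cigar`, `query_length`, `reference_span`) are made up for illustration; only the operation letters come from the SAM specification.

```python
import re

# Operation letters from the SAM specification.
# M, I, S, = and X consume read bases; M, D, N, = and X consume reference bases.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_QUERY = set("MIS=X")
CONSUMES_REFERENCE = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string such as '100M' into (length, operation) pairs."""
    if cigar == "*":  # '*' means the CIGAR is unavailable (e.g. unmapped read)
        return []
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Rebuild the string to catch characters the regex silently skipped.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def query_length(ops):
    """Number of read bases covered by the CIGAR operations."""
    return sum(n for n, op in ops if op in CONSUMES_QUERY)


def reference_span(ops):
    """Number of reference bases covered by the alignment."""
    return sum(n for n, op in ops if op in CONSUMES_REFERENCE)


if __name__ == "__main__":
    for cigar in ("100M", "45M2I53M"):
        ops = parse_cigar(cigar)
        print(cigar, ops, query_length(ops), reference_span(ops))
```

For "100M" both lengths come out as 100, which is why a 100-base read without indels or clipping always shows this CIGAR regardless of how many mismatches it has.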


  • edge
    replied
    Hi.

    Thanks for the answer about the 255 in the bowtie SAM output.

    Hi,

    Do you know what 100M means in the bowtie SAM output?
    I ran bowtie in single-read mode.
    My input reads are 100 bases long.

    Thanks.


  • loodramon
    replied
    Hi again,

    I should mention that I used the latest version of bowtie for this alignment.

    I've been told by a colleague that: Bowtie doesn't calculate mapping quality values, so it prints 255 to the MAPQ field of the sam file if the read aligns or 0 otherwise.
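
A quick way to check this on your own file is to tally the MAPQ column. The sketch below is only an illustration: `mapq_summary` and `SAMPLE` are hypothetical names, and the sample records are abbreviated versions of the lines in the first post (SEQ and QUAL shortened). In the SAM specification, a MAPQ of 255 means the mapping quality is not available. Real SAM files are tab-separated; splitting on any whitespace is used here only because the pasted lines have lost their tabs.

```python
from collections import Counter

MAPQ_UNAVAILABLE = 255  # SAM specification: mapping quality not available
FLAG_UNMAPPED = 0x4     # SAM specification: read is unmapped

# Abbreviated records modelled on the thread's sample (SEQ/QUAL shortened).
SAMPLE = [
    "1279_33_430_F3 4 * 0 0 * * 0 0 TGGT IIII XM:i:0",
    "1279_33_409_F3 16 chr6 49186545 255 4M * 0 0 TATT Iqqq NM:i:0",
    "1279_34_783_F3 0 chr16 11144009 255 4M * 0 0 TATG Iqqq NM:i:0",
]


def mapq_summary(sam_lines):
    """Count unmapped reads, reads with MAPQ 255, and reads with a real MAPQ."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split()
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & FLAG_UNMAPPED:
            counts["unmapped"] += 1
        elif mapq == MAPQ_UNAVAILABLE:
            counts["mapped, MAPQ unavailable"] += 1
        else:
            counts["mapped, MAPQ reported"] += 1
    return counts


if __name__ == "__main__":
    print(dict(mapq_summary(SAMPLE)))
```

If every mapped read falls into the "MAPQ unavailable" bucket, that matches the behaviour described above rather than pointing to a corrupted file.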


  • loodramon
    replied
    Hi,

    I am seeing the same thing with my single-end (non-paired) RNA-seq alignments.

    Each read either has a mapping quality of 255 and a bit flag of 16, or is unmapped and has a bit flag of 4.

    Is this common to all non-paired RNA-seq data? I'm new to RNA-seq.

    Thanks in advance
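
For reference, the bit flag values seen in this thread (0, 4 and 16) can be decoded with a few lines of Python. This is a small sketch rather than a full SAM parser; `decode_flag` and `FLAG_BITS` are illustrative names, and the bit meanings are taken from the SAM specification.

```python
# Bit meanings from the SAM specification.
FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


if __name__ == "__main__":
    for flag in (0, 4, 16):
        print(flag, decode_flag(flag) or ["mapped to forward strand, unpaired"])
```

With single-end data none of the pairing bits are set, so mapped reads show 0 (forward strand) or 16 (reverse strand), and unmapped reads show 4.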


  • swbarnes2
    replied
    Originally posted by burt
    If I were to look only at reads that are mappable, all of them are reported to have 50M perfect matching.

    That's not what the CIGAR string means. 50M only means no indels and no hard or soft clipping at the ends. There are other elements later in the SAM line that indicate what discrepancies exist between the read and the reference.
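
Those later elements are the optional TAG:TYPE:VALUE fields after the eleventh column. The sketch below pulls them out of one of the records from the first post and counts mismatches from the MD string. `parse_tags` and `count_md_mismatches` are hypothetical helper names; NM (edit distance) and MD (mismatch string) are standard tags in the SAM specification, while XA and CM are aligner-specific and are simply passed through here. The record is split on whitespace because the pasted line has lost its tabs.

```python
import re

# One record copied from the first post of this thread.
RECORD = (
    "1279_41_567_F3 16 chr6 49186546 255 50M * 0 0 "
    "ATTCGTACTGAAAATCAAGATCAAGCGAGCTTTTGCCCTTCTGCTCCACG "
    "IqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqI "
    "XA:i:0 MD:Z:50 NM:i:0 CM:i:0"
)


def parse_tags(sam_line):
    """Return the optional fields of a SAM record as a dict."""
    tags = {}
    for item in sam_line.split()[11:]:  # the first 11 columns are mandatory
        tag, type_code, value = item.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    return tags


def count_md_mismatches(md):
    """Count mismatched reference bases in an MD string such as '10A5^AC6'."""
    tokens = re.findall(r"\d+|\^[A-Z]+|[A-Z]", md)
    # Digits are runs of matches; '^...' marks deleted bases, not mismatches.
    return sum(1 for t in tokens if t.isalpha())


if __name__ == "__main__":
    tags = parse_tags(RECORD)
    print(tags)
    print("mismatches from MD:", count_md_mismatches(tags["MD"]))
```

A read with CIGAR 50M can therefore still carry mismatches; they would show up as letters in MD and a non-zero NM, not in the CIGAR.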


  • feixue1039
    replied
    Hi

    I encountered the same issue as burt. My input data is single-end reads sequenced on the Illumina HiSeq 2000 platform. I just wonder whether reads with a MAPQ value of 255 are filtered out before FPKM values are calculated in downstream analyses (for instance, cufflinks/cuffdiff)?

    Any reply will be appreciated.

    feixue1039


  • burt
    replied
    Hi, thanks for the reply. My input data is a set of single-end reads. I'm pretty sure it's a problem with the SAM output in colorspace.

    Just something to add to the problem description. If I were to look only at reads that are mappable, all of them are reported to have 50M perfect matching. This is very unusual, as it's highly unlikely for all reads to map perfectly to the reference genome, especially when we are investigating the transcriptome landscape of the cell.


  • Joker!sAce
    replied
    The MAPQ field takes pairing into account if the read is paired. If such a calculation is difficult, 255 is used, indicating that the mapping quality is not available. I assume your query sequences came with quality scores (FASTQ files). Maybe your query sequences were not paired? It is highly probable that this is not an error and you need not worry about it, but do confirm this.
    Last edited by Joker!sAce; 04-11-2011, 03:45 AM.


  • me_myself_andI
    replied
    Related and so far also unanswered post: http://seqanswers.com/forums/showthread.php?t=10624


  • burt
    started a topic Error in bowtie's sam output?

    Error in bowtie's sam output?

    Hi,

    I'm not sure if it's a bug in bowtie's SAM output, but all my mapping qualities take the value 255, while the bit flag takes one of the values 0, 4 or 16. I ran bowtie in colorspace.

    Has anyone else encountered a similar issue?

    Here's a sample of the SAM output with the quirky values.

    1279_33_430_F3 4 * 0 0 * * 0 0 TGGTGACCTGGGCCCTGAGGANCGCGTTGATCTACCTCCGCTCAATTGT IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII XM:i:0
    1279_33_409_F3 16 chr6 49186545 255 50M * 0 0 TATTCGTACTGAAAATCAAGATCAAGCGAGCTTTTGCCCTTCTGCTCCAC Iqqqqqqqqqqqqqqqqqqqqqqqqqq!Iq!!qqqqqqqqqqqqqqqqqI XA:i:2 MD:Z:50 NM:i:0 CM:i:2
    1279_34_783_F3 0 chr16 11144009 255 50M * 0 0 TATGTGCTTGGCTGAGGAGCCAATGGGGCGAAGCTACCATCTGTGGGATT Iqqqqqqqqqqqqqqqqqqqq!Iqqqqqqqqqqqqqqqqqqqqqqq!!qI XA:i:2 MD:Z:50 NM:i:0 CM:i:2
    1279_40_121_F3 0 chr17 39983712 255 50M * 0 0 ACGGGGAATCAGGGTTCGATTCCGGAGAGGGAGCCTGAGAAACGGCTACC Iqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq!!qqq!!qqqqI XA:i:2 MD:Z:50 NM:i:0 CM:i:2
    1279_41_567_F3 16 chr6 49186546 255 50M * 0 0 ATTCGTACTGAAAATCAAGATCAAGCGAGCTTTTGCCCTTCTGCTCCACG IqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqI XA:i:0 MD:Z:50 NM:i:0 CM:i:0


    -burt
