  • Brian Bushnell
    replied
    Originally posted by blancha:
    If you check the official SAM format specification, you'll see that M is for alignment match, and "can be a sequence match or mismatch". 125 bases aligned, but there still can be mismatches, in this case 3.

    At least, that is my understanding of the convoluted SAM format.
    Yep, that's correct. But the more recent versions of the SAM specification also allow mismatches to be reported in the CIGAR string. You can see this by mapping with BBMap, which uses the 'X' and '=' symbols.
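
To make the relationship between 'M' and '='/'X' concrete, here is a minimal sketch that rewrites an MD tag as an '='/'X' CIGAR. It assumes the alignment's CIGAR is a single M run (as with the 125M records in this thread), so there are no indels or clipping to handle. The function name `md_to_extended_cigar` is made up for illustration and is not part of BBMap or any other tool.

```python
import re

def md_to_extended_cigar(md):
    """Turn an MD tag value (without the 'MD:Z:' prefix) into an =/X CIGAR.

    Sketch only: valid when the original CIGAR is one M run. Deletions,
    written as '^ACG' in MD, are rejected rather than handled.
    """
    if "^" in md:
        raise ValueError("deletions are not handled in this sketch")
    ops = []
    # MD alternates match-run lengths with mismatched reference bases.
    for run, base in re.findall(r"(\d+)|([A-Za-z])", md):
        if run:
            if int(run) > 0:              # a '0' only separates adjacent mismatches
                ops.append([int(run), "="])
        elif ops and ops[-1][1] == "X":
            ops[-1][0] += 1               # merge consecutive mismatches
        else:
            ops.append([1, "X"])
    return "".join(f"{n}{op}" for n, op in ops)

print(md_to_extended_cigar("88T36"))        # 88=1X36=
print(md_to_extended_cigar("5G31G50T36"))   # 5=1X31=1X50=1X36=
```

Both results still cover 125 bases, which is why a plain 125M is compatible with an NM of 1 or 3.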


  • yinshe
    replied
    Thanks! That is clearer. It seems 'M' and 'X'/'=' give somewhat redundant information.


  • blancha
    replied
    Originally posted by yinshe:
    But both alignments say 125 base pairs matching (CIGAR), so there are no base differences. It seems the SAM record gives different information? Or am I misunderstanding something?

    If you check the official SAM format specification, you'll see that M is for alignment match, and "can be a sequence match or mismatch". 125 bases aligned, but there still can be mismatches, in this case 3.

    At least, that is my understanding of the convoluted SAM format.
    Last edited by blancha; 10-29-2015, 11:06 AM.


  • yinshe
    replied
    Hi blancha,

    Thanks for the explanation! But both alignments say 125 base pairs matching (CIGAR), so there are no base differences. It seems the SAM record gives different information? Or am I misunderstanding something?


  • blancha
    replied
    I can't find a formal definition anywhere of what a MAPQ of 0 means in BWA.
    There are only forum posts saying that a MAPQ of 0 means that a read has multiple hits.

    In your example, the second alignment has the NM tag set to 3, meaning the edit distance to the reference (number of nucleotide differences) is 3.
    The NM tag is set to 1 in the first alignment.

    One could surmise that the 1st alignment is unique in the sense that the second alignment is of such poor quality that it doesn't count.

    Admittedly, this is just wild speculation.
    There should be a formal definition of MAPQ set to 0 to which aligners should adhere, to make the interpretation of the mapping quality less arduous.

    It is certain that the second alignment is of far lesser quality than the first, so it does make sense that the mapping quality is much lower.
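
As a rough illustration of the comparison above, the following sketch pulls the optional tags out of the two records posted earlier in the thread. The helper `parse_sam_tags` is a hypothetical name, not a function from any SAM library. The records are copied as posted, including the elided SEQ and QUAL columns, and are separated by spaces here because that is how they were pasted (real SAM files use tabs).

```python
def parse_sam_tags(line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM line as a dict."""
    fields = line.split()          # real SAM is tab-separated; split() accepts both
    tags = {}
    for field in fields[11:]:      # the first 11 columns are mandatory
        name, typ, value = field.split(":", 2)
        tags[name] = int(value) if typ == "i" else value
    return tags

# Records as posted in the thread (SEQ/QUAL were truncated by the poster).
primary = ("b38_1:146691684-146691808 16 16 71053369 23 125M * 0 0 "
           "AGCTGAAA.... IIIIIIIIIIII.... NM:i:1 MD:Z:88T36 AS:i:120 XS:i:110")
secondary = ("b38_1:146691684-146691808 272 GL000192.1 263206 0 125M * 0 0 "
             "* * NM:i:3 MD:Z:5G31G50T36 AS:i:110")

for label, rec in (("chr16", primary), ("GL000192.1", secondary)):
    tags = parse_sam_tags(rec)
    print(label, "NM =", tags["NM"], "AS =", tags["AS"])
```

Note that the alignment score of the second record (AS 110) equals the suboptimal score reported on the first (XS 110), so the two hits are not scored as equally good even though both CIGARs read 125M.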


  • yinshe
    replied
    Here is some more detail about this question:

    The fastq file used in the alignment is not a fastq file from a sequencer. I sliced the HYDIN2 sequence into small pieces, each 125 bp long. I assigned base quality as 30 ("I") for all bases, so all bases have a high base quality. When I did the alignment, I asked bwa to also output secondary alignments (using the -a option). The records I mentioned are as follows:

    b38_1:146691684-146691808 16 16 71053369 23 125M * 0 0 AGCTGAAA.... IIIIIIIIIIII.... NM:i:1 MD:Z:88T36 AS:i:120 XS:i:110
    b38_1:146691684-146691808 272 GL000192.1 263206 0 125M * 0 0 * * NM:i:3 MD:Z:5G31G50T36 AS:i:110
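
The FLAG column in these two records (16 and 272) is a bit field. A small sketch for decoding it, using the bit values from the SAM specification, is shown below. The dictionary labels and the function name `decode_flag` are chosen here for readability and are not official names.

```python
# Bit values follow the SAM specification; the labels are informal.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "last_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}

def decode_flag(flag):
    """List the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]

print(decode_flag(16))    # ['reverse_strand']
print(decode_flag(272))   # ['reverse_strand', 'secondary']
```

So the chr16 record is the primary alignment on the reverse strand, and the GL000192.1 record is a secondary alignment, which is also why its SEQ and QUAL are written as `*`.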


  • N311V
    replied
    It is also my understanding that mapping quality in that case would be zero for both.

    Is there an option to randomly keep one of the multiple mappings rather than discard all of them in bwa mem?


  • yinshe
    started a topic bwa mem mapping quality on ambiguous mapping reads

    bwa mem mapping quality on ambiguous mapping reads

    Hello,

    I used bwa mem to align 125 bp single-end reads to the human decoy reference genome. I know bwa assigns a mapping quality of zero when a read maps to two or more locations in the genome. However, I noticed some reads that map equally well to different genomic locations, e.g. one read maps equally well to an autosome (chr16) and to one of the patches (GL000192). The CIGAR for both alignments is 125M. However, the mapping quality for the alignment on chr16 is 23, while the alignment on GL000192 got a mapping quality of zero. I thought both of them should have a mapping quality of zero. Is this right or not?

    thanks!
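
For context on what the two MAPQ values mean, the SAM specification defines MAPQ as -10 log10 of the probability that the mapping position is wrong, with 255 meaning the value is unavailable. The sketch below only applies that definition. It says nothing about how bwa mem arrives at 23 or 0, and the function name is made up for illustration.

```python
def mapq_to_error_probability(mapq):
    """Convert a SAM MAPQ value to the implied probability of a wrong position.

    Returns None for 255, which the SAM specification reserves for
    'mapping quality not available'.
    """
    if mapq == 255:
        return None
    return 10 ** (-mapq / 10)

for q in (23, 0):
    print(q, mapq_to_error_probability(q))
```

Under this definition a MAPQ of 23 corresponds to an error probability of roughly 0.005, while a MAPQ of 0 corresponds to a probability of 1, meaning the aligner gives no confidence in that placement.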
