  • Bowtie2 YT tag question

    Hello!
    I am currently working with a dataset of Illumina PE reads (length 100bp, fragment 500bp). Now, I know the definition for Bowtie's YT tag and the various values "CP", "UU", "UP", "DP", but I have stumbled upon an example where I do not understand why the pair has been assigned the YT:Z:UP tag.

    Here are the pair's alignments:

    FCC0WRYACXX:4:1115:3531:40304#ATGAGGAA 113 Chr6 3218530 12 100M Chr2 3121587 0 ATATCGCA...AGCTATCA qual AS:i:-5 XS:i:-17 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:9T90 YT:Z:UP

    FCC0WRYACXX:4:1115:3531:40304#ATGAGGAA 177 Chr2 3121587 0 100M Chr6 3218530 0 ACTACTAG...ATCGATAG qual AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YT:Z:UP

    I would expect these to be considered discordantly mapped (DP), not UP. Admittedly one of them has a MAPQ of 0, but it is mapped. If it were not mapped, some fields would contain '*' (and then I would understand the UP).
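
The claim that both mates are mapped can be checked from the FLAG field (column 2) of each record. A minimal sketch in Python, using the bit definitions from the SAM specification; the helper name `decode_flag` is made up for illustration:

```python
# Bit names follow the FLAG table in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# The two FLAG values from the records above.
for flag in (113, 177):
    print(flag, decode_flag(flag))
```

Neither 113 nor 177 has the unmapped (0x4) or mate-unmapped (0x8) bit set, and neither has the proper-pair bit (0x2), so both mates are reported as aligned but not as a concordant pair.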

    Any ideas how to explain this behaviour?

    I looked for a similar problem in other threads but couldn't find quite the same situation.
    Cheers

  • #2
    N.B., ignore my earlier reply, it was wrong (thus I deleted it).

    If the reads align to different chromosomes then they're no longer considered discordant; that label would normally apply if, for example, the insert size was just too big.

    • #3
      Thank you for your reply dpryan!

      However, I'm pretty sure mates that have mapped to different chromosomes are tagged with YT:Z:DP.
      I cannot check right now, but I will reply again tomorrow with an example.

      As promised, reads that map to different chromosomes are still considered (and I think rightfully so) Discordant Pairs by bowtie. Here's an example, from the same output as the example in my first post:

      FCC0WRYACXX:4:1104:2072:41391#ATGAGGAA 81 Chr1 8014059 42 100M Chr2 4225437 0 CTACTA... C>>DDC... AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:DP

      FCC0WRYACXX:4:1104:2072:41391#ATGAGGAA 161 Chr2 4225437 42 100M Chr1 8014059 0 GTTGCA... B@BFFF... AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:DP

      • #4
        Oh the many mysteries of bowtie2. I'm guessing that this behavior is described by this comment in the source code (for context, this is from aln_sink.cpp and both concordant and discordant alignments have already been dealt with):

        // If we're at this point, at least one mate failed to align.
        // BTL: That's not true. It could be that there are no concordant
        // alignments but both mates have unpaired alignments, with one of
        // the mates having more than one.
        This still leaves some ambiguity. I could read this to mean that you'll get YT:Z:DP unless there's more than one valid secondary alignment, or that you'll get YT:Z:DP unless there's a secondary alignment of equal score to the primary. If you have a chance, look for more examples in your alignments where the mates are on different chromosomes and they have YT:Z:DP. If you tracked their MAPQ scores, then that'd likely clarify this (if they're all 42, then the former reading is likely correct; if not, then presumably the latter is correct).
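
The check suggested above can be sketched in Python roughly as follows. This assumes an uncompressed, tab-separated SAM file read line by line. The names `tally_interchromosomal` and `make_record` are made up for illustration, and the toy records are placeholders rather than real reads.

```python
from collections import Counter


def tally_interchromosomal(sam_lines):
    """Count (YT value, MAPQ) pairs for records whose mate lies on another chromosome."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        rname, mapq, rnext = fields[2], int(fields[4]), fields[6]
        # RNEXT is "=" when the mate is on the same reference, "*" when unavailable.
        if rname == "*" or rnext in ("=", "*", rname):
            continue
        yt = "NA"  # placeholder when a record carries no YT:Z: tag
        for tag in fields[11:]:
            if tag.startswith("YT:Z:"):
                yt = tag[len("YT:Z:"):]
        counts[(yt, mapq)] += 1
    return counts


def make_record(name, flag, rname, mapq, rnext, yt):
    """Build a toy SAM line; positions, SEQ and QUAL are placeholders."""
    return "\t".join([name, str(flag), rname, "1", str(mapq), "100M",
                      rnext, "1", "0", "*", "*", "YT:Z:" + yt])


toy = [
    "@HD\tVN:1.0",
    make_record("pairA", 81, "Chr1", 42, "Chr2", "DP"),
    make_record("pairA", 161, "Chr2", 42, "Chr1", "DP"),
    make_record("pairB", 113, "Chr6", 12, "Chr2", "UP"),
    make_record("pairB", 177, "Chr2", 0, "Chr6", "UP"),
    make_record("pairC", 99, "Chr3", 42, "=", "CP"),
]

for (yt, mapq), n in sorted(tally_interchromosomal(toy).items()):
    print(yt, mapq, n)
```

For real data, an open file handle can be passed in place of `toy`, since the function only iterates over lines. If every inter-chromosomal DP record turns out to have MAPQ 42, that would favour the first reading; a mix of MAPQ values would favour the second.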

        • #5
          This is getting confusing

          So if I understand this correctly, there shouldn't be pairs with mates mapping to different chromosomes, tagged DP, and with MAPQ 0. Right?

          • #6
            I'd look for MAPQ 1, actually. You can get a MAPQ of 0 if there's a valid secondary alignment that's still not as good as the primary (yes, this seems backwards).

            • #7
              There are no alignments where I have DP, different chromosomes and MAPQ 1.

              However, there are cases where I have DP, different chromosomes and MAPQ 0 for one or both mates, as well as cases where I have UP, different chromosomes and MAPQ 0 for one or both mates.

              Just as a note, in general my output is fairly normal and I have worked already for quite some time on it. But I just noticed that there are some cases where I don't understand how bowtie2 worked exactly.

              PS: Thank you so much for looking into this!

              • #8
                That makes sense then.

                A MAPQ of 0 is a weird case, and there are two ways to get it. The ones leading to UP are the case where the AS:i: and XS:i: tags are the same (i.e., the best and second-best alignments score identically), or at least that'd be my expectation. Whether these alignments have MAPQ 0 or 1 depends on how high or low the AS:i: value is and what --score-min is set to. The cases where MAPQ 0 alignments have UP are likely when AS:i: is greater than XS:i:, but AS:i: is still relatively low (the actual algorithm is a bit weird).
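
To see which of these situations applies to a given record, the AS:i: and XS:i: values can be compared directly. A rough sketch in Python; the function name `classify_as_xs` and its labels are made up for illustration, and it deliberately does not try to reproduce bowtie2's MAPQ calculation:

```python
def classify_as_xs(sam_line):
    """Label a SAM record by how its best (AS) and second-best (XS) scores compare."""
    tags = {}
    for tag in sam_line.rstrip("\n").split("\t")[11:]:
        parts = tag.split(":", 2)  # optional fields look like TAG:TYPE:VALUE
        if len(parts) == 3:
            tags[parts[0]] = parts[2]
    if "AS" not in tags:
        return "no AS"
    if "XS" not in tags:
        return "no XS"  # no second-best alignment score was reported
    best, second = int(tags["AS"]), int(tags["XS"])
    if best == second:
        return "AS == XS"
    return "AS > XS" if best > second else "AS < XS"


# Fields and tags copied from the first post; the read name is shortened and
# SEQ/QUAL are replaced by "*" placeholders.
mate1 = "\t".join(["read", "113", "Chr6", "3218530", "12", "100M", "Chr2",
                   "3121587", "0", "*", "*", "AS:i:-5", "XS:i:-17", "YT:Z:UP"])
mate2 = "\t".join(["read", "177", "Chr2", "3121587", "0", "100M", "Chr6",
                   "3218530", "0", "*", "*", "AS:i:0", "YT:Z:UP"])

print(classify_as_xs(mate1))
print(classify_as_xs(mate2))
```

Running this over all MAPQ 0 records and grouping the labels by YT:Z: value would show whether the UP and DP cases really split along the AS/XS lines described above.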
