  • Arupsss
    Member
    • May 2011
    • 44

    BWA - XM tag Question

    I have installed BWA and built an index of hg19 using the command:

    bwa index -a bwtsw hg19.fa

    Now I compute alignments using the command:

    ./bwa aln hg19.fa SRR4493095_1.fastq > aln_sa.sai

    However, I want to find alignments allowing 1 or 2 mismatches.

    From the BWA home page, I understood that I have to use the XM tag.

    But I can't work out how to use it, that is, what the command should be.

    Can anybody please help me with this?

    Thanks.
    Last edited by Arupsss; 06-28-2012, 06:51 AM.
  • Arupsss
    Member
    • May 2011
    • 44

    #2
    Another point: in some places I found that the -n 0/1/2 option sets the allowed number of mismatches, but I can't tell whether that is true. It is a bit confusing, because some places say to use XM (without specifying how) and others say to use -n 0/1/2.

    • xied75
      Senior Member
      • Feb 2012
      • 129

      #3
      You misunderstood the XM tag: XM is a tag in the output of the BWA sampe/samse step (which is a SAM file).

      -n is a switch (an option/parameter) that you pass when you run BWA; there are many switches you can give.

      With the default values for these (i.e. if you don't use them at all) you can already get alignments with 1 or 2 mismatches.
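To make the tag-versus-switch distinction concrete, here is a minimal sketch of reading XM after the alignment has been done. It assumes a SAM file from samse/sampe in which mapped reads carry an XM:i:<n> optional field (the number of mismatches in the alignment, per the BWA manual). The function name filter_by_xm and the three sample records are made up for illustration.

```shell
# Hypothetical helper: keep SAM header lines and records whose XM tag is <= $1.
# Records without an XM tag (e.g. unmapped reads) are dropped.
filter_by_xm() {
  awk -v max="$1" -F'\t' '
    /^@/ { print; next }                 # pass header lines through
    {
      for (i = 12; i <= NF; i++)         # optional fields start at column 12
        if ($i ~ /^XM:i:/) {
          split($i, a, ":")
          if (a[3] + 0 <= max) print
          next
        }
    }'
}

# Made-up sample records, only to show the shape of the data.
sample_sam() {
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf 'r1\t0\tchr1\t100\t37\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:0\tXM:i:0\n'
  printf 'r2\t0\tchr1\t200\t37\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:2\tXM:i:2\n'
  printf 'r3\t0\tchr1\t300\t37\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:4\tXM:i:4\n'
}

sample_sam | filter_by_xm 2    # keeps the header, r1 and r2
```

On real output you would run something like `filter_by_xm 2 < aln.sam`, where aln.sam is a placeholder file name.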

      • Arupsss
        Member
        • May 2011
        • 44

        #4
        Thanks for the reply.

        Actually, I am comparing BWA with Bowtie. For Bowtie, I can specify the number of mismatches allowed with the command

        ./bowtie --all -v 0 hg19 SRR4930952.fastq SRR4930952.txt (-v specifies the number of mismatches).

        I want the equivalent command for BWA (its paper says it allows this).

        So, what command should I give to find exact matches, and matches allowing 1 or 2 mismatches? (I am using 150 bp reads.)

        • xied75
          Senior Member
          • Feb 2012
          • 129

          #5
          -k and/or -n. -k is the maximum edit distance in the seed, -n is the maximum edit distance overall. I'm not sure what combination you need with your 150 bp reads.

          • Arupsss
            Member
            • May 2011
            • 44

            #6
            I have no restriction on the maximum edit distance in the seed.

            Can you tell me what this -n option means?

            It says:

            -n NUM Maximum edit distance if the value is INT, or the fraction of missing alignments given 2% uniform base error rate if FLOAT. In the latter case, the maximum edit distance is automatically chosen for different read lengths. [0.04]

            So, if I run with the default setting on 150 bp reads, does it allow 6 mismatches?

            And to allow no mismatches, I have to set -n 0.

            For 2 mismatches with 150 bp reads, I have to set -n 2.

            Is this correct?
            Last edited by Arupsss; 06-28-2012, 07:49 AM.

            • xied75
              Senior Member
              • Feb 2012
              • 129

              #7
              Yes: -n 0 for exact matches only, -n 2 for up to 2 differences (-n is an edit distance, so gaps count as well as mismatches).
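Put together, the two steps could look like the sketch below. The helper name align_max_diff, the BWA override variable and the output prefix are assumptions for illustration; the aln and samse subcommands and the -n and -o options are standard BWA. Because -n is an edit distance, -o 0 (no gap opens) is added on the assumption that you want mismatches only, as with bowtie -v; drop it to keep BWA's default of one gap open.

```shell
# Hypothetical helper: single-end alignment with a fixed maximum edit distance.
#   $1 = reference FASTA (already indexed with "bwa index")
#   $2 = FASTQ file with the reads
#   $3 = value for -n (0 = exact matches only, 2 = up to 2 differences)
#   $4 = output prefix; writes <prefix>.sai and <prefix>.sam
# BWA may be set to another executable (e.g. ./bwa); that variable is an
# assumption of this sketch, not something BWA itself reads.
align_max_diff() {
  ref=$1; reads=$2; n=$3; prefix=$4
  "${BWA:-bwa}" aln -n "$n" -o 0 "$ref" "$reads" > "$prefix.sai" &&
  "${BWA:-bwa}" samse "$ref" "$prefix.sai" "$reads" > "$prefix.sam"
}

# Example calls (left commented out because they need BWA and the real files):
#   align_max_diff hg19.fa SRR4493095_1.fastq 0 exact
#   align_max_diff hg19.fa SRR4493095_1.fastq 2 upto2
```

The XM tag of each record in the resulting .sam file then tells you how many mismatches that particular alignment actually has.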

              • Arupsss
                Member
                • May 2011
                • 44

                #8
                Thanks a lot, that is what I wanted to confirm. And by default it prints:

                [bwa_aln] 17bp reads: max_diff = 2
                [bwa_aln] 38bp reads: max_diff = 3
                [bwa_aln] 64bp reads: max_diff = 4
                [bwa_aln] 93bp reads: max_diff = 5
                [bwa_aln] 124bp reads: max_diff = 6
                [bwa_aln] 157bp reads: max_diff = 7
                [bwa_aln] 190bp reads: max_diff = 8
                [bwa_aln] 225bp reads: max_diff = 9

                So these are the numbers of mismatches allowed by default for each read length? Is that right?

                Comment

                • xied75
                  Senior Member
                  • Feb 2012
                  • 129

                  #9
                  Yes, that's the default output you see when -n is not used.
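For what it's worth, the max_diff values in the previous post can be reproduced with a simple Poisson model, which also answers the earlier 150 bp question. This is a sketch and an assumption about how the default is derived from the help text, not a quote of BWA's source: with a 2% per-base error rate, a read of length L has on average 0.02*L differences, and max_diff is taken as the smallest k for which the probability of more than k differences drops below the -n fraction (0.04 by default). The function name max_diff is made up.

```shell
# Hypothetical helper: smallest k such that P(more than k differences) < thres,
# with the number of differences modelled as Poisson(len * err).
#   $1 = read length, $2 = per-base error rate (default 0.02),
#   $3 = tolerated fraction of missed alignments (default 0.04)
max_diff() {
  awk -v len="$1" -v err="${2:-0.02}" -v thres="${3:-0.04}" 'BEGIN {
    lambda = len * err
    term = exp(-lambda)        # P(exactly 0 differences)
    sum = term
    for (k = 1; k < 1000; k++) {
      term *= lambda / k       # P(exactly k differences)
      sum += term              # P(at most k differences)
      if (1 - sum < thres) { print k; exit }
    }
    print 2                    # fallback, not expected for realistic read lengths
  }'
}

for len in 17 38 64 150; do
  echo "${len}bp reads: max_diff = $(max_diff "$len")"
done
```

This prints 2, 3, 4 and 6 for 17, 38, 64 and 150 bp, which agrees with the printed table for the lengths checked here (150 bp falls between the 124bp and 157bp lines, so 6 is the default limit for those reads).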
