  • BWA-SW / 454 / software options


    I need to map about 400,000 454 reads onto a reference genome. The mean read length is 310 bp. Both the reads and the genome contain repeats. The goal is to call variants (SNPs and indels).

    I would like to tune the bwasw algorithm using some of the options proposed by the software:

    bwa bwasw [-a matchScore] [-b mmPen] [-q gapOpenPen] [-r gapExtPen] [-t nThreads] [-w bandWidth] [-T thres] [-s hspIntv] [-z zBest] [-N nHspRev] [-c thresCoef] <in.db.fasta> <in.fq>

    -a INT Score of a match [1]
    -b INT Mismatch penalty [3]
    -q INT Gap open penalty [5]
    -r INT Gap extension penalty. The penalty for a contiguous gap of size k is q+k*r. [2]
    -t INT Number of threads in the multi-threading mode [1]
    -w INT Band width in the banded alignment [33]
    -T INT Minimum score threshold divided by a [37]
    -c FLOAT Coefficient for threshold adjustment according to query length. Given an l-long query, the threshold for a hit to be retained is a*max{T,c*log(l)}. [5.5]
    -z INT Z-best heuristics. Higher -z increases accuracy at the cost of speed. [1]
    -s INT Maximum SA interval size for initiating a seed. Higher -s increases accuracy at the cost of speed. [3]
    -N INT Minimum number of seeds supporting the resultant alignment to skip reverse alignment. [5]
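
As a worked illustration of the two formulas in the option list (the gap penalty q+k*r and the hit threshold a*max{T,c*log(l)}), here is a minimal Python sketch using the default values shown in brackets. It assumes that log is the natural logarithm, which the option text does not state. The function names are made up for this example and are not part of BWA.

```python
import math

# Defaults copied from the option list above.
MATCH_SCORE = 1    # -a
GAP_OPEN = 5       # -q
GAP_EXT = 2        # -r
MIN_SCORE_T = 37   # -T
COEF_C = 5.5       # -c


def gap_penalty(k, q=GAP_OPEN, r=GAP_EXT):
    """Penalty for one contiguous gap of size k: q + k*r."""
    return q + k * r


def hit_threshold(read_len, a=MATCH_SCORE, T=MIN_SCORE_T, c=COEF_C):
    """Score a hit must reach to be retained: a * max(T, c*log(l)).

    ASSUMPTION: log is the natural logarithm.
    """
    return a * max(T, c * math.log(read_len))


if __name__ == "__main__":
    # Show how the threshold changes with read length under the defaults.
    for length in (100, 310, 1000, 10000):
        print(length, round(hit_threshold(length), 1))
    print("penalty for a 4-base gap:", gap_penalty(4))
```

Under that assumption, c*log(l) stays below T=37 for a 310 bp read, so the length-dependent term only takes over for reads somewhat longer than 800 bp.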

    But I do not know which options to change or what values to give them.

    Does anybody have experience with a similar project? In that case, what parameters did you apply?

    What minimum score would be appropriate?

    Thanks in advance.

    Best regards,


  • #2
    use the default


    • #3
      Originally posted by lh3
      use the default
      And with that you win the prize for the most concise and to-the-point answer of this month...


      • #4
        Actually I should have said more (so I cannot claim that prize). BWA, especially BWA-SW, is designed so that the defaults work well with the majority of typical input. BWA-SW automatically adjusts its mapping strategy based on the input. You can see from its paper that for simulated reads ranging from 100 to 10,000 bp, with error rates from 2% to 10%, only the default settings are used.


        • #5
          Originally posted by lh3
          Actually I should have said more (so I cannot claim that prize). BWA, especially BWA-SW, is designed so that the defaults work well with the majority of typical input. BWA-SW automatically adjusts its mapping strategy based on the input. You can see from its paper that for simulated reads ranging from 100 to 10,000 bp, with error rates from 2% to 10%, only the default settings are used.
          I can confirm that it works quite well with the defaults across various read lengths. Great job, Heng!


          • #6
            BWA-SW / 454 / multiple hits

            Does anybody know how to get BWA-SW to report multiple hits? No such option is listed for bwa bwasw. Thanks a lot.


            • #7
              Sorry. BWA-SW cannot output multiple hits. Partly this is why it is fast.


              • #8
                By multiple hits, do we mean several equally good mappings of a read, or the best, second-best and so on hits of a read? I thought BWA could report the former with the XA tag!


                • #9
                  Originally posted by lh3
                  Sorry. BWA-SW cannot output multiple hits. Partly this is why it is fast.
                  It seems that bwasw does output multiple hits, or perhaps I misunderstood what you said.
                  I'm mapping 454 reads with BWA-SW and find many reads with more than one alignment in the SAM file. Here is an example:
                  F1GKWGA02HLEN2	16	chr6	170249370	159	230S31M1I5M4D115M20S	*	0	0	cgtacggaacgaacttactacgactacctaccacacacncaccacacacncncacacacacacacacactccacacgacacacacacncncacacacacacacacacncncacacacacacacacactcacacacgacacacacacncncacacacacacacacncgntcgacagncncacagnctcncanacacacanacgtctcactangcacacagctcncgacctagnaccacacagctcacgactgcaccacacagcctcacagnacacacagctcncaactgnaccACACAGCTCACGACAGCACCACACAGCTCACGACAGCACCACACAACTCACAACTGCACCACACAAGCTCACAACAGCGCCACACAGCTCGAGGATCCAGAATTCTCCAG	,,,,0,,,,,,,,,,00030,,,,0,,,3,,,0000,,!,0000059--!-!..96657777997---,-----1---15111993-!-!115------5555=8--!.!..<<<<<<<<<==.-------222---2222295--!-!//66<988899==3-!..!.-.-.7:!8!88=3.!-.-!3-!-28883-!.-.--.-2...!.2589:::87-!,.---733!2115857332:2--///47231559==::22433744!//666777676!666944!444==??44498555ABBBDACCFFFFFFFFFFFFFFFFF===FFFCCCFFFFF:::DFFFFHIIHIBBBIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFF	AS:i:107	XS:i:40	XF:i:3	XE:i:4	XN:i:0
                  F1GKWGA02HLEN2	16	chr1	202692627	1	77S50M275S	*	0	0	cgtacggaacgaacttactacgactacctaccacacacncaccacacacncncacacacacacacacactccacacgacacacacacncncacacacacacacacacncncacacacacacacacactcacacacgacacacacacncncacacacacacacacncgntcgacagncncacagnctcncanacacacanacgtctcactangcacacagctcncgacctagnaccacacagctcacgactgcaccacacagcctcacagnacacacagctcncaactgnaccACACAGCTCACGACAGCACCACACAGCTCACGACAGCACCACACAACTCACAACTGCACCACACAAGCTCACAACAGCGCCACACAGCTCGAGGATCCAGAATTCTCCAG	,,,,0,,,,,,,,,,00030,,,,0,,,3,,,0000,,!,0000059--!-!..96657777997---,-----1---15111993-!-!115------5555=8--!.!..<<<<<<<<<==.-------222---2222295--!-!//66<988899==3-!..!.-.-.7:!8!88=3.!-.-!3-!-28883-!.-.--.-2...!.2589:::87-!,.---733!2115857332:2--///47231559==::22433744!//666777676!666944!444==??44498555ABBBDACCFFFFFFFFFFFFFFFFF===FFFCCCFFFFF:::DFFFFHIIHIBBBIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFF	AS:i:42	XS:i:41	XF:i:1	XE:i:1	XN:i:0
                  I thought that if a read maps to multiple locations, its mapping quality should be 0. Is that right?


                  • #10
                    These are chimeric hits, each hit corresponding to a different part of the read.
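
To check this on the two records posted in #9, one can read off which part of the read each hit covers from its CIGAR string. The sketch below is plain Python with no SAM library. `query_span` and `spans_overlap` are hypothetical helper names, and coordinates are 0-based, half-open, in the orientation of the SEQ field (both records have flag 16, so they share that orientation).

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MI=X")  # operations that consume aligned read bases
CLIP_OPS = set("SH")     # soft and hard clipping


def query_span(cigar):
    """Return (start, end, read_length) of the aligned part of the read."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Leading clipped bases give the offset of the first aligned base.
    start = 0
    for n, op in ops:
        if op not in CLIP_OPS:
            break
        start += n
    aligned = sum(n for n, op in ops if op in QUERY_OPS)
    total = sum(n for n, op in ops if op in QUERY_OPS or op in CLIP_OPS)
    return start, start + aligned, total


def spans_overlap(a, b):
    """True if two (start, end, ...) half-open intervals share any base."""
    return a[0] < b[1] and b[0] < a[1]


if __name__ == "__main__":
    # CIGAR strings copied from the two SAM records for read F1GKWGA02HLEN2.
    hit_chr6 = query_span("230S31M1I5M4D115M20S")
    hit_chr1 = query_span("77S50M275S")
    print("chr6 hit:", hit_chr6)
    print("chr1 hit:", hit_chr1)
    print("overlap:", spans_overlap(hit_chr6, hit_chr1))
```

The chr6 hit covers bases 230 to 382 of the 402-base read and the chr1 hit covers bases 77 to 127, so the two intervals do not overlap, which is consistent with each hit corresponding to a different part of the read.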


                    • #11
                      What does the tag XF:i:N mean?

                      The BWA manual says:
                      XF Support from forward/reverse alignment
                      and in my mapping results I find values from XF:i:0 to XF:i:3.
                      Does XF:i:0 mean forward/reverse? And what about the other values?
                      Thank you!


                      • #12
                        Originally posted by holywoool
                        The BWA manual says:
                        XF Support from forward/reverse alignment
                        and in my mapping results I find values from XF:i:0 to XF:i:3.
                        Does XF:i:0 mean forward/reverse? And what about the other values?
                        Thank you!
                        The paper explains that the reverse-reverse alignment is not always performed:
                        "In implementation, we do not apply the reverse–reverse alignment if the best alignment contains, by default, 5 or more seeds."

                        Please read the paper carefully; it contains many gems.


