  • BWA-SW / 454 / software options

    Hi,

    I need to map approximately 400,000 454 reads onto a reference genome. The mean read length is 310 bp. Both the reads and the genome contain repeats. The goal is to call variants (SNPs and indels).

    I would like to tune the bwasw algorithm using some of the options proposed by the software:

    bwa bwasw [-a matchScore] [-b mmPen] [-q gapOpenPen] [-r gapExtPen] [-t nThreads] [-w bandWidth] [-T thres] [-s hspIntv] [-z zBest] [-N nHspRev] [-c thresCoef] <in.db.fasta> <in.fq>

    OPTIONS:
    -a INT Score of a match [1]
    -b INT Mismatch penalty [3]
    -q INT Gap open penalty [5]
    -r INT Gap extension penalty. The penalty for a contiguous gap of size k is q+k*r. [2]
    -t INT Number of threads in the multi-threading mode [1]
    -w INT Band width in the banded alignment [33]
    -T INT Minimum score threshold divided by a [37]
    -c FLOAT Coefficient for threshold adjustment according to query length. Given an l-long query, the threshold for a hit to be retained is a*max{T,c*log(l)}. [5.5]
    -z INT Z-best heuristics. Higher -z increases accuracy at the cost of speed. [1]
    -s INT Maximum SA interval size for initiating a seed. Higher -s increases accuracy at the cost of speed. [3]
    -N INT Minimum number of seeds supporting the resultant alignment to skip reverse alignment. [5]
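
The two formulas in the option list (the gap penalty under -r and the hit threshold under -c) can be checked with a short calculation. The sketch below is not part of BWA: the function names are hypothetical, and it assumes that "log" in the -c description is the natural logarithm, which the option text does not state. The default values are the ones listed above.

```python
import math


def gap_penalty(k, q=5, r=2):
    """Penalty for a contiguous gap of size k, per the -r description: q + k*r."""
    return q + k * r


def hit_threshold(query_len, a=1, T=37, c=5.5):
    """Threshold for a hit to be retained: a * max(T, c * log(l)).

    Assumption: log is the natural logarithm (not stated in the option text).
    """
    return a * max(T, c * math.log(query_len))


if __name__ == "__main__":
    # A 4-base gap with the default -q and -r.
    print("gap of 4:", gap_penalty(4))
    # Thresholds for a few query lengths, including the 310 bp mean from the post.
    for length in (100, 310, 1000, 10000):
        print(length, round(hit_threshold(length), 2))
```

Under these assumptions, the fixed -T term dominates for a 310 bp query, so the length-dependent -c term would only start to matter for considerably longer reads.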

    However, I do not know which options to use or what values to set.

    Does anybody have experience with a similar project? If so, what parameters did you apply?

    What minimum score threshold would be appropriate?

    Thanks in advance.

    Best regards,

    Sabrina.

  • #2
    use the default


    • #3
      Originally posted by lh3
      use the default
      And with that you win the prize for the most concise and to-the-point answer of this month...


      • #4
        Actually I should have said more (so I cannot claim that prize). BWA, especially BWA-SW, is designed in such a way that the default works well with the majority of typical input. BWA-SW automatically adjusts its mapping strategy based on the input. You can see from its paper that for simulated reads ranging from 100 to 10,000 bp and error rates from 2% to 10%, only the default settings are used.


        • #5
          Originally posted by lh3
          Actually I should have said more (so I cannot claim that prize). BWA, especially BWA-SW, is designed in such a way that the default works well with the majority of typical input. BWA-SW automatically adjusts its mapping strategy based on the input. You can see from its paper that for simulated reads ranging from 100 to 10,000 bp and error rates from 2% to 10%, only the default settings are used.
          I can confirm that it works quite well with the default on various read lengths. Great job, Heng!


          • #6
            BWA-SW / 454 / multiple hits

            Does anybody know how to get BWA-SW to report multiple hits? It is not listed among the options offered by bwa bwasw. Thanks a lot.


            • #7
              Sorry. BWA-SW cannot output multiple hits. Partly this is why it is fast.


              • #8
                By multiple hits, do we mean multiple equally good mappings of a read, or the best, second-best, and so on? I thought BWA could report the former with the XA tag.
                --
                bioinfosm


                • #9
                  Originally posted by lh3
                  Sorry. BWA-SW cannot output multiple hits. Partly this is why it is fast.
                  It seems that bwasw does output multiple hits, or perhaps I misunderstood what you said.
                  I am mapping 454 reads with BWA-SW and find many multi-hit alignments in the SAM file. Here is an example:
                  Code:
                  F1GKWGA02HLEN2	16	chr6	170249370	159	230S31M1I5M4D115M20S	*	0	0	cgtacggaacgaacttactacgactacctaccacacacncaccacacacncncacacacacacacacactccacacgacacacacacncncacacacacacacacacncncacacacacacacacactcacacacgacacacacacncncacacacacacacacncgntcgacagncncacagnctcncanacacacanacgtctcactangcacacagctcncgacctagnaccacacagctcacgactgcaccacacagcctcacagnacacacagctcncaactgnaccACACAGCTCACGACAGCACCACACAGCTCACGACAGCACCACACAACTCACAACTGCACCACACAAGCTCACAACAGCGCCACACAGCTCGAGGATCCAGAATTCTCCAG	,,,,0,,,,,,,,,,00030,,,,0,,,3,,,0000,,!,0000059--!-!..96657777997---,-----1---15111993-!-!115------5555=8--!.!..<<<<<<<<<==.-------222---2222295--!-!//66<988899==3-!..!.-.-.7:!8!88=3.!-.-!3-!-28883-!.-.--.-2...!.2589:::87-!,.---733!2115857332:2--///47231559==::22433744!//666777676!666944!444==??44498555ABBBDACCFFFFFFFFFFFFFFFFF===FFFCCCFFFFF:::DFFFFHIIHIBBBIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFF	AS:i:107	XS:i:40	XF:i:3	XE:i:4	XN:i:0
                  F1GKWGA02HLEN2	16	chr1	202692627	1	77S50M275S	*	0	0	cgtacggaacgaacttactacgactacctaccacacacncaccacacacncncacacacacacacacactccacacgacacacacacncncacacacacacacacacncncacacacacacacacactcacacacgacacacacacncncacacacacacacacncgntcgacagncncacagnctcncanacacacanacgtctcactangcacacagctcncgacctagnaccacacagctcacgactgcaccacacagcctcacagnacacacagctcncaactgnaccACACAGCTCACGACAGCACCACACAGCTCACGACAGCACCACACAACTCACAACTGCACCACACAAGCTCACAACAGCGCCACACAGCTCGAGGATCCAGAATTCTCCAG	,,,,0,,,,,,,,,,00030,,,,0,,,3,,,0000,,!,0000059--!-!..96657777997---,-----1---15111993-!-!115------5555=8--!.!..<<<<<<<<<==.-------222---2222295--!-!//66<988899==3-!..!.-.-.7:!8!88=3.!-.-!3-!-28883-!.-.--.-2...!.2589:::87-!,.---733!2115857332:2--///47231559==::22433744!//666777676!666944!444==??44498555ABBBDACCFFFFFFFFFFFFFFFFF===FFFCCCFFFFF:::DFFFFHIIHIBBBIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFF	AS:i:42	XS:i:41	XF:i:1	XE:i:1	XN:i:0
                  My understanding is that if a read maps to multiple locations, its mapping quality should be 0. Is that right?


                  • #10
                    These are chimeric hits, each hit corresponding to a different part of the read.
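
One way to see that the two records in post #9 cover different parts of the read is to convert each CIGAR string into the interval of read bases it aligns. The following is a minimal sketch, not a BWA or samtools feature: the function names are hypothetical, hard clips (H) are ignored because they are absent from SEQ, and the intervals are only directly comparable because both records carry the same strand flag (16).

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")    # operations that consume bases of SEQ
ALIGNED_OPS = set("MI=X")   # SEQ bases that belong to the alignment itself


def query_interval(cigar):
    """Return (start, end, seq_length) of the aligned part of the read.

    Coordinates are 0-based, half-open, and refer to SEQ as stored in the
    SAM record. Hard-clipped bases are not counted.
    """
    pos = 0
    start = end = None
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in ALIGNED_OPS:
            if start is None:
                start = pos
            end = pos + n
        if op in QUERY_OPS:
            pos += n
    return start, end, pos


def overlap(a, b):
    """Number of read bases shared by two (start, end, ...) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


if __name__ == "__main__":
    chr6_hit = query_interval("230S31M1I5M4D115M20S")
    chr1_hit = query_interval("77S50M275S")
    print("chr6 hit:", chr6_hit)
    print("chr1 hit:", chr1_hit)
    print("shared read bases:", overlap(chr6_hit, chr1_hit))
```

For the two CIGAR strings quoted above, the aligned intervals do not share any read bases, which is consistent with each hit corresponding to a different part of the same 402-base read.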


                    • #11
                      What does the tag XF:i:N mean?

                      According to the BWA manual,
                      XF Support from forward/reverse alignment
                      and in my mapping results I find values from XF:i:0 to XF:i:3.
                      Does XF:i:0 mean forward/reverse? And what about the other values?
                      Thank you!


                      • #12
                        Originally posted by holywoool
                        According to the BWA manual,
                        XF Support from forward/reverse alignment
                        and in my mapping results I find values from XF:i:0 to XF:i:3.
                        Does XF:i:0 mean forward/reverse? And what about the other values?
                        Thank you!
                        The paper explains that the reverse-reverse alignment is not always performed:
                        "In implementation, we do not apply the reverse–reverse alignment if the best alignment contains, by default, 5 or more seeds."

                        Please read the paper carefully; it contains many gems.
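
To see how often each XF value occurs in a given SAM file, the optional tags can be tallied directly. This is a minimal sketch with hypothetical function names; it only counts values and does not attempt to interpret what each XF value means. The two example records reuse the name, position, CIGAR, and tags from post #9, with SEQ and QUAL replaced by "*" for brevity.

```python
from collections import Counter


def optional_tags(sam_line):
    """Parse the optional TAG:TYPE:VALUE fields (column 12 onward) of a SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:
        tag, typ, value = field.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    return tags


def tally_tag(sam_lines, tag="XF"):
    """Count how often each value of `tag` occurs, skipping '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        counts[optional_tags(line).get(tag)] += 1
    return counts


if __name__ == "__main__":
    # Abbreviated records: SEQ and QUAL are replaced by "*".
    records = [
        "\t".join(["F1GKWGA02HLEN2", "16", "chr6", "170249370", "159",
                   "230S31M1I5M4D115M20S", "*", "0", "0", "*", "*",
                   "AS:i:107", "XS:i:40", "XF:i:3", "XE:i:4", "XN:i:0"]),
        "\t".join(["F1GKWGA02HLEN2", "16", "chr1", "202692627", "1",
                   "77S50M275S", "*", "0", "0", "*", "*",
                   "AS:i:42", "XS:i:41", "XF:i:1", "XE:i:1", "XN:i:0"]),
    ]
    print(tally_tag(records, "XF"))
```

On a real file, the same function can be applied to an open file handle, for example `tally_tag(open("aln.sam"))`, where `aln.sam` is a placeholder file name.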
