
  • BWA aln: option for number of mismatches?

    Hi,
    Which option sets the number of mismatches accepted in a read? Could anybody please help me by giving an example?
    Regards,
    Sam64

  • #2
    Maybe you can only do it in the seeded alignment.


    % bwa aln

    Usage: bwa aln [options] <prefix> <in.fq>

    Options: -n NUM   max #diff (int) or missing prob under 0.02 err rate (float) [0.04]
             -o INT   maximum number or fraction of gap opens [1]
             -e INT   maximum number of gap extensions, -1 for disabling long gaps [-1]
             -i INT   do not put an indel within INT bp towards the ends [5]
             -d INT   maximum occurrences for extending a long deletion [10]
             -l INT   seed length [32]
             -k INT   maximum differences in the seed [2]
             -m INT   maximum entries in the queue [2000000]
             -t INT   number of threads [1]
             -M INT   mismatch penalty [3]
             -O INT   gap open penalty [11]
             -E INT   gap extension penalty [4]
             -R INT   stop searching when there are >INT equally best hits [30]
             -q INT   quality threshold for read trimming down to 35bp [0]
             -c       input sequences are in the color space
             -L       log-scaled gap penalty for long deletions
             -N       non-iterative mode: search for all n-difference hits (slooow)
             -f FILE  file to write output to instead of stdout

    The -n option may be what you want here, if you set it to an integer.
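
If -n is left as a float instead, the usage text suggests it is treated as a "missing probability" under a 0.02 per-base error rate. A minimal sketch of how such a fraction could translate into a per-read limit, assuming differences follow a Poisson distribution with mean read_length * 0.02 and that the limit is the smallest count whose tail probability falls below the threshold. The function name `max_diff` and its defaults are illustrative, not part of bwa; compare against the max_diff values bwa reports on stderr for your version before relying on it.

```python
import math

def max_diff(read_len, err=0.02, thres=0.04):
    """Smallest k with P(more than k differences) < thres.

    Assumption: differences ~ Poisson(read_len * err). This is a sketch
    of the float form of -n, not code taken from this thread.
    """
    lam = read_len * err
    term = math.exp(-lam)  # P(exactly 0 differences)
    cum = term             # running P(at most k differences)
    for k in range(1, 1000):
        term *= lam / k    # P(exactly k differences)
        cum += term
        if 1.0 - cum < thres:
            return k
    return 2  # fallback if the loop never converges

for length in (17, 38, 64):
    print(length, max_diff(length))
# 17 2
# 38 3
# 64 4
```

Under these assumptions the default -n 0.04 allows more differences as reads get longer, whereas an integer such as -n 2 fixes the limit regardless of read length.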

    Also:

    -k is the number of mismatches allowed in the seed and -l is the seed length, so you could set -l to the length of your read and -k to whatever you want. Otherwise, try lowering -M to reduce the penalty for mismatches.
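
As a concrete example of the options discussed above, here is a small sketch that assembles a `bwa aln` command line without running it. Only flags listed in the usage text are used (-n, -l, -k, -f); the helper name `bwa_aln_command` and the file names `ref.fa`, `reads.fq`, and `reads.sai` are hypothetical placeholders.

```python
import shlex

def bwa_aln_command(prefix, fastq, max_diff=None, seed_len=None,
                    seed_diff=None, out=None):
    """Build an argument list for `bwa aln` (hypothetical helper)."""
    args = ["bwa", "aln"]
    if max_diff is not None:
        args += ["-n", str(max_diff)]   # integer: max differences per read
    if seed_len is not None:
        args += ["-l", str(seed_len)]   # seed length
    if seed_diff is not None:
        args += ["-k", str(seed_diff)]  # max differences within the seed
    if out is not None:
        args += ["-f", out]             # write output here instead of stdout
    return args + [prefix, fastq]

# At most 2 differences over the whole read:
print(shlex.join(bwa_aln_command("ref.fa", "reads.fq", max_diff=2, out="reads.sai")))
# bwa aln -n 2 -f reads.sai ref.fa reads.fq

# Seed stretched to a 36 bp read, 1 difference allowed in the seed:
print(shlex.join(bwa_aln_command("ref.fa", "reads.fq", seed_len=36, seed_diff=1)))
# bwa aln -l 36 -k 1 ref.fa reads.fq
```

The returned list can be passed to `subprocess.run` once bwa is installed and the reference has been indexed.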

    Chris



    • #3
      Thank you for your post
      It helped me a lot
