  • BWA - Why does it give only one match per read?

    I am doing experiments with BWA and BowTie. I am currently finding alignments with these commands:


    ./bwa aln -n 0 -k 41 database.fa SRR4493095_1.fastq > aln_sa.sai

    and

    ./bwa samse database.fa aln_sa.sai SRR4493095_1.fastq > out_sa.sam

    However, while BowTie reports 123 matches for 100 reads, BWA reports just 100 (one per read; it does not report a match found at another position). Can anybody explain why this happens and how to solve it? I use the -n 0 option because I want matches with no mismatches allowed.

    Thanks in advance.

  • #2
    Originally posted by Arupsss
    I am doing experiments with BWA and BowTie.
    Hi, could you please post your bowtie command, as well, for completeness?
    Why did you set -k to 41?
    How long are your reads?


    • #3
      Originally posted by sdvie
      Hi, could you please post your bowtie command, as well, for completeness?
      Here it is:

      ./bowtie -a -v 0 database SRR4493095_1.fastq out.txt
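
To see how 123 alignments can come from 100 reads, one could tally the lines per read in Bowtie's output. This is a minimal Python sketch, assuming Bowtie's default tab-separated output in which each line is one alignment and the first column is the read name. The helper name `alignments_per_read` and the sample lines are invented for illustration.

```python
from collections import Counter


def alignments_per_read(lines):
    """Count alignment lines per read name (first tab-separated column)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        counts[line.split("\t")[0]] += 1
    return counts


# Invented sample in the assumed layout: name, strand, reference, 0-based offset, ...
sample = [
    "read1\t+\tchrA\t99\tACGT\tIIII\t1\t",
    "read1\t-\tchrB\t249\tACGT\tIIII\t1\t",
    "read2\t+\tchrA\t500\tTTGA\tIIII\t0\t",
]
counts = alignments_per_read(sample)
multi = {name: n for name, n in counts.items() if n > 1}
print(sum(counts.values()), "alignments for", len(counts), "reads")
print("reads with more than one alignment:", multi)
```

With real data, the same function could be given an open file handle for out.txt instead of the sample list.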


      • #4
        Are you well aware of the options you are using in bwa?
        Here are the options from bwa 0.6.1. The ones that you are using, or that you might consider using, are -n, -l, -k and -N:

        Code:
        bwa aln [options] <prefix> <in.fq>
        Options: -n NUM    max #diff (int) or missing prob under 0.02 err rate (float) [0.04]
                 -o INT    maximum number or fraction of gap opens [1]
                 -e INT    maximum number of gap extensions, -1 for disabling long gaps [-1]
                 -i INT    do not put an indel within INT bp towards the ends [5]
                 -d INT    maximum occurrences for extending a long deletion [10]
                 -l INT    seed length [32]
                 -k INT    maximum differences in the seed [2]
                 -m INT    maximum entries in the queue [2000000]
                 -t INT    number of threads [1]
                 -M INT    mismatch penalty [3]
                 -O INT    gap open penalty [11]
                 -E INT    gap extension penalty [4]
                 -R INT    stop searching when there are >INT equally best hits [30]
                 -q INT    quality threshold for read trimming down to 35bp [0]
                 -f FILE   file to write output to instead of stdout
                 -B INT    length of barcode
                 -c        input sequences are in the color space
                 -L        log-scaled gap penalty for long deletions
                 -N        non-iterative mode: search for all n-difference hits (slooow)
                 -I        the input is in the Illumina 1.3+ FASTQ-like format
                 -b        the input read file is in the BAM format
                 -0        use single-end reads only (effective with -b)
                 -1        use the 1st read in a pair (effective with -b)
                 -2        use the 2nd read in a pair (effective with -b)
        hope that helps,
        cheers,
        Sophia


        • #5
          Thanks. But from the above, do you mean I should use the -N option? I ask because I only use the -n and -k options, not the others.


          • #6
            Originally posted by Arupsss
            Thanks. But from the above, do you mean I should use the -N option? I ask because I only use the -n and -k options, not the others.
            Yes.
            Also, -k sets the maximum number of differences allowed in the seed, so 41 is an unusual value for it. Why did you set it to 41?

            see here:
            BWA manual

            cheers
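
As a rough sketch of what that change might look like, the Python snippet below only assembles the two command lines as argument lists; it does not run them. It assumes the index prefix is database.fa, as in the original post, and it drops -k. The -N flag is the bwa aln option from the listing above. The -n option passed to samse (the maximum number of alignments written to the XA tag, 3 by default according to the BWA manual) is set to a hypothetical value of 100. The helper name `bwa_all_hits_commands` is made up for illustration.

```python
def bwa_all_hits_commands(prefix, fastq, sai="aln_sa.sai", max_xa_hits=100):
    """Build (but do not run) bwa aln/samse argument lists for an exact-match, all-hits search."""
    # -n 0: allow no differences; -N: non-iterative mode, search for all hits
    aln = ["bwa", "aln", "-n", "0", "-N", prefix, fastq]
    # samse -n: upper limit on alternative hits listed in the XA tag (assumed value)
    samse = ["bwa", "samse", "-n", str(max_xa_hits), prefix, sai, fastq]
    return aln, samse


aln_cmd, samse_cmd = bwa_all_hits_commands("database.fa", "SRR4493095_1.fastq")
print(" ".join(aln_cmd), ">", "aln_sa.sai")
print(" ".join(samse_cmd), ">", "out_sa.sam")
```

To actually run them, each list could be passed to `subprocess.run` with stdout redirected to the corresponding output file.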


            • #7
              Originally posted by Arupsss
              However, while BowTie reports 123 matches for 100 reads, BWA reports just 100 (one per read; it does not report a match found at another position). Can anybody explain why this happens and how to solve it?
              That's how bwa works. When a read maps to multiple positions equally well, it reports only one of them, picked at random. However, one of the optional tags in the .sam entry (XA in BWA's output) lists the other positions where the read mapped equally well.
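
To recover every position from the SAM file, one could expand that tag for each read. This is a minimal Python sketch, assuming the tag in question is XA and that its value follows the layout described in the BWA manual, `chr,(+|-)pos,CIGAR,NM;` repeated once per alternative hit. The function names and the sample record are invented for illustration.

```python
def parse_xa(xa_value):
    """Split an XA value such as 'chrB,-250,4M,0;' into (chrom, strand, pos, cigar) tuples."""
    hits = []
    for entry in xa_value.split(";"):
        if not entry:
            continue  # the value ends with ';', which leaves an empty last piece
        chrom, pos, cigar, _nm = entry.rsplit(",", 3)
        strand = "-" if pos.startswith("-") else "+"
        hits.append((chrom, strand, abs(int(pos)), cigar))
    return hits


def all_hits(sam_line):
    """Return the reported alignment followed by any alternatives listed in XA."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # read is unmapped
        return []
    strand = "-" if flag & 0x10 else "+"
    hits = [(fields[2], strand, int(fields[3]), fields[5])]
    for tag in fields[11:]:  # optional fields start at column 12
        if tag.startswith("XA:Z:"):
            hits.extend(parse_xa(tag[5:]))
    return hits


# Invented SAM record for a read with two equally good hits
line = "read1\t0\tchrA\t100\t0\t4M\t*\t0\t0\tACGT\tIIII\tXT:A:R\tNM:i:0\tX0:i:2\tXA:Z:chrB,-250,4M,0;"
hits = all_hits(line)
for hit in hits:
    print(hit)
```

Summing `len(all_hits(line))` over the non-header lines of out_sa.sam would give a total comparable to Bowtie's count, provided no read has more hits than bwa samse writes to XA (see its -n option).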
