  • BWA paired end mapping quality

    I used BWA to map my paired-end (PE) sequencing data to a reference genome, and I am trying to use the paired mapping quality to filter out bad read pairs before downstream analysis.
    How does BWA calculate the paired mapping quality? I understand that it calculates single-end mapping quality the way MAQ does, but I am not sure how it proceeds once it has the single-end mapping quality for both ends. Does it simply add them up, or is it something more complicated? I’ve checked the source code, but it is hard to follow without a good understanding of the variable names and notation. FYI, the relevant source code is in the ‘static int pairing’ function of the bwape.c file.
    I would really appreciate your input.
    pparg
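
For the filtering part of the question, here is a minimal sketch of dropping read pairs by mapping quality once the alignments are in SAM format. It does not reproduce how BWA computes the paired score; it only reads the MAPQ column (field 5) that ends up in the output. The function name, the threshold of 20, and the sample records are assumptions made up for illustration.

```python
from collections import defaultdict

def filter_pairs_by_mapq(sam_lines, min_mapq=20):
    """Keep header lines plus pairs in which BOTH mates have MAPQ >= min_mapq.

    Hypothetical helper: it holds all records in memory, so it suits small
    test files rather than a full exome or genome run.
    """
    header = []
    by_name = defaultdict(list)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):          # SAM header line
            header.append(line)
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & 0x900:                  # skip secondary/supplementary records
            continue
        by_name[fields[0]].append((int(fields[4]), line))  # (MAPQ, record)

    kept = []
    for records in by_name.values():
        if len(records) == 2 and all(mapq >= min_mapq for mapq, _ in records):
            kept.extend(record for _, record in records)
    return header + kept

# Made-up records: pairA has MAPQ 60/60, pairB has MAPQ 60/3.
SAMPLE = [
    "@SQ\tSN:chr1\tLN:1000",
    "pairA\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "pairA\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "pairB\t99\tchr1\t400\t60\t50M\t=\t600\t250\t*\t*",
    "pairB\t147\tchr1\t600\t3\t50M\t=\t400\t-250\t*\t*",
]

for record in filter_pairs_by_mapq(SAMPLE, min_mapq=20):
    print(record)
```

Requiring both mates to pass is one possible policy; whether that is appropriate depends on the downstream analysis.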

  • #2
    Hello, does anybody have any ideas on this? Thank you!


    • #3
      Hi all, I'm interested too! Could someone post a link or a brief description of how BWA scores mapping quality?

      Thanks in advance.


      • #4
        +1, I have the exact same question, too

        I'd also like to know how the mapping quality for paired-end reads is computed. Is it just the sum of the qualities of the two separate reads?


        • #5
          Unfortunately, the best documentation is the original paper (for single-end) and the code itself (for paired-end). Try modifying the code to print out the relevant variables to understand the calculation.


          • #6
            Hey, I'm interested in this too. In particular, what if one read maps to one location on the reference but the other read maps somewhere else, such that the pair does not have the correct orientation and/or distance? What I really want to know is whether such pairs are down-weighted with a low mapping quality in some way.
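
This does not answer how BWA scores such pairs, but one way to see how a given pair was reported is to decode the FLAG column (field 2) of the SAM output. The bit meanings below come from the SAM format; the helper names and the definition of "discordant" used here are assumptions for this sketch.

```python
# FLAG bits as defined by the SAM format.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}

def decode_flag(flag):
    """Return the names of the bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]

def is_discordant(flag):
    """Hypothetical definition: paired, both mates mapped, not a proper pair."""
    return bool(flag & 0x1) and not flag & (0x2 | 0x4 | 0x8)

print(decode_flag(99))
print(is_discordant(65))
```

Pairs selected this way can then be compared against their MAPQ values to check empirically whether they tend to score lower.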


            • #7
              It says in the paper that BWA will find all single-end alignments for each mate and sort them in ascending order of chromosomal coordinates. Then it uses an estimated insert size to determine which of the chromosomal coordinates are best for both mates.

              The insert size is determined in the function infer_isize, and I believe the pairing is determined in the function pairing :-) Both are in bwape.c.
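
To make the idea concrete, here is a toy sketch of choosing a pair of hits by insert size. It is not the code in bwape.c: using a mean and standard deviation, ignoring strand and chromosome, and the cutoff of 4 standard deviations are all simplifying assumptions, and both function names are made up.

```python
def estimate_insert_size(distances):
    """Mean and (population) standard deviation of observed mate distances."""
    n = len(distances)
    mean = sum(distances) / n
    variance = sum((d - mean) ** 2 for d in distances) / n
    return mean, variance ** 0.5

def pick_best_pair(hits1, hits2, mean, std, max_dev=4.0):
    """Pick the (pos1, pos2) combination whose distance is closest to `mean`.

    hits1/hits2 are candidate positions for each mate, assumed to be on the
    same chromosome. Returns None if no combination falls within
    max_dev * std of the mean.
    """
    best, best_dev = None, None
    for p1 in sorted(hits1):              # ascending coordinates
        for p2 in sorted(hits2):
            dev = abs(abs(p2 - p1) - mean)
            if dev <= max_dev * std and (best_dev is None or dev < best_dev):
                best, best_dev = (p1, p2), dev
    return best

# Made-up numbers: insert sizes around 300 bp, two candidate hits per mate.
mean, std = estimate_insert_size([290, 300, 310])
print(pick_best_pair([1000, 50000], [1305, 90000], mean, std))
```

A real aligner would also have to weigh the single-end alignment scores against the pairing constraint, which this sketch leaves out.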


              • #8
                Hello All,

                I have a whole-exome paired-end sample, and I have reached the step where I am aligning it to the human genome (hg19.fa) on a 10-node cluster.

                I am running the commands:
                bwa aln hg19.fa sample1_1.fastq > sample1_1.sai
                bwa aln hg19.fa sample1_2.fastq > sample1_2.sai

                But it's taking forever. I understand this could be due to a couple of reasons, the main one being that I am not doing any pre-filtering. I saw that packages like GenomeQuest do a lot of pre-filtering, which can make the alignment faster.

                I am a total newbie and I am wondering if I can get help here on how, and with what kind of pre-filtering, I can process this sample before using bwa for alignment. I am kind of in a hurry to get some results, so any help will be extremely appreciated.

                Thanks,
                angel


                • #9
                  I've run bwa on exome capture DNA with no filtering at all. It takes a while, but it doesn't take forever, and every minute or so it updates the screen to tell me how many more reads it has finished processing.

                  Using multiple processors with the -t option would certainly speed things along, if your computer has that capacity.
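
As a small sketch of what that looks like, the helper below builds the `bwa aln` command line with `-t` set to a thread count. The helper name and the choice of 8 threads are assumptions; the file names are the ones from the post above. It only prints the command and does not run bwa.

```python
import os
import shlex

def bwa_aln_command(reference, fastq, threads=None):
    """Build the argument list for `bwa aln` with the -t (threads) option.

    Hypothetical helper: defaults to the number of CPUs Python can see.
    """
    if threads is None:
        threads = os.cpu_count() or 1
    return ["bwa", "aln", "-t", str(threads), reference, fastq]

cmd = bwa_aln_command("hg19.fa", "sample1_1.fastq", threads=8)
# bwa aln writes the .sai data to standard output, hence the redirect.
print(" ".join(shlex.quote(arg) for arg in cmd) + " > sample1_1.sai")
```

The threads run on a single machine, so on a cluster the count should match the cores available on the node that runs the job.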


                  • #10
                    Thanks swbarnes2 very much for your reply.

                    I hope my files will finish by tomorrow. The size of one paired-end fastq file in my case is 63GB.

                    I will try the multi-threading mode you mentioned tomorrow.

                    Angel
