  • Xi Wang
    Senior Member
    • Oct 2009
    • 317

    Hi Ben,

    I am confused about how Bowtie handles quality scores when counting mismatches.

    I noticed that there are two parameters related to this issue. First, -n/--seedmms <int> sets the maximum number of mismatches in the seed, meaning that a hit exceeding this cutoff will not be reported by Bowtie. Second, -e/--maqerr <int> sets the maximum sum of quality scores allowed at the mismatched bases (is that right?). However, I don't know whether the two criteria are equivalent or complementary.

    Further, both mismatch measures are counted in the seed region. Even though users can specify the seed length, I am wondering where the seed is located: at the leftmost end of a query (read) or in a random region of the query.

    Besides, there is another parameter, -v <int>, which handles end-to-end mismatches but does not consider the quality scores. Is it possible to make it consider the quality scores?

    Best regards!
    Xi
    Xi Wang

    • lindseyjane
      Member
      • Apr 2009
      • 28

      Question regarding bwt paired end alignment

      I am currently trying to align paired-end Illumina reads using Bowtie, and I want to compare the results to those from Maq.

      I cannot see an option for reporting an alignment for a read when its mate does not map. Is this possible?

      Maq still reports alignments for a read even if its mate does not map, and I wanted to do the same thing with Bowtie. If this is not possible, a lot of pairs end up unaligned (significantly more than with Maq).

      If anyone knows how to do this, I would really appreciate it. Thanks.

      • Ben Langmead
        Senior Member
        • Sep 2008
        • 200

        Hi Xi,

        Originally posted by Xi Wang
        I noticed that there are two parameters related to this issue. First, -n/--seedmms <int> sets the maximum number of mismatches in the seed, meaning that a hit exceeding this cutoff will not be reported by Bowtie. Second, -e/--maqerr <int> sets the maximum sum of quality scores allowed at the mismatched bases (is that right?). However, I don't know whether the two criteria are equivalent or complementary.
        They're complementary. If either limit is exceeded, the alignment is invalid.

        Originally posted by Xi Wang
        Further, both mismatch measures are counted in the seed region. Even though users can specify the seed length, I am wondering where the seed is located: at the leftmost end of a query (read) or in a random region of the query.
        From the leftmost end of the read. -e applies to the entire alignment, not just the seed, exactly as in Maq.

        Originally posted by Xi Wang
        Besides, there is another parameter, -v <int>, which handles end-to-end mismatches but does not consider the quality scores. Is it possible to make it consider the quality scores?
        No; to consider qualities, use -n/-l/-e.
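
        The two limits described above can be sketched as a single check. This is only an illustration of the stated policy, not Bowtie's actual code: the function name and the example read are hypothetical, the alignment is assumed to be ungapped, and any rounding of quality values by the aligner is not modelled. The default values (28-base seed, 2 seed mismatches, quality sum of 70) are the ones quoted elsewhere in this thread.

```python
def is_valid_alignment(read, ref, quals, seed_len=28, max_seed_mms=2, max_qual_sum=70):
    """Return True if an ungapped alignment passes both limits.

    read, ref: equal-length strings (the read and the reference window it aligns to).
    quals: integer Phred quality scores, one per read base.
    """
    if not (len(read) == len(ref) == len(quals)):
        raise ValueError("read, ref and quals must have the same length")
    # 0-based positions, counted from the left end of the read, where bases differ.
    mismatches = [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]
    # The -n style limit counts only mismatches that fall inside the seed.
    seed_mms = sum(1 for i in mismatches if i < seed_len)
    # The -e style limit sums qualities at every mismatch in the whole read.
    qual_sum = sum(quals[i] for i in mismatches)
    # Exceeding either limit makes the alignment invalid.
    return seed_mms <= max_seed_mms and qual_sum <= max_qual_sum


ref = "ACGTACGTAC"
read = "ACGTTCGTAA"  # differs from ref at positions 4 and 9
quals = [30] * 10
print(is_valid_alignment(read, ref, quals, seed_len=4, max_seed_mms=0))
```

        With a 4-base seed both mismatches fall outside the seed, so only the quality-sum limit is in play; widening the seed to 5 bases pulls the first mismatch into the seed and the same alignment is rejected.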

        Thanks,
        Ben

        • Ben Langmead
          Senior Member
          • Sep 2008
          • 200

          Originally posted by lindseyjane
          I cannot see an option for reporting an alignment for a read when its mate does not map. Is this possible?
          Your best bet is to run Bowtie in paired-end mode while using --un to dump unaligned reads to files. Then run again in unpaired mode using the unaligned reads as input.
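
          A rough sketch of that two-pass workflow, written as a helper that only builds the two command lines. The helper name and the output file names are hypothetical. It also assumes that, in paired-end mode, --un splits the dump into two files with _1 and _2 inserted before the extension; check the file names your Bowtie version actually writes before relying on this.

```python
import os


def two_pass_commands(index, mate1, mate2, paired_out, unpaired_out,
                      unaligned="unaligned.fq"):
    """Build argument lists for a paired pass and an unpaired rescue pass."""
    # Pass 1: paired-end mode; pairs that fail to align are dumped via --un.
    pass1 = ["bowtie", "--un", unaligned, index,
             "-1", mate1, "-2", mate2, paired_out]
    # Assumption: the paired-mode dump is split into <stem>_1<ext> and <stem>_2<ext>.
    stem, ext = os.path.splitext(unaligned)
    leftovers = f"{stem}_1{ext},{stem}_2{ext}"
    # Pass 2: feed both leftover files back in as ordinary unpaired reads.
    pass2 = ["bowtie", index, leftovers, unpaired_out]
    return pass1, pass2


first, second = two_pass_commands(
    "h_sapiens_37_asm", "s_8_1_sequence.fq", "s_8_2_sequence.fq",
    "paired.align", "unpaired.align",
)
print(" ".join(first))
print(" ".join(second))
```

          Each list can be passed to subprocess.run(cmd, check=True) once the first pass has finished writing its files.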

          Let me know if that doesn't solve your problem.

          Thanks,
          Ben

          • Layla
            Member
            • Sep 2008
            • 58

            comparable parameters with maq

            Hi Ben,

            Excellent work with Bowtie; I'm looking forward to cutting down data processing time. I am working on a project in which I have used Maq, but for subsequent paired-end MeDIP-seq with 45-base reads I want to use Bowtie with parameters as close to Maq's as possible.

            With Maq, I eliminate reads with a mapping quality < 10 (the same read mapped to more than one location and hence ambiguous) and output them to another file.
            I also keep only reads with flags 18 and 130 (correctly paired reads).
            Using an ad hoc script, I keep only one hit if the same read is mapped to the same start and stop location multiple times (PCR bias).

            I'd like to apply the same criteria using Bowtie. Could you advise me? To begin with, the Bowtie defaults look good: 2 mismatches in a 28-base seed region, with a maximum quality sum (-e) of 70.

            Thank you,

            Layla

            • Xi Wang
              Senior Member
              • Oct 2009
              • 317

              Originally posted by Ben Langmead
              Hi Xi,

              to consider qualities, use -n/-l/-e.
              Thanks, Ben.
              I am still wondering whether the seed region is used only for counting mismatches. If I want to use just the quality-score criterion and set -l to 0, will that work?

              Best wishes,
              Xi
              Xi Wang

              • Ben Langmead
                Senior Member
                • Sep 2008
                • 200

                Originally posted by Xi Wang
                I am still wondering whether the seed region is used only for counting mismatches.
                Yes. The setting for -l matters for the -n limit but not for the -e limit.

                Originally posted by Xi Wang
                If I want to use just the quality-score criterion and set -l to 0, will that work?
                No, -l must be set to 5 or greater.

                Ben

                • ramouz87
                  Member
                  • Oct 2009
                  • 35

                  Hi,
                  I'm new to the field of NGS (I was working mainly on microarray data analysis) and I'm starting to investigate common tools for sequence analysis.
                  I have human data (paired reads, 75 bases) and used Bowtie for the alignment.
                  I used standard parameters for the alignment:
                  bowtie -t -p 8 h_sapiens_37_asm ./s_8_1_sequence.fq ./s_8_1_sequence.fq.bowtie.align
                  bowtie -t -p 8 h_sapiens_37_asm ./s_8_2_sequence.fq ./s_8_2_sequence.fq.bowtie.align
                  bowtie -t -p 8 h_sapiens_37_asm -1 ./s_8_1_sequence.fq -2 ./s_8_2_sequence.fq ./s_8_sequence.fq.bowtie.align

                  and I get the following results, respectively:
                  # reads processed: 6660511
                  # reads with at least one reported alignment: 4615451 (69.30%)
                  # reads that failed to align: 2045060 (30.70%)
                  # reads with at least one reported alignment: 5050548 (75.83%)
                  # reads that failed to align: 1609963 (24.17%)
                  # reads with at least one reported alignment: 13371 (0.20%)
                  # reads that failed to align: 6647140 (99.80%)

                  The data quality is not optimal, but I guess that getting almost no paired-end alignments is not due to that, and that the parameters probably need tuning.
                  Could anyone give me some insight into the optimal settings for paired-end alignment?
                  Thanks in advance,
                  Best,
                  ramzi
                  Research Scientist - Bioinformatics
                  Sidra Medical and Research Center

                  • liu3zhen
                    Junior Member
                    • Sep 2009
                    • 8

                    A question about the number of mismatches: I cannot set -v 4 (error: -v arg must be at most 3). Does that mean Bowtie allows at most 3 mismatches regardless of read length? Thanks.

                    • liu3zhen
                      Junior Member
                      • Sep 2009
                      • 8

                      Another question:

                      I'm reading the manual for -k -a and --best.

                      I'm confused about what happens when -k or -a is combined with --best. I thought that if a read has several "best" alignments, they should all have roughly equal alignment scores. But the manual says that if -k > 1 or -a is specified together with --best, only the best alignments will be reported, and they appear in best-to-worst order, which means that the best alignments are not "equally best".

                      I hope to get your help soon. Thanks.

                      • Ben Langmead
                        Senior Member
                        • Sep 2008
                        • 200

                        Originally posted by ramouz87
                        The data quality is not optimal, but I guess that getting almost no paired-end alignments is not due to that, and that the parameters probably need tuning.
                        Could anyone give me some insight into the optimal settings for paired-end alignment?
                        Thanks in advance,
                        Best,
                        ramzi
                        Hi Ramzi,

                        The options you're looking for are almost certainly -I/-X and --ff/--fr/--rf. You need to have a reasonably good idea of the expected insert size and specify an appropriate range with -I/-X. You should also confirm that your paired-end protocol produces pairs in the fw/rev orientation. This is the typical configuration for Illumina. If your paired-end data has a different orientation, change it with --ff or --rf.
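
                        To make the role of these options concrete, here is a simplified model of a paired-end validity check. It is a sketch under stated assumptions, not Bowtie's implementation: the MateHit type and pair_is_valid function are hypothetical, both mates are assumed to hit the same reference sequence, the insert is measured from the leftmost base of the upstream mate to the rightmost base of the downstream mate, "ff" is reduced to "both mates on the same strand", and the default range of 0 to 250 is what I understand the -I/-X defaults to be.

```python
from dataclasses import dataclass


@dataclass
class MateHit:
    start: int     # 0-based leftmost reference position of the mate
    length: int    # aligned length in bases
    forward: bool  # True if the mate aligns to the forward strand


def pair_is_valid(m1, m2, min_insert=0, max_insert=250, orientation="fr"):
    """Check insert size and orientation for two mates on one reference."""
    left, right = (m1, m2) if m1.start <= m2.start else (m2, m1)
    # Span from the leftmost aligned base to the rightmost aligned base.
    insert = max(m1.start + m1.length, m2.start + m2.length) - left.start
    if not (min_insert <= insert <= max_insert):
        return False
    if orientation == "fr":
        # Mates point toward each other: upstream forward, downstream reverse.
        return left.forward and not right.forward
    if orientation == "rf":
        # Mates point away from each other: upstream reverse, downstream forward.
        return (not left.forward) and right.forward
    if orientation == "ff":
        # Simplification: both mates on the same strand.
        return left.forward == right.forward
    raise ValueError(f"unknown orientation: {orientation}")


# Two 75-base mates spanning a 300-base fragment.
m1 = MateHit(start=1000, length=75, forward=True)
m2 = MateHit(start=1225, length=75, forward=False)
print(pair_is_valid(m1, m2))                  # range 0..250
print(pair_is_valid(m1, m2, max_insert=500))  # wider range
```

                        In this model a correctly oriented pair from a 300-base fragment is rejected under the narrow range and accepted once the upper bound is raised, which is one way a paired-end run can report almost no alignments even though each mate aligns well on its own.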

                        Hope that helps,
                        Ben

                        • Ben Langmead
                          Senior Member
                          • Sep 2008
                          • 200

                          Originally posted by liu3zhen
                          A question about the number of mismatches: I cannot set -v 4 (error: -v arg must be at most 3). Does that mean Bowtie allows at most 3 mismatches regardless of read length? Thanks.
                          Hi liu3zhen,

                          To allow more than 3 mismatches in the alignment, use the Maq-like options: -n/-l/-e instead of -v.

                          Thanks,
                          Ben

                          • ecabot
                            Junior Member
                            • Jul 2008
                            • 6

                            are pairs considered separately wrt mismatches and uniqueness with soap-like policy

                            I have a couple of questions about how Bowtie deals with mismatches in a paired-end run (using -v 1 and -m 1). I have my guesses as to how things work, but I am hoping that someone knowledgeable (e.g. Ben) will chime in with the correct information.

                            1) Is it possible to obtain an alignment for a read pair where one read uniquely maps but the other doesn't? (my guess: no)

                            2) Does the mismatch setting apply to each read separately, or are the two reads taken together? In other words, if 1 mismatch is specified, can both members of a pair each have 1 mismatch? (my guess: yes)

                            • Ben Langmead
                              Senior Member
                              • Sep 2008
                              • 200

                              Originally posted by liu3zhen
                              But the manual says that if -k > 1 or -a is specified together with --best, only the best alignments will be reported, and they appear in best-to-worst order, which means that the best alignments are not "equally best".
                              That's right; --best does not limit the number of alignments Bowtie reports. If you ask for 1 alignment (default), --best guarantees it's the best. If you ask for -k 4, --best guarantees they're the 4 best, reported in best-to-worst order. If you ask for -a, --best guarantees that you'll get all of them in best-to-worst order.
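
                              The reporting behaviour described above amounts to "rank, then truncate". A minimal sketch, assuming each candidate alignment is summarised as a (mismatches, quality sum, position) tuple and that fewer mismatches, then a lower quality sum, counts as better; the function name, the tuple layout and the ranking key are illustrative assumptions, not Bowtie's internals.

```python
def report_alignments(candidates, k=1, report_all=False):
    """Return the alignments to report for one read, best first.

    candidates: list of (mismatches, qual_sum, position) tuples.
    """
    # Rank best-to-worst: fewer mismatches first, ties broken by lower quality sum.
    ranked = sorted(candidates, key=lambda c: (c[0], c[1]))
    # report_all mimics -a; otherwise keep the top k, as with -k.
    return ranked if report_all else ranked[:k]


hits = [(2, 40, 500), (0, 0, 100), (1, 30, 900), (1, 10, 300)]
print(report_alignments(hits))                   # default: the single best
print(report_alignments(hits, k=3))              # like -k 3 --best
print(report_alignments(hits, report_all=True))  # like -a --best
```

                              The reported alignments are ordered by quality but need not be tied with one another, which is why they are not "equally best".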

                              Thanks
                              Ben

                              • Ben Langmead
                                Senior Member
                                • Sep 2008
                                • 200

                                Originally posted by ecabot
                                1) Is it possible to obtain an alignment for a read pair where one read uniquely maps but the other doesn't? (my guess: no)
                                Definitely yes! That's exactly where paired-end sequencing pays off. If either read aligns uniquely, that alignment will be used as an anchor to look for the mate's alignment and, if it's found, that paired-end alignment will be reported.

                                Originally posted by ecabot
                                2) Does the mismatch setting apply to each read separately, or are the two reads taken together? In other words, if 1 mismatch is specified, can both members of a pair each have 1 mismatch? (my guess: yes)
                                The mismatch setting applies to each read. So, yes, if -v 1 is specified, *both* mates are allowed to have a mismatch.
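
                                As a small illustration of the per-mate rule, the sketch below applies an end-to-end mismatch limit to each mate independently. The function names and sequences are hypothetical, and each reference window is assumed to be given in the same orientation as its mate.

```python
def count_mismatches(read, ref):
    """Count positions where an ungapped, end-to-end alignment differs."""
    if len(read) != len(ref):
        raise ValueError("read and ref must have the same length")
    return sum(1 for r, g in zip(read, ref) if r != g)


def pair_within_limit(mate1, ref1, mate2, ref2, max_mms=1):
    """Apply the limit to each mate on its own, not to the pair's total."""
    return (count_mismatches(mate1, ref1) <= max_mms
            and count_mismatches(mate2, ref2) <= max_mms)


# One mismatch in each mate: two in total, but each mate is within the limit.
print(pair_within_limit("ACGT", "ACGA", "TTGA", "TTGC", max_mms=1))
# Two mismatches in mate 1 alone: the pair fails even though mate 2 is perfect.
print(pair_within_limit("ACGT", "AGGA", "TTGA", "TTGA", max_mms=1))
```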

                                Hope that helps,
                                Ben
