  • ieuanclay
    Member
    • Feb 2009
    • 27

    #76
    Yes - great thank you!

    Ieuan

    Comment

    • ieuanclay
      Member
      • Feb 2009
      • 27

      #77
      Sorry to keep on about this, I just want to get it clear.

      By default:
      -k is 1, so only one alignment (the best according to -n/-v/-l/-e) is reported.
      -m is unlimited

      So if a read has multiple valid alignments, one of which is better than the others (fewer mismatches, though the others are still valid), and I specify -k 1 -m 1, will the best alignment be given, or will it be pumped into --maxfa?

      If I am worried about this sort of situation, should I specify --best?

      Comment

      • Ben Langmead
        Senior Member
        • Sep 2008
        • 200

        #78
        ieuanclay wrote: "so if a read has multiple valid alignments, one of which is better than the others (fewer mismatches, though the others are still valid), and I specify -k 1 -m 1, will the best alignment be given, or will it be pumped into --maxfa?"
        In that situation, no alignments will be printed and the read will go into the --maxfa/--maxfq file(s).

        ieuanclay wrote: "If I am worried about this sort of situation, should I specify --best?"
        That won't help in this case because --best doesn't change which alignments are considered valid; rather, it changes which valid alignments are reported by Bowtie. The -v/-n/-l/-e options are the only ones that change which alignments are considered valid by Bowtie. If the set of valid alignments happens to be stratified (e.g., there's an exact hit and a bunch of 1-mismatch hits), the existence of the better alignments doesn't invalidate the worse ones.
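The interaction between -k and -m described above can be sketched as a toy model. This is not Bowtie's code: the function name and the (position, mismatches) tuples are made up for illustration, and the model only assumes what the post says, namely that a read with more than -m valid alignments has nothing reported and is sent to the --maxfa/--maxfq file(s).

```python
def report_alignments(valid_alignments, k=1, m=None):
    """Toy model of the -k / -m interaction (hypothetical helper).

    valid_alignments: (position, n_mismatches) tuples that already passed
    the alignment policy (-v or -n/-l/-e), in the order they were found.
    Returns (reported_alignments, goes_to_max_file).
    """
    if m is not None and len(valid_alignments) > m:
        # More than m valid alignments: nothing is reported and the read
        # is written to the --maxfa/--maxfq file(s) instead.
        return [], True
    # Otherwise report up to k of the valid alignments.
    return valid_alignments[:k], False


# One exact hit and two 1-mismatch hits: all three are valid.
hits = [(1042, 0), (88310, 1), (4471, 1)]
print(report_alignments(hits, k=1, m=1))  # ([], True)
print(report_alignments(hits, k=1))       # ([(1042, 0)], False)
```

Note that the exact hit does not rescue the read under -m 1: the limit is applied to the count of valid alignments, not to the count in the best stratum.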

        If this poses a problem, I'd be interested to hear more about what you're looking for...

        Thanks,
        Ben

        Comment

        • ieuanclay
          Member
          • Feb 2009
          • 27

          #79
          Hi Ben,

          Thanks for your help so far - I am relatively new to mapping (but not so new that I am not impressed by bowtie!), so please excuse any dopey questions...

          My worry is that I will lose reads that have perfect alignments if I also have other near matches. I guess I can make -n smaller (1 say). But I suppose the caveat is that if there is an exact match and a near match, you cannot say which is the correct one - sequencing error etc... - and so it is conservative to reject the read as having multiple alignments?

          For what I work on, it is really important to be very sure about where the reads map... so maybe it would be good to keep -n at 1 and be more confident about the reads? I don't want to have to refer back to alignment confidences in analyses later on, but say that beyond a certain confidence threshold I am happy with them all. If I am going to reduce -n, should I also reduce -l to 20 or 25?

          I guess what I really want to know is: how stringent are the default alignment settings? Can I make them more stringent without losing a lot of 'true' (but imperfect) alignments?

          Thanks again,

          Ieuan

          Comment

          • Ben Langmead
            Senior Member
            • Sep 2008
            • 200

            #80
            ieuanclay wrote: "My worry is that I will lose reads that have perfect alignments if I also have other near matches. I guess I can make -n smaller (1 say). But I suppose the caveat is that if there is an exact match and a near match, you cannot say which is the correct one - sequencing error etc... - and so it is conservative to reject the read as having multiple alignments?"
            I see your concern. In your case, you may want to consider running Bowtie with -a --nostrata (or -k <some int> --nostrata) and then postprocessing the results in whatever way you think is appropriate for your application. If you'd like to reject reads on the basis of the number of alignments found in the *best* match stratum (as opposed to all strata), you can do that with a script.
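A postprocessing script of the kind suggested above might look roughly like the following sketch. It assumes tab-separated Bowtie output in which column 1 is the read name and column 8 holds comma-separated mismatch descriptors (empty or absent for an exact match); that layout is an assumption to check against your Bowtie version, and the function names and sample lines are made up.

```python
from collections import defaultdict


def count_mismatches(fields, mm_col=7):
    """Count comma-separated mismatch descriptors in column mm_col.

    Assumption: an exact match leaves that column empty or absent.
    """
    descriptors = fields[mm_col].strip() if len(fields) > mm_col else ""
    return len(descriptors.split(",")) if descriptors else 0


def unique_in_best_stratum(lines, mm_col=7):
    """Keep a read only if its best stratum holds exactly one alignment."""
    hits_by_read = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        hits_by_read[fields[0]].append((count_mismatches(fields, mm_col), line))

    kept = {}
    for read, hits in hits_by_read.items():
        best = min(n for n, _ in hits)
        best_hits = [line for n, line in hits if n == best]
        if len(best_hits) == 1:
            kept[read] = best_hits[0]
    return kept


# Made-up alignments for illustration only.
example = [
    "r1\t+\tchr1\t100\tACGT\tIIII\t0\t",
    "r1\t+\tchr2\t500\tACGT\tIIII\t0\t2:A>G",
    "r2\t-\tchr1\t900\tTTGA\tIIII\t0\t",
    "r2\t-\tchr3\t40\tTTGA\tIIII\t0\t",
]
print(sorted(unique_in_best_stratum(example)))  # ['r1']
```

Here r1 is kept because its single exact hit is alone in the best stratum, while r2 is rejected because it has two exact hits.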

            Another alternative is to do multiple Bowtie runs with decreasingly stringent alignment policies (e.g. -n 0, then -n 1, etc). The input to each run might be the --unfq reads from the run before.
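The multi-pass idea could be scripted along these lines. The sketch only builds the command lines and does not run them. The file names are hypothetical, and the argument order (options, then index, reads, output) and the -q flag for FASTQ input are assumptions to verify against the manual for your Bowtie version.

```python
def tiered_commands(index, reads, seed_mismatches=(0, 1, 2)):
    """Build one bowtie command per pass, most stringent policy first.

    Each pass writes its unaligned reads with --unfq, and that file
    becomes the input of the next, less stringent pass.
    File names here are hypothetical placeholders.
    """
    commands = []
    current_reads = reads
    for n in seed_mismatches:
        unaligned = f"pass_n{n}.unaligned.fq"
        output = f"pass_n{n}.map"
        commands.append(
            ["bowtie", "-q", "-n", str(n), "--unfq", unaligned,
             index, current_reads, output]
        )
        current_reads = unaligned
    return commands


for cmd in tiered_commands("my_index", "reads.fq"):
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run once the flags have been checked. Alignments from earlier passes then come from stricter policies than those from later passes.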

            ieuanclay wrote: "I guess what I really want to know is: how stringent are the default alignment settings? Can I make them more stringent without losing a lot of 'true' (but imperfect) alignments?"
            The default alignment policy is -n 2 -l 28 -e 70, which mimics Maq's defaults (with the caveat that Maq actually lets through some alignments with 3 mismatches in the seed). Whether you can make the policy more stringent without losing true alignments depends on how different your query organism is from the reference. Intuitively, the default policy has no problem finding alignments where there are 2 SNPs very close together, but might have a problem finding alignments where there are 3 SNPs very close together. The same goes for -n 1 and 1 SNP vs. 2 SNPs. It's up to you to determine how well those policies fit your problem.
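To make the -n/-l/-e policy concrete, here is a simplified validity check. It assumes that -n caps the mismatches in the first -l bases (the seed) and that -e caps the summed Phred quality over all mismatched positions. It ignores any quality rounding Bowtie may apply and the Maq caveat mentioned above, and the function name is made up.

```python
def is_valid_n_mode(mismatch_positions, quals, n=2, l=28, e=70):
    """Simplified check of an alignment against an -n/-l/-e policy.

    mismatch_positions: 0-based offsets from the high-quality end of the read.
    quals: Phred quality value for each read position.
    """
    # Mismatches that fall inside the seed (the first l bases).
    seed_mismatches = sum(1 for p in mismatch_positions if p < l)
    # Quality is summed over every mismatched position, seed or not.
    quality_sum = sum(quals[p] for p in mismatch_positions)
    return seed_mismatches <= n and quality_sum <= e


quals = [30] * 36  # a 36 bp read with uniform quality 30

print(is_valid_n_mode([3, 10], quals))       # True: 2 seed mismatches, sum 60
print(is_valid_n_mode([3, 10, 20], quals))   # False: 3 seed mismatches
print(is_valid_n_mode([3, 10, 30], quals))   # False: quality sum 90 > 70
print(is_valid_n_mode([3, 10], quals, n=1))  # False: 2 seed mismatches > 1
```

Under these assumptions, two nearby SNPs pass the default policy and three do not, which matches the intuition given above.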

            Hope that helps,
            Ben

            Comment

            • What_Da_Seq
              Member
              • Jul 2008
              • 28

              #81
              How does Bowtie handle ambiguous bases in the reference genome?

              Does anybody have experience in preparing a Bowtie search index where certain bases have been replaced with ambiguity codes like "Y", which stands for "C" or "T"? If so, will these locations be called matches or mismatches if the Solexa read to be aligned has either a "C" or a "T" at that position?

              Thanks

              Comment

              • Ben Langmead
                Senior Member
                • Sep 2008
                • 200

                #82
                The Bowtie indexing step elides stretches of ambiguous bases in the reference. As a result, alignments that overlap an ambiguous base in the reference are never considered "valid" by Bowtie and will not be reported.

                This is explained in a couple of paragraphs in the manual that are new as of 0.9.9.1:

                A result of Bowtie's indexing strategy is that alignments involving one or more ambiguous reference characters (N, -, R, Y, etc.) are considered invalid by Bowtie, regardless of the alignment policy. This is true only for ambiguous characters in the reference; alignments involving ambiguous characters in the read are legal, subject to the alignment policy.

                Also, alignments that "fall off" the reference sequence are not considered legal by Bowtie, though some such alignments will become legal once gapped alignment is implemented.
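The two rules quoted above can be illustrated with a small check on a candidate alignment window. This is only a model of the stated behaviour, not of how the index works internally; the function names and the toy reference are made up.

```python
UNAMBIGUOUS = set("ACGT")


def overlaps_ambiguous(reference, offset, read_length):
    """True if the aligned window covers any non-ACGT reference character."""
    window = reference[offset:offset + read_length].upper()
    return any(base not in UNAMBIGUOUS for base in window)


def falls_off(reference, offset, read_length):
    """True if the alignment would extend past either end of the reference."""
    return offset < 0 or offset + read_length > len(reference)


def could_be_valid(reference, offset, read_length):
    """An alignment failing either rule is never reported, whatever the policy."""
    return not falls_off(reference, offset, read_length) and not overlaps_ambiguous(
        reference, offset, read_length
    )


ref = "ACGTACGYACGTACGT"  # toy reference with a Y at 0-based position 7

print(could_be_valid(ref, 0, 6))   # True: window ACGTAC is unambiguous
print(could_be_valid(ref, 4, 6))   # False: window covers the Y
print(could_be_valid(ref, 12, 6))  # False: runs off the end of the reference
```

So a read carrying C or T at the position of a reference Y is neither a match nor a mismatch there; under the rules above, the alignment is simply not considered.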

                Comment

                • What_Da_Seq
                  Member
                  • Jul 2008
                  • 28

                  #83
                  Thanks Ben. I could not identify an option for "bowtie-build" that is geared towards maximum efficiency in generating alignments (the fewest non-aligned reads in the Bowtie alignment), as opposed to speed or memory efficiency.
                  Your help is appreciated.

                  Thanks

                  Comment

                  • Ben Langmead
                    Senior Member
                    • Sep 2008
                    • 200

                    #84
                    Yes, all bowtie-build options are identical in terms of the index's ability to generate alignments (except those that have slight, non-specific effects, like --ntoa or --oldpmap).

                    Comment

                    • tniranj1
                      Junior Member
                      • Mar 2009
                      • 4

                      #85
                      Help with installation

                      I'm new to next-gen sequencing and have started playing around with different alignment tools for data that will soon be coming in to my lab. From what I've heard, Bowtie sounds perfect, and I appreciate the speedy feedback that's been made available to the community.
                      I do have a slight installation problem. I get the following error during "Make".

                      SeqAn-1.1/seqan/basic/basic_generated_forwards.h:507: error: parse error before numeric constant
                      SeqAn-1.1/seqan/basic/basic_generated_forwards.h:761: confused by earlier errors, bailing out
                      make: *** [bowtie-build] Error 1

                      I installed the platform-independent version on my Mac (OS 10.3.9... yes, it's old I know, we're upgrading soon). Appreciate any help with resolving this.

                      -TiN

                      Comment

                      • Ben Langmead
                        Senior Member
                        • Sep 2008
                        • 200

                        #86
                        What version of g++ do you have (try 'g++ -v') and what version of Bowtie are you trying to compile? Is there another g++ version installed besides the default? I'm not familiar with 10.3, but you can try running g++3 and g++4 and see if either of those work.

                        Thanks,
                        Ben

                        Comment

                        • tniranj1
                          Junior Member
                          • Mar 2009
                          • 4

                          #87
                          I'm using gcc version 3.3 with bowtie 0.9.9.1. Do I need version 4 or higher for g++ in order for installation of bowtie to work, or is 3 sufficient?
                          Thanks,
                          -TiN

                          Comment

                          • Ben Langmead
                            Senior Member
                            • Sep 2008
                            • 200

                            #88
                            Well, the oldest g++ I've used is 3.4.6, which works without warnings. I just tried 3.2.3 and got a bunch of warnings and errors; mostly in the SeqAn headers. So, yes, if you happen to have a newer g++ version somewhere on your machine then please try that. E.g., try typing g++ then hitting tab to see if there's something called g++4 or g++34 or similar. If there is something called g++34, for example, then make bowtie using 'make GCC_SUFFIX=34'. Let me know if that doesn't work; I can try to fix this in a future version of Bowtie.

                            Thanks,
                            Ben

                            Comment

                            • tniranj1
                              Junior Member
                              • Mar 2009
                              • 4

                              #89
                              I just installed gcc 3.4.6 and changed the etc/profile $PATH to reflect the update. When I ran make again, significantly more SeqAn-1.1 errors popped up (too much to post). There is no suffix to the new g++ file. Should I shoot for gcc4.x or would it be more appropriate to wait until our Leopard computer comes in... I would prefer to start testing with this computer now, though.
                              Really appreciate the help!
                              -TiN

                              Comment

                              • Ben Langmead
                                Senior Member
                                • Sep 2008
                                • 200

                                #90
                                Darn! Sorry to waste your time.

                                I can testify that the gcc4 and gcc346 versions on Leopard (from the developer's tools) work fine for me, as do the various gcc4 versions I've tried on Linux. I'm sorry that 3.4.6 doesn't seem to be working under 10.3. I will add it to my TODO list to address some of that problematic SeqAn code before the next release. In the meantime, since 3.4.6 didn't work, waiting for your Leopard computer is the option that has the least chance of wasting more of your time.

                                Ben

                                Comment
