  • Wind
    Junior Member
    • Jul 2009
    • 2

    I think Bowtie is a nice tool for short-read alignment. However, I ran into a problem with paired-end mapping. I produced 75 bp reads by simulating Illumina high-throughput sequencing and aligned them to the reference sequence. Only a few alignments, fewer than 10, are reported. Since 1,300,000 alignments are reported with non-paired-end mapping, I think the paired-end reads are being mapped incorrectly.
    My command is "bowtie -p 8 -a -y -X 650 human -1 reads_1.fa -2 reads_2.fa output.map".

    Can anybody tell me what the problem is?
    Last edited by Wind; 08-03-2009, 03:35 AM.

    • Ben Langmead
      Senior Member
      • Sep 2008
      • 200

      Originally posted by Wind
      I think Bowtie is a nice tool for short-read alignment. However, I ran into a problem with paired-end mapping. I produced 75 bp reads by simulating Illumina high-throughput sequencing and aligned them to the reference sequence. Only a few alignments, fewer than 10, are reported. Since 1,300,000 alignments are reported with non-paired-end mapping, I think the paired-end reads are being mapped incorrectly.
      My command is "bowtie -p 8 -a -y -X 650 human -1 reads_1.fa -2 reads_2.fa output.map".
      This is probably due to the -I/--minins, -X/--maxins, and/or --fr/--rf/--ff options being set incorrectly. Please double-check the manual's description of those options and verify that your invocation matches the way you've simulated your reads. Also, make sure the simulated read files are formatted correctly, with all mates lining up properly.

      Thanks,
      Ben
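
The check that "all mates line up properly" can be sketched in a few lines of Python. This is only a minimal sketch, not part of Bowtie: the function names are hypothetical, and it assumes that the two mates of a pair carry the same read name apart from an optional trailing /1 or /2, which may not match how a given simulator names its reads.

```python
from itertools import zip_longest


def read_names(handle):
    """Yield the name (first word after '>') of each FASTA record."""
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            yield fields[0] if fields else ""


def strip_mate_suffix(name):
    """Drop a trailing /1 or /2 (an assumed naming convention)."""
    if name.endswith(("/1", "/2")):
        return name[:-2]
    return name


def first_mismatch(mates1, mates2):
    """Return (index, name1, name2) for the first pair that does not line up.

    Returns None when every record in the first file has a same-named
    partner at the same position in the second file. A missing mate
    (files of different length) is reported with None as its name.
    """
    pairs = zip_longest(read_names(mates1), read_names(mates2))
    for index, (name1, name2) in enumerate(pairs):
        if name1 is None or name2 is None:
            return (index, name1, name2)
        if strip_mate_suffix(name1) != strip_mate_suffix(name2):
            return (index, name1, name2)
    return None
```

For the files in the original post, this could be called as first_mismatch(open("reads_1.fa"), open("reads_2.fa")). A result of None only says that the names line up; it says nothing about insert size or mate orientation.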

      • Wind
        Junior Member
        • Jul 2009
        • 2

        Thanks

        Hi Ben,

        Thanks for your advice. There were many Ns in the simulated data, which may have interfered with paired-end mapping. I'll try other data sets. Thanks.

        • tianell
          Junior Member
          • Aug 2009
          • 1

          Ben, help me..

          Hi Ben,
          I have a question about the alignment result messages.
          When I align short reads to a reference using Bowtie, can I get a result message for reads that did not match?

          I could not find an option that produces such a message.

          I want reads to be reported even when they do not align to the reference, so that I can use that information (not aligned!).

          I will wait for your answer, Ben. Thank you so much.

          • Ben Langmead
            Senior Member
            • Sep 2008
            • 200

            Hi tianell,

            Originally posted by tianell
            When I align short reads to a reference using Bowtie, can I get a result message for reads that did not match?

            I could not find an option that produces such a message.

            I want reads to be reported even when they do not align to the reference, so that I can use that information (not aligned!).
            Sorry, no, there is no option to print such a message. I'll add this as a feature request. In the meantime, it's quite easy to deduce that number either by using the --un/--max options (and then counting), or by subtracting the reported number from the number of input reads.

            Thanks,
            Ben
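
The subtraction approach can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the alignments are in Bowtie's default tab-separated output with the read name as the first field, and the reads are unpaired. Because options such as -a can report several alignments for one read, the sketch counts distinct read names rather than lines.

```python
def count_aligned_reads(alignment_lines):
    """Count distinct read names in the first tab-separated column."""
    names = set()
    for line in alignment_lines:
        if line.strip():
            # Assumption: the read name is the first tab-separated field.
            names.add(line.split("\t", 1)[0])
    return len(names)


def count_unaligned_reads(n_input_reads, alignment_lines):
    """Unaligned reads = input reads minus reads with at least one alignment."""
    return n_input_reads - count_aligned_reads(alignment_lines)
```

Holding every read name in a set costs memory on very large runs. When at most one alignment is reported per read, counting output lines gives the same number without that cost.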

            • joa_ds
              Member
              • Dec 2008
              • 52

              Isn't there a feature to export unmapped reads to a file?

              I always run bowtie and export unmapped reads and repeats using

              --unfq unaligned.fastq --maxfa duplicates.fastq

              Comparing the size of both files with your original file gives you an approximate idea of the percentage of unaligned reads and repeats.

              • bioinfosm
                Senior Member
                • Jan 2008
                • 483

                I wanted to discuss a use-case:
                A collection of 172 million reads, ranging from 36 to 76 bases long, was mapped to a reference with bowtie.

                $ ./bowtie --best --un leftover -p 4 -t reference reads mapped
                $ grep -c '^@' leftover
                154828705
                $ wc -l mapped
                16269083 mapped

                The total of leftover and mapped is less than what we started with. Are the remaining reads mapping to multiple locations, and thus omitted in both these files?
                --
                bioinfosm

                • Ben Langmead
                  Senior Member
                  • Sep 2008
                  • 200

                  Hi bioinfosm,

                  Originally posted by bioinfosm
                  The total of leftover and mapped is less than what we started with. Are the remaining reads mapping to multiple locations, and thus omitted in both these files?
                  That shouldn't be the case. When only --un is used (as opposed to both --un and --max), both the unaligned reads and the reads with a number of alignments exceeding the -m limit will go into the --un file. But you're not using the -m option, so no reads should be suppressed due to multiple alignments.

                  How are you counting the number of reads in your input set? Note that grep -c '^@' isn't necessarily correct because quality strings can also start with @.

                  Thanks,
                  Ben
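
The pitfall with grep -c '^@' can be illustrated with a short Python sketch. It assumes plain four-line FASTQ records (header, sequence, '+' line, qualities) with no wrapped lines; the function names are hypothetical.

```python
def count_fastq_records(lines):
    """Count records as line count / 4, assuming four lines per record."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    return n_lines // 4


def count_at_lines(lines):
    """Mimic grep -c '^@': count every line that starts with '@'."""
    return sum(1 for line in lines if line.startswith("@"))


# Two records; the quality string of the first one starts with '@'.
example = [
    "@read1\n", "ACGT\n", "+\n", "@III\n",
    "@read2\n", "TTTT\n", "+\n", "IIII\n",
]
print(count_fastq_records(example))  # 2
print(count_at_lines(example))       # 3
```

Counting '@' lines reports three reads where there are only two, so on a real file the grep count can exceed the true number of reads.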

                  • bioinfosm
                    Senior Member
                    • Jan 2008
                    • 483

                    Thanks, Ben. The light bulb just came on!
                    --
                    bioinfosm

                    • davisc
                      Member
                      • Oct 2008
                      • 14

                      Question about RepeatMasked hg18 index

                      I'm doing RNA-Seq on human samples. In many instances I am mapping with the -m 1 -v 2 --best criteria against the preassembled hg18.asm index available on the download site. I would like to know how Bowtie handles Ns in the indices. I am also wondering whether it is possible to cut down the mapping time by building and mapping against a repeatmasked version of the genome.

                      • Ben Langmead
                        Senior Member
                        • Sep 2008
                        • 200

                        Originally posted by davisc
                        I would like to know how Bowtie handles Ns in the indices. I am also wondering whether it is possible to cut down the mapping time by building and mapping against a repeatmasked version of the genome.
                        When Bowtie indexes the reference, it elides non-A/C/G/T characters. So if you index a reference with stretches of Ns, Bowtie will never report an alignment spanning any of the stretches.

                        And yes, mapping against the repeatmasked version of the genome (and omitting -m 1) ought to be noticeably faster.

                        Ben
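
To see which parts of a reference are affected, one could list its runs of non-A/C/G/T characters before indexing. The sketch below is an illustration only and does not reproduce Bowtie's indexing code: the function names are hypothetical, and it treats lowercase a/c/g/t (as found in soft-masked references) as ordinary bases, which is an assumption.

```python
import re

# One or more consecutive characters that are not A, C, G or T.
NON_ACGT_RUN = re.compile(r"[^ACGTacgt]+")


def non_acgt_runs(sequence):
    """Return (start, end) pairs, 0-based and end-exclusive, for each run."""
    return [(m.start(), m.end()) for m in NON_ACGT_RUN.finditer(sequence)]


def window_overlaps_run(start, length, runs):
    """True if the window [start, start + length) overlaps any run."""
    end = start + length
    return any(run_start < end and run_end > start
               for run_start, run_end in runs)


runs = non_acgt_runs("ACGTNNNNACGTRACGT")
print(runs)                              # [(4, 8), (12, 13)]
print(window_overlaps_run(2, 4, runs))   # True
print(window_overlaps_run(8, 4, runs))   # False
```

A read whose true origin overlaps one of these runs would not be reported as an alignment there, which matches the behaviour described above for stretches of Ns.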

                        • ewilbanks
                          Member
                          • Mar 2009
                          • 83

                          Indexing human genome?

                          Hi!

                          I'm building an index of the human genome locally and was wondering how long this usually takes. It's been running for about 3 hours, and I'd like to know what to expect. I'm on a dual-core Mac with 4 GB of RAM.

                          Thanks!
                          Lizzy

                          • Ben Langmead
                            Senior Member
                            • Sep 2008
                            • 200

                            Hi Lizzy,

                            I'd expect, oh, about 7-8 hours or so. Did it finish?

                            Thanks,
                            Ben

                            • Layla
                              Member
                              • Sep 2008
                              • 58

                              I'm a newbie to Bowtie... tired of counting down the hours using MAQ.

                              I'm currently building an index using Bowtie. What is the difference between
                              h_sapiens_asm.ebwt.zip and
                              h_sapiens.ebwt.zip?

                              Thanks

                              L

                              • Ben Langmead
                                Senior Member
                                • Sep 2008
                                • 200

                                Hi Layla,

                                h_sapiens indexes the NCBI human reference contigs and h_sapiens_asm indexes the NCBI human reference assembly. Take a look at the scripts/make_h_sapiens.sh and scripts/make_h_sapiens_asm.sh files distributed with Bowtie to see exactly what fasta files were indexed and how.

                                People often prefer the assembly because the coordinates output by bowtie are more immediately useful (e.g., they correspond to the hg18 coordinates in the Genome Browser).

                                Thanks,
                                Ben
