  • juan
    Member
    • Aug 2009
    • 14

    Split Read RNA-Seq mapping with bowtie?

    How does Bowtie handle RNA-seq data, has anyone tried? Can it map split reads? Any plans to add split-read mapping functionality?

    Comment

    • ewilbanks
      Member
      • Mar 2009
      • 83

      Originally posted by Ben Langmead View Post
      Hi Lizzy,

      I'd expect, oh, about 7-8 hours or so. Did it finish?

      Thanks,
      Ben
      Hi Ben,

      It did! Total run time was 9 hrs 22 min. Thanks!

      Lizzy

      Comment

      • ewilbanks
        Member
        • Mar 2009
        • 83

        Originally posted by Ben Langmead View Post
        Hi Layla,

        h_sapiens indexes the NCBI human reference contigs and h_sapiens_asm indexes the NCBI human reference assembly. Take a look at the scripts/make_h_sapiens.sh and scripts/make_h_sapiens_asm.sh files distributed with Bowtie to see exactly what fasta files were indexed and how.

        People often prefer the assembly because the coordinates output by bowtie are more immediately useful (e.g., they correspond to the hg18 coordinates in the Genome Browser).

        Thanks,
        Ben
        Hi Ben,

        Thanks for this, I was confused about this as well. Might be a useful tidbit to put near the downloads on your website? Thanks for all the support. Bowtie rocks!

        Lizzy

        Comment

        • Ben Langmead
          Senior Member
          • Sep 2008
          • 200

          Originally posted by juan View Post
          How does Bowtie handle RNA-seq data, has anyone tried? Can it map split reads? Any plans to add split-read mapping functionality?
          Hi Juan,

          Check out TopHat (linked to from the sidebar of the Bowtie website). TopHat was written by Cole Trapnell and it implements a layer on top of Bowtie that handles spliced alignments, along with several other aspects of alignment to the transcriptome and calling junctions.

          Hope that helps.

          Thanks,
          Ben

          Comment

          • forrest
            Junior Member
            • Sep 2009
            • 2

            Hi Ben,
            Can Bowtie do DNA methylation alignment? I haven't found it in your manual. How should the parameters be set?

            Thank you!

            Comment

            • para_seq
              Member
              • Aug 2009
              • 12

              Setting: Number of 'N's allowed in the reads

              Hi, Ben,

              I ran Bowtie on reads from Illumina sequencing. It is a very good piece of software, with high speed and relatively small memory usage. I have two related questions. 1. I wonder if there is a switch in Bowtie to filter out reads that contain more than a certain number of 'N's. 2. How many 'N's in each read (e.g. 35-nt long) do people usually allow for Illumina data? Thank you.

              Comment

              • Ben Langmead
                Senior Member
                • Sep 2008
                • 200

                Originally posted by forrest View Post
                Can Bowtie do DNA methylation alignment? I haven't found it in your manual. How should the parameters be set?
                There are ad hoc ways of using Bowtie to do DNA methylation alignment, e.g., you can index and query the strands of your target genome separately and change the bases to mimic the bisulfite reaction (Cs -> Ts, for the most part). Refinements to this scheme are also possible (e.g. treat lone Cs and CpGs differently when mimicking the bisulfite reaction). As usual, the two main problems are loss of signal (more mismatches -> fewer reads align) and signal bias (depending on the conversion scheme, there might be an inherent coverage bias toward methylated or toward unmethylated sites).
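
A minimal sketch of the in-silico conversion described above, assuming full conversion (every C becomes T) and ignoring the lone-C versus CpG refinement. The function names are hypothetical and are not part of Bowtie. Each converted strand would be indexed separately, and reads would need the same C-to-T conversion before alignment.

```python
def bisulfite_convert(seq):
    """Mimic full bisulfite conversion on one strand: every C becomes T."""
    return seq.upper().replace("C", "T")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def converted_strands(reference):
    """Return (forward, reverse) strands, each converted independently."""
    forward = bisulfite_convert(reference)
    reverse = bisulfite_convert(reverse_complement(reference))
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = converted_strands("CCGA")
    print(fwd)  # forward strand with Cs replaced by Ts
    print(rev)  # reverse-complement strand with Cs replaced by Ts
```

Because the conversion discards the C/T distinction, methylation calls have to be recovered afterwards by comparing the original (unconverted) read against the reference at each aligned position.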

                I am also looking at trying to support bisulfite more directly in the future.

                Thanks,
                Ben

                Comment

                • SillyPoint
                  Member
                  • May 2008
                  • 39

                  Circular genomes

                  Does Bowtie make any provision for circular genomes?

                  I could fake it out by appending N-1 bases from the start of the reference onto the end of it, where N is the read length. But I'd need a separate index for each value of N.

                  Or perhaps not. If I use a large N, some reads will map both to the real position near the start, and to the same sequence in the appended copy. Running with --best and -k 1 (the default), I should get a single report at one of those positions.
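
A rough sketch of this padding workaround, assuming 0-based alignment offsets and a single circular reference sequence. The helper names are hypothetical. Padding once with a generous maximum read length avoids building one index per read length.

```python
def pad_circular(reference, max_read_len):
    """Append the first (max_read_len - 1) bases of the reference to its end,
    so that reads spanning the origin can align as ordinary linear reads."""
    if max_read_len < 1:
        raise ValueError("max_read_len must be at least 1")
    return reference + reference[: max_read_len - 1]


def to_circular_coord(offset, genome_len):
    """Map a 0-based offset on the padded reference back onto the circle."""
    return offset % genome_len


if __name__ == "__main__":
    genome = "ACGTACGGTT"
    padded = pad_circular(genome, 4)
    print(padded)
    print(to_circular_coord(11, len(genome)))
```

Hits that start inside the appended copy fold back to their true position near the start of the genome, so duplicate reports of the same locus can be merged after alignment.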

                  But it would be a lot cleaner just to have a --circular option.

                  --TS

                  Comment

                  • Ben Langmead
                    Senior Member
                    • Sep 2008
                    • 200

                    Originally posted by SillyPoint View Post
                    But it would be a lot cleaner just to have a --circular option.
                    You're correct that there is no --circular option. Your workaround is good, though.

                    Thanks,
                    Ben

                    Comment

                    • Ben Langmead
                      Senior Member
                      • Sep 2008
                      • 200

                      Originally posted by para_seq View Post
                      1. I wonder if there is a switch in Bowtie to filter out reads that contain more than a certain number of 'N's.
                      In effect, that's what the -v/-n settings do. Ns count as mismatches, so if the alignment policy is "-v 2", reads with more than 2 Ns will not align. If Bowtie's options don't do exactly what you need, you could also write a very simple script that filters out reads according to your own criteria beforehand.
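
One possible shape for such a pre-filtering script, assuming raw reads held as plain strings (one sequence per read). The function names and the threshold are illustrative only.

```python
def too_many_ns(read, max_n):
    """Return True if the read contains more than max_n ambiguous bases."""
    return read.upper().count("N") > max_n


def filter_reads(reads, max_n):
    """Keep only the reads with at most max_n Ns, preserving input order."""
    return [read for read in reads if not too_many_ns(read, max_n)]


if __name__ == "__main__":
    reads = ["ACGT", "ANNT", "NNNA"]
    for read in filter_reads(reads, max_n=2):
        print(read)
```

Filtering beforehand keeps the N threshold independent of the mismatch policy, which is useful if the number of tolerated Ns should differ from the number of tolerated mismatches.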

                      Originally posted by para_seq View Post
                      2. How many 'N's in each read (e.g. 35-nt long) do people usually allow for Illumina data? Thank you.
                      That depends very much on the quality of the data. Some sets of reads have many, many Ns, and some have Ns systematically at certain positions. My advice is to try various parameters and see what seems to give the best result.

                      Thanks,
                      Ben

                      Comment

                      • para_seq
                        Member
                        • Aug 2009
                        • 12

                        Originally posted by Ben Langmead View Post
                        In effect, that's what the -v/-n settings do. Ns count as mismatches, so if the alignment policy is "-v 2", reads with more than 2 Ns will not align. If Bowtie's options don't do exactly what you need, you could also write a very simple script that filters out reads according to your own criteria beforehand.



                        That depends very much on the quality of the data. Some sets of reads have many, many Ns, and some have Ns systematically at certain positions. My advice is to try various parameters and see what seems to give the best result.

                        Thanks,
                        Ben
                        Thank you very much, Ben.

                        Comment

                        • lix
                          Member
                          • Sep 2009
                          • 17

                          Hi Ben,

                          Thank you for Bowtie. It seems to be widely used, and I have recently started running it. There are some option settings I'm confused about, and I hope you can give me some suggestions:

                          I am mapping ChIP-seq reads (raw sequences) to the reference genome (Homo sapiens), and I want the read length to be 25bp with up to 2 mismatches allowed. So I chose the options "--best -r -3". Are these option settings correct for mapping? If not, how should the options be set?
                          I also built a reference genome index using "bowtie-build". Is there any difference between this and the pre-built indexes on the website?

                          Thanks,

                          -lix

                          Comment

                          • Ben Langmead
                            Senior Member
                            • Sep 2008
                            • 200

                            Hi lix,

                            Originally posted by lix View Post
                            I am mapping ChIP-seq reads (raw sequences) to the reference genome (Homo sapiens), and I want the read length to be 25bp with up to 2 mismatches allowed. So I chose the options "--best -r -3". Are these option settings correct for mapping?
                            For raw reads, yes, use "-r". For allowing up to 2 mismatches, use "-v 2". It's also reasonable to use "--best"; it's noticeably slower than the default mode, but you might prefer having the best-ness guarantee.
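
As a sketch, the options above could be assembled into a command line as follows. It only builds the argument list and does not run anything. The reads file name is a placeholder, the index basename is borrowed from the pre-built index names mentioned earlier in this thread, and the helper itself is hypothetical.

```python
def build_bowtie_args(index_base, reads_file, mismatches=2, best=True):
    """Build an argument list for bowtie using raw reads (-r),
    an end-to-end mismatch limit (-v), and optionally --best."""
    args = ["bowtie", "-r", "-v", str(mismatches)]
    if best:
        args.append("--best")
    # Positional arguments: index basename first, then the reads file.
    args += [index_base, reads_file]
    return args


if __name__ == "__main__":
    print(" ".join(build_bowtie_args("h_sapiens_asm", "reads.txt")))
```

The resulting list could be passed to `subprocess.run` on a machine where bowtie is installed and the index files are present.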

                            Originally posted by lix View Post
                            I also built a reference genome index using "bowtie-build". Is there any difference between this and the pre-built indexes on the website?
                            No difference. If you build an index using any of the scripts in the "scripts" subdirectory of the bowtie package, you should end up with an index that's effectively identical to the corresponding pre-built index. All pre-built indexes are built with bowtie-build.

                            Thanks,
                            Ben

                            Comment

                            • lix
                              Member
                              • Sep 2009
                              • 17

                              Ben, thanks for your kind reply. I'm also wondering whether the "-3" option is the correct way to set the read length. For example, if my reads are 36bp long and I want them to be just 25bp, I set "-3 11". Is that correct?

                              Thanks again,
                              lix

                              Comment

                              • Ben Langmead
                                Senior Member
                                • Sep 2008
                                • 200

                                Originally posted by lix View Post
                                Ben, thanks for your kind reply. I'm also wondering whether the "-3" option is the correct way to set the read length. For example, if my reads are 36bp long and I want them to be just 25bp, I set "-3 11". Is that correct?
                                Yes, "-3 11" will turn a 36bp read into a 25bp read with 11 bases trimmed from the 3' end. To trim from the front (5') end, use "-5".
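
The trimming arithmetic can be illustrated with a small sketch. This is not Bowtie's own code, just a hypothetical helper that mirrors what the "-5" and "-3" options do to a read.

```python
def trim_read(read, trim5=0, trim3=0):
    """Remove trim5 bases from the 5' (left) end and trim3 bases
    from the 3' (right) end of a read."""
    if trim5 < 0 or trim3 < 0:
        raise ValueError("trim lengths must be non-negative")
    # Clamp the end so that over-trimming yields an empty read.
    end = max(trim5, len(read) - trim3)
    return read[trim5:end]


if __name__ == "__main__":
    read = "A" * 36
    print(len(trim_read(read, trim3=11)))
```

With a 36bp read, trimming 11 bases from the 3' end leaves 36 - 11 = 25 bases, which matches the read length asked about above.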

                                Thanks,
                                Ben

                                Comment
