  • staylor
    Member
    • Feb 2009
    • 17

    miRNA mapping using BOWTIE

    Hi,

    Can bowtie be used for mapping miRNAs to the genome, and if so, what are the best parameters to use? I have FASTQ files from which I have removed the adapter sequence, leaving 18-23mers.

    Would

    bowtie -l 18 --best --strata

    be appropriate?

    Thanks.
  • whsqwghlm
    Member
    • Jun 2009
    • 14

    #2
    We've been using the following to report up to 101 exact matches per read:
    bowtie -k 101 -v 0

    Our workflow uniquifies the sequences before alignment so we're not concerned about quality values. I'm also guessing that the miRNA sequences are sufficiently conserved for us not to worry about mismatches.
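The "uniquify before alignment" step mentioned above can be sketched roughly as follows. This is only an illustration, not the poster's actual workflow: the function names and the `seq<N>_x<count>` header convention are assumptions made for the example. Identical reads are collapsed into one FASTA record whose header carries the read count, so quality values are dropped.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def read_fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def collapse_reads(sequences: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (sequence, count) pairs, most abundant first, ties by sequence."""
    counts = Counter(sequences)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def to_fasta(collapsed: List[Tuple[str, int]]) -> str:
    """Write one FASTA record per unique sequence.

    The header format seq<N>_x<count> is a hypothetical convention
    chosen for this sketch so the count survives the alignment step.
    """
    return "".join(
        f">seq{i}_x{n}\n{seq}\n" for i, (seq, n) in enumerate(collapsed, 1)
    )
```

The resulting FASTA file can be given to bowtie with its `-f` option, and the counts recovered from the read names afterwards.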

    However, I'm very interested in the views of others on this.

    • yjhua2110
      Member
      • Nov 2009
      • 68

      #3
      In our deepBase database we use the options -k 200 -v 0, which instruct Bowtie to report up to 200 perfect hits for each read.

      deepBase is a platform for annotating and discovering small and long ncRNAs from next generation sequencing data. It is available at http://deepbase.sysu.edu.cn

      • houhuabin
        Member
        • Apr 2009
        • 23

        #4
        Are you looking for this?
        Last edited by houhuabin; 02-02-2010, 07:58 AM.

        • whsqwghlm
          Member
          • Jun 2009
          • 14

          #5
          Could well be. However, the link is broken. I would be very grateful if you could fix it. Thanks!

          • houhuabin
            Member
            • Apr 2009
            • 23

            #6
            Sorry for that, now it is fixed.

            Thanks!
            Last edited by houhuabin; 02-02-2010, 08:03 AM.

            • whsqwghlm
              Member
              • Jun 2009
              • 14

              #7
              After a few days of struggling with quality/homopolymer/adaptor trimming of my reads, and reading about 3' RNA edits and so forth, I've decided to try something close to staylor's original suggestion (similar to the algorithm used by miRanalyzer):

              bowtie -n 0 -l 15 --best

              This should give the best match(es) that have an exact 15 bp seed at the 5' end of the read. If anyone is interested in a direct comparison between this and the original (-v 0) parameters, or has another view on this, please let me know.

              • bioinfosm
                Senior Member
                • Jan 2008
                • 483

                #8
                So what is your post-processing? What is the reference sequence? And how do you summarize the data?
                --
                bioinfosm

                • whsqwghlm
                  Member
                  • Jun 2009
                  • 14

                  #9
                  In terms of post-processing, we're loading the alignments into an Ensembl database so that we can screen for known genes and repeats. We then predict novel small RNAs and estimate transcript counts for all loci based on read coverage. It's designed to be a generic pipeline for metazoa. As everything is in an Ensembl database, the results can be browsed and ad-hoc reports generated.

                  • staylor
                    Member
                    • Feb 2009
                    • 17

                    #10
                    Originally posted by whsqwghlm:
                    "In terms of post-processing, we're loading the alignments into an Ensembl database so that we can screen for known genes and repeats. We then predict novel small RNAs and estimate transcript counts for all loci based on read coverage. It's designed to be a generic pipeline for metazoa. As everything is in an Ensembl database, the results can be browsed and ad-hoc reports generated."

                    For some reason I didn't get emailed about the activity on my post, so I thought no one was interested! Looks like people have been thinking about it...

                    whsqwghlm - how did you get on with the mapping? Did the parameters work?

                    • whsqwghlm
                      Member
                      • Jun 2009
                      • 14

                      #11
                      Yes! We ended up using:
                      bowtie -n 0 -l 15 -e 99999 -k 200 --best --chunkmbs 128

                      We then post-processed the alignments to keep the one with the longest exact match at the 5' end (we could not find a way to get bowtie to do this natively). The preparation of our library helped: it had been poly-A filled, and the 3' primer was terminated with a poly-T chain. We did not bother to poly-A trim the reads (i.e. remove the primer), as we did not want to lose any 'real' As at the end of the sequences.
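A post-processing step of this kind might look something like the sketch below. It is not the poster's code. It assumes Bowtie 1's default tab-separated output (read name, strand, reference, 0-based offset, sequence, qualities, count of other alignments, mismatch descriptors) and assumes that the offsets in the mismatch descriptors are counted from the 5' end of the read; both should be checked against the manual for the bowtie version in use. All names are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple


class Alignment(NamedTuple):
    read: str
    strand: str
    ref: str
    offset: int
    seq: str
    mismatches: List[int]  # assumed: 0-based offsets from the read's 5' end


def parse_line(line: str) -> Alignment:
    """Parse one line of (assumed) Bowtie 1 default output."""
    cols = line.rstrip("\n").split("\t")
    desc = cols[7] if len(cols) > 7 else ""
    # Descriptors look like "16:A>C,18:G>T"; keep only the offsets.
    mm = [int(d.split(":")[0]) for d in desc.split(",") if d]
    return Alignment(cols[0], cols[1], cols[2], int(cols[3]), cols[4], mm)


def exact_prefix_len(aln: Alignment) -> int:
    """Length of the mismatch-free stretch starting at the read's 5' end."""
    return min(aln.mismatches) if aln.mismatches else len(aln.seq)


def best_per_read(lines: Iterable[str]) -> Dict[str, List[Alignment]]:
    """For each read, keep every alignment tied for the longest 5' exact match."""
    best: Dict[str, List[Alignment]] = {}
    for line in lines:
        aln = parse_line(line)
        kept = best.setdefault(aln.read, [])
        if not kept or exact_prefix_len(aln) > exact_prefix_len(kept[0]):
            best[aln.read] = [aln]
        elif exact_prefix_len(aln) == exact_prefix_len(kept[0]):
            kept.append(aln)
    return best
```

Ties are returned as a list rather than resolved, so how to choose among equally good loci is left to the caller.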

                      I'm still generating comparisons with other bowtie configs, and I also need to test the pipeline against a GEO data set with 'normal' primers.

                      • staylor
                        Member
                        • Feb 2009
                        • 17

                        #12
                        Ah excellent. I will try that. Thanks for the tip!

                        • bioinfosm
                          Senior Member
                          • Jan 2008
                          • 483

                          #13
                          Originally posted by whsqwghlm:
                          "In terms of post-processing, we're loading the alignments into an Ensembl database so that we can screen for known genes and repeats. We then predict novel small RNAs and estimate transcript counts for all loci based on read coverage. It's designed to be a generic pipeline for metazoa. As everything is in an Ensembl database, the results can be browsed and ad-hoc reports generated."

                          Are you using miRBase for mapping, or the whole human genome?
                          --
                          bioinfosm

                          • whsqwghlm
                            Member
                            • Jun 2009
                            • 14

                            #14
                            We're aligning against the whole genome. Reads that do not align to the genome are then aligned to miRBase (all species), just in case the assembly is incomplete.
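The two-pass scheme described here could be driven along the lines of the sketch below, which only builds the argument lists. It relies on Bowtie 1's `--un` option to write unaligned reads to a file. The index names, file names, and the reuse of the same alignment options for the miRBase pass are assumptions for illustration; the thread does not say which options were used for the second pass.

```python
from typing import List


def bowtie_cmd(index: str, reads: str, out: str, unaligned: str = "") -> List[str]:
    """Build a Bowtie 1 argument list using the options quoted in this thread."""
    cmd = ["bowtie", "-n", "0", "-l", "15", "-e", "99999",
           "-k", "200", "--best", "--chunkmbs", "128"]
    if unaligned:
        cmd += ["--un", unaligned]  # reads with no reported alignment go here
    return cmd + [index, reads, out]


def two_pass(genome_index: str, mirbase_index: str, reads: str) -> List[List[str]]:
    """First pass against the genome, then leftovers against miRBase."""
    leftovers = "unaligned.fq"  # hypothetical file name
    return [
        bowtie_cmd(genome_index, reads, "genome.bowtie", unaligned=leftovers),
        bowtie_cmd(mirbase_index, leftovers, "mirbase.bowtie"),
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` in order, provided bowtie is on the PATH and both indexes have been built.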

                            • staylor
                              Member
                              • Feb 2009
                              • 17

                              #15
                              So are you filtering on the one with the smallest NM value with the longest read?

                              If you get multiple matches and they all score equally do you pick one at random?
