  • Ben Langmead
    Senior Member
    • Sep 2008
    • 200

    Originally posted by apostrophe View Post
    ...does Bowtie support FASTA nucleic acid codes that code for two bases, such as Y = T or C for the genome? Thanks in advance.
    Bowtie will index and align against references containing non-A/C/G/T characters, but alignments overlapping non-A/C/G/T characters in the reference are invalid and won't be reported.

    Out of curiosity, what's the behavior you would like? E.g. if a C in a read were to align against a Y in the genome, would you like that to be considered a match, incurring no penalty against the alignment?

    Thanks,
    Ben

    Comment

    • apostrophe
      Junior Member
      • Jul 2009
      • 2

      I was hoping to use Bowtie in order to align a large amount of reads against a genome that has SNPs in the stated format above. If not, I suppose I'll have to figure out some other method of alignment.

      Thanks for your quick reply!

      Comment

      • seq_GA
        Senior Member
        • Feb 2009
        • 124

        Hi Ben,

        Thanks for your support.

        I am trying to compare the Eland and Bowtie results. Many reads are not mapped by Bowtie, whereas Eland reports them as unique tags without any mismatch. An example follows:

        Code:
        >read1 AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG  U0  1   0  0  chr8.fa 37178235  R DD
        The Bowtie result for the same read is as follows:
        Code:
        ./bowtie -a -m 10 -n 2 --strata --best -p 15 ../Genome/hg18/hg18 -c AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG
        No results
        I built the reference genome index with default parameters.
        Code:
        ./bowtie-build <reference_in> <index_basename>
        Why is Bowtie not reporting the mapping?
        Please let me know whether any of the parameters need to be changed.

        Also, how does Bowtie handle "N"s in the query reads?

        Thanks.

        Comment

        • Ben Langmead
          Senior Member
          • Sep 2008
          • 200

          Hi seq_GA,

          Originally posted by seq_GA View Post
          I am trying to compare the Eland and Bowtie results. Many reads are not mapped by Bowtie, whereas Eland reports them as unique tags without any mismatch. An example follows:

          Code:
          >read1 AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG  U0  1   0  0  chr8.fa 37178235  R DD
          The Bowtie result for the same read is as follows:
          Code:
          ./bowtie -a -m 10 -n 2 --strata --best -p 15 ../Genome/hg18/hg18 -c AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG
          No results
          Can you confirm that it ought to align by looking at the reference? I don't have the hg18 index lying around, but in the h_sapiens_asm index, your example aligns uniquely with 3 mismatches:

          Code:
          ./bowtie -a -v 3 /fs/szasmg/langmead/ebwts/h_sapiens_asm -c AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG
          0	-	gi|51511724|ref|NC_000008.9|NC_000008	37178227	CAAAAAAAAAAAATTGTGCTGAACATAAACAGACT	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	0	31:G>A,33:C>A,34:T>C
          Reported 1 alignments to 1 output stream(s)
          Also, if you want the output to look like Eland, you should use -v 2 instead of -n 2. -n 2 activates a Maq-like alignment policy.
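To see where the three reported mismatches fall relative to the seed, the output line above can be picked apart with a short script. This is a minimal sketch, assuming Bowtie's default tab-separated output (read name, strand, reference name, 0-based offset, sequence, qualities, count of other alignments, mismatch descriptors) and assuming that descriptor offsets are counted from the 5' end of the read. The function names are hypothetical.

```python
def parse_mismatches(field):
    """Parse a descriptor string such as '31:G>A,33:C>A' into tuples.

    Each tuple is (offset, reference_base, read_base). The offset is
    assumed to be 0-based from the 5' end of the read.
    """
    mismatches = []
    for item in field.split(","):
        if not item:
            continue  # an empty field means no mismatches
        offset, change = item.split(":")
        ref_base, read_base = change.split(">")
        mismatches.append((int(offset), ref_base, read_base))
    return mismatches


def parse_alignment(line):
    """Split one default-format Bowtie output line into a dict."""
    fields = line.rstrip("\n").split("\t")
    return {
        "read": fields[0],
        "strand": fields[1],
        "reference": fields[2],
        "offset": int(fields[3]),
        "sequence": fields[4],
        "qualities": fields[5],
        "other_alignments": int(fields[6]),
        # The last column can be absent when there are no mismatches.
        "mismatches": parse_mismatches(fields[7]) if len(fields) > 7 else [],
    }


def mismatches_in_seed(mismatches, seed_length=28):
    """Count mismatches whose offset lies inside the seed."""
    return sum(1 for offset, _, _ in mismatches if offset < seed_length)


if __name__ == "__main__":
    line = ("0\t-\tgi|51511724|ref|NC_000008.9|NC_000008\t37178227\t"
            "CAAAAAAAAAAAATTGTGCTGAACATAAACAGACT\t"
            "IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\t0\t31:G>A,33:C>A,34:T>C")
    aln = parse_alignment(line)
    print(aln["mismatches"])
    print(mismatches_in_seed(aln["mismatches"]))
```

Under those assumptions, all three mismatches sit at offsets 31, 33 and 34, outside a 28-base seed.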

          Also, how does Bowtie handle "N"s in the query reads?
          An N in a read always counts as a mismatch in the alignment.

          Thanks,
          Ben

          Comment

          • plattsa
            Member
            • Mar 2009
            • 17

            >read1 AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG U0 1 0 0 chr8.fa 37178235 R DD
            I'm not sure, but didn't the earlier Eland only consider mismatches over the first 32 bases? If so, mismatches in the final bases of the read would still allow a U0.

            Comment

            • seq_GA
              Senior Member
              • Feb 2009
              • 124

              Hi Ben,

              Thanks for your prompt response.

              With -v 3, Bowtie also reports one mapping location for me.

              I want to use a seed length of 28 (the default) with 2 mismatches, so I used -n 2, since I am comparing eland_28 and Bowtie results.

              But why is Bowtie still not reporting it?
              Last edited by seq_GA; 07-14-2009, 11:47 PM.

              Comment

              • seq_GA
                Senior Member
                • Feb 2009
                • 124

                Hi Ben,
                I did a quick comparison of -v 2 and -n 2.

                The reads are 35 bp long, and I used -3 6 to trim the 3' ends, so that my mappable reads would be 28 bp in size and I could compare against the eland_28 results.

                Code:
                 bowtie -a -m 10 -v 2 --strata --best --solexa-quals  -p 15 -3 ../../Genome/hg18/hg18 ../s_1_sequence.txt out_aln.txt

                When I look at the uniquely mapped tags, there are more with -v 2 than with -n 2.

                Can you please explain why there are more mappings with -v 2?

                Thanks.

                Comment

                • Ben Langmead
                  Senior Member
                  • Sep 2008
                  • 200

                  Originally posted by seq_GA View Post
                  With -v 3, Bowtie also reports one mapping location for me.

                  I want to use a seed length of 28 (the default) with 2 mismatches, so I used -n 2, since I am comparing eland_28 and Bowtie results.

                  But why is Bowtie still not reporting it?
                  Probably because the -e limit is disqualifying that alignment. If you'd like Bowtie to report alignments like that, try setting a higher -e than the default (70). -e is described in the Maq-like Policy section of the manual.
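A rough illustration of how the -e limit could rule out this alignment: the sketch below sums the quality values at the mismatched positions and compares the total with the limit. It assumes the behaviour described in the Bowtie manual for Maq-like mode (qualities rounded to the nearest 10 and capped at 30), which may differ between versions, and it assumes Phred+33 qualities listed in 5'-to-3' read order. The function names are hypothetical. Reads given with -c appear in the output above with quality 'I' at every position, which is Phred 40 under that encoding.

```python
def phred33(qual_char):
    """Convert one Phred+33 quality character to its integer score."""
    return ord(qual_char) - 33


def maq_round(quality):
    """Round to the nearest 10 and cap at 30 (assumed Maq-like rule)."""
    return min(30, ((quality + 5) // 10) * 10)


def mismatch_quality_sum(qualities, mismatch_offsets, rounding=True):
    """Sum the qualities at the mismatched read positions."""
    total = 0
    for offset in mismatch_offsets:
        quality = phred33(qualities[offset])
        total += maq_round(quality) if rounding else quality
    return total


def passes_e_limit(qualities, mismatch_offsets, e_limit=70):
    """True if the summed mismatch quality does not exceed the limit."""
    return mismatch_quality_sum(qualities, mismatch_offsets) <= e_limit


if __name__ == "__main__":
    qualities = "I" * 35          # as reported for the -c read above
    offsets = [31, 33, 34]        # mismatch positions from the -v 3 run
    print(mismatch_quality_sum(qualities, offsets))
    print(passes_e_limit(qualities, offsets, e_limit=70))
    print(passes_e_limit(qualities, offsets, e_limit=90))
```

Under these assumptions each mismatch contributes 30, so three mismatches total 90, which exceeds the default of 70 but would pass with -e 90 or higher.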

                  Ben

                  Comment

                  • Ben Langmead
                    Senior Member
                    • Sep 2008
                    • 200

                    Originally posted by seq_GA View Post
                    Can you please explain why there are more mappings with -v 2?
                    Probably the -e limit again. See my previous post.

                    Ben

                    Comment

                    • frozenlyse
                      Senior Member
                      • Sep 2008
                      • 135

                      Originally posted by Ben Langmead View Post
                      Bowtie will index and align against references containing non-A/C/G/T characters, but alignments overlapping non-A/C/G/T characters in the reference are invalid and won't be reported.

                      Out of curiosity, what's the behavior you would like? E.g. if a C in a read were to align against a Y in the genome, would you like that to be considered a match, incurring no penalty against the alignment?

                      Thanks,
                      Ben
                      The reason our group would like this functionality is that we are investigating DNA methylation analysis via Illumina bisulfite sequencing. In this case, C nucleotides in the normal genome will be either C or T nucleotides in the bisulfite-converted genome.

                      So our preferred behavior would be not to penalise either a C or a T in the read if the reference contains a Y at that position.
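The matching rule requested here can be sketched outside of Bowtie as follows. This is only an illustration of the desired scoring, not something Bowtie does: it uses the standard IUPAC nucleotide codes for the reference, and the function names are hypothetical. An N in the read is treated as a mismatch everywhere, in line with Ben's description of how Bowtie handles read Ns.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def is_match(read_base, ref_code):
    """True if the read base is one of the bases the reference code allows.

    A read base outside A/C/G/T (for example N) never matches.
    """
    return read_base.upper() in IUPAC.get(ref_code.upper(), "")


def count_mismatches(read, reference):
    """Count positions where the read is incompatible with the reference."""
    if len(read) != len(reference):
        raise ValueError("read and reference window must be the same length")
    return sum(not is_match(r, c) for r, c in zip(read, reference))


if __name__ == "__main__":
    # Both C and T in the read are accepted against Y in the reference.
    print(count_mismatches("ACTG", "AYYG"))
    # G against Y is still a mismatch.
    print(count_mismatches("AGTG", "AYYG"))
```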

                      Anyway I find bowtie very useful, thanks for all your work!

                      Comment

                      • Ben Langmead
                        Senior Member
                        • Sep 2008
                        • 200

                        Hi Chuck,

                        Originally posted by chuck View Post
                        I tried bowtie remade with extraflags but it just did the same thing. Would there be a log file somewhere or something in the map file? I can't seem to find any additional output.
                        If you have a moment, could you try your run again using the latest version of Bowtie (0.10.1, released on Monday)?

                        Thanks,
                        Ben

                        Comment

                        • seq_GA
                          Senior Member
                          • Feb 2009
                          • 124

                          Originally posted by Ben Langmead View Post
                          Probably the -e limit again. See my previous post.

                          Ben
                          Hi Ben,

                          I am trying to get as many mappings as Eland reports by adjusting Bowtie's parameters.
                          As you suggested earlier, I tried raising -e as high as 2000 to bring the number of mappings up to Eland's, but Bowtie still misses a lot of mappings compared to Eland.

                          The -v option gives results comparable to Eland (I tested with read length 28, which is also the seed length), but with the increasing number of Ns at the 3' end, it would be good to use the -n option and allow any number of mismatches beyond the seed length.

                          So, do you have any suggestions for increasing Bowtie's mapping rate with the -n option?

                          Thanks.

                          Comment

                          • Ben Langmead
                            Senior Member
                            • Sep 2008
                            • 200

                            Originally posted by seq_GA View Post
                            So, do you have any suggestions for increasing Bowtie's mapping rate with the -n option?
                            The main options used to adjust the sensitivity of mapping in Maq-like alignment mode are -n, -l, -e, --maxbts/-y. If there is a particular alignment you think Bowtie should be finding but isn't, please let me know and I can take a look.
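For anyone scripting a parameter sweep over these options, a command line can be assembled along the following lines. This is a sketch only: the helper name and its keyword arguments are hypothetical, the defaults (-n 2, -l 28, -e 70) are the values mentioned in this thread, and the index and reads paths in the usage example are placeholders. The command is built but not executed.

```python
import shlex


def bowtie_maq_args(index, reads, n=2, seed_len=28, max_qual_sum=70,
                    tryhard=False, maxbts=None):
    """Build an argument list for a Maq-like (-n mode) Bowtie run."""
    args = ["bowtie",
            "-n", str(n),             # mismatches allowed in the seed
            "-l", str(seed_len),      # seed length
            "-e", str(max_qual_sum)]  # limit on summed mismatch qualities
    if tryhard:
        args.append("-y")             # try harder to find alignments
    if maxbts is not None:
        args += ["--maxbts", str(maxbts)]  # backtracking limit
    args += [index, reads]
    return args


if __name__ == "__main__":
    # Placeholder paths; substitute a real index basename and reads file.
    cmd = bowtie_maq_args("path/to/index", "reads.txt",
                          max_qual_sum=200, tryhard=True)
    print(shlex.join(cmd))
```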

                            Thanks,
                            Ben

                            Comment

                            • chuck
                              Member
                              • Apr 2009
                              • 13

                              Hi Ben,

                              I've been teaching and not working on the data lately. I will give it a try soon.

                              I have a question for you about assembly quality evaluation, in two contexts.

                              1) to simply evaluate the quality of the assembly of the short reads against the reference sequences, beyond simple coverage
                              2) when there are actual differences between the sequenced genome and the reference genome, in finding indels and whatnot

                              I am looking at AMOS, which seems to be one of the few that provide some kind of quality score for the assembly. Are you aware of others?

                              I am trying to quickly narrow my analysis down to those de novo contigs with good assembly scores. I proposed a simple metric in a manuscript and the reviewer suggested I use other 'standard' measures but gave no pointers as to which ones I should be using. Things are changing so fast it is hard to keep track of the 'standard'...

                              Thanks,
                              Chuck

                              Comment

                              • chuck
                                Member
                                • Apr 2009
                                • 13

                                Ben,

                                I used the latest version 0.10.1 and it still hangs. It seems to complete the job (or almost, I haven't verified that fact yet) and stops writing to the output file but then it never closes.

                                Do you want me to run the debug version or try the extra flags again?

                                Chuck

                                Comment
