  • Hi,

    I just updated bowtie from version 0.11.3 to 0.12.2. With version 0.11.3 I was able to run the command "bowtie -m 25 -a -n 15 --un <file> -p 4 <ebwt> <infile> <outfile>". When I run this command in version 0.12.2, I get the error "-n/--seedmms arg must be at least 0 and at most 3". Am I missing something in the change log about this parameter? Is the behavior of -n in version 0.11.3 accurate?

    Thank you.

    EDIT: I just realized that while version 0.11.3 will let me give -n greater than 3, it is still capped at -n 3. Is it possible to align with more than 3 mismatches? I am using bowtie to align 75bp reads to a genomic model (coding regions only) with the ultimate goal of calculating RPKM for each of the models. Is bowtie simply the wrong tool for this purpose?
    Last edited by bloomfi1; 02-15-2010, 05:34 PM.


    • -v <int>            report end-to-end hits w/ <=v mismatches; ignore qualities
      or
      -n/--seedmms <int>  max mismatches in seed (can be 0-3, default: -n 2)
      -e/--maqerr <int>   max sum of mismatch quals across alignment for -n (def: 70)
      -l/--seedlen <int>  seed length for -n (default: 28)

      In short: -v limits mismatches across the entire read (end to end), while -n limits mismatches only within the seed region, whose length you can set with -l.
      Xi Wang


      • Both -v and -n have a maximum size of 3. What is the reason for this restriction?


        • If you are using reads of length 75, do you need to change the seed length, or does bowtie figure that out?

          I can only align around 50% of my single-read Illumina data from this paper using bowtie's default settings: http://www.nature.com/nmeth/journal/...meth.1226.html

          Does anyone know which parameters to tweak to get more sequences aligned?


          • I guess that you should trim your data and try to align your sequences again. Also, I don't think that "bowtie figures it out", though I'm no expert.
            L. Collado Torres, Ph.D. student in Biostatistics.


            • Replying to bloomfi1:
              Hi,

              Yes, the problem was that versions < 0.12.2 were failing to check for a too-high input for -n and -v. The manual and the usage message both said max=3, but bowtie erroneously didn't enforce it.

              Note that the -n option only constrains the number of mismatches in the seed, not in the entire alignment. The key is to set -n, -l and -e to reasonable numbers given your data. Since your reads are 75bp, I would suggest trying a few different settings, perhaps starting with -l 28 (the default), -n 2 and -e 180, and then adjusting all three until you're getting your desired mix of speed and sensitivity.

              Thanks,
              Ben
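
To make the interaction between -n, -l and -e concrete, here is a minimal Python sketch of the policy as described in this thread: at most -n mismatches within the first -l bases (the seed), and a sum of quality values over all mismatched positions of at most -e. It is only an illustration, not bowtie's actual code. The function name `passes_n_mode` and the toy read are made up for this example, and the sketch ignores gaps and any internal rounding of quality values.

```python
def passes_n_mode(read, ref, quals, n=2, l=28, e=70):
    """Check an ungapped alignment against a simplified -n/-l/-e policy.

    read, ref: equal-length strings; quals: Phred scores, one per read base.
    Defaults mirror the values quoted in the usage message (-n 2, -l 28, -e 70).
    """
    if not (len(read) == len(ref) == len(quals)):
        raise ValueError("read, ref and quals must have the same length")
    # Positions where the read disagrees with the reference.
    mismatches = [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]
    # -n only limits mismatches inside the seed (the first l bases).
    seed_mismatches = sum(1 for i in mismatches if i < l)
    # -e limits the summed quality over ALL mismatched positions.
    qual_sum = sum(quals[i] for i in mismatches)
    return seed_mismatches <= n and qual_sum <= e


# Toy 75 bp read: 2 mismatches in the seed, 3 more downstream, all at Q30.
ref = "A" * 75
bases = list(ref)
for pos in (0, 10, 40, 50, 60):
    bases[pos] = "C"
read = "".join(bases)
quals = [30] * 75

print(passes_n_mode(read, ref, quals, e=70))   # False: quality sum 150 exceeds 70
print(passes_n_mode(read, ref, quals, e=180))  # True: 150 fits under 180
```

In this toy case the read has five mismatches in total, more than 3, yet it passes once -e is raised. That is the reason a 75bp read usually needs a larger -e than the default of 70.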


              • I am fairly new to the field of next-gen sequencing and find Bowtie to be fairly user friendly, but I do have a question regarding its use. What is the difference in reporting between the default bowtie and the use of the -a, --strata, and --best flags? I understand that with the flags all of the alignments are reported in best-to-worst order, but what does the default bowtie report? For human sequencing data, is there a best set of parameters to use in order to gain enough sensitivity in coverage while keeping the file sizes manageable?
                Thanks in advance for any help.


                • Replying to Ben Langmead:
                  Hello and thank you for the advice. I am wondering about the maximum setting of 3, though. I have looked at the bowtie source a little bit and get the impression that this restriction is possibly an inherent restriction in the overall design of bowtie. Is this accurate? Otherwise, do you have any plans to increase this number in the future?

                  Thank you,
                  Sean


                  • Bowtie quality values error

                    Hello everyone,

                    We have been using MAQ for our Solexa assembly needs, but we're moving to another program for downstream analysis, and Bowtie seems much easier for upstream assembly. Unfortunately, this means learning another assembly program. I was trying to use Bowtie to assemble some data that we had previously assembled and analyzed with MAQ, and I'm running into an error I don't really understand. It states "Reads file contained a pattern with more than 1024 quality values." I'm using the -n alignment mode to assemble the paired alignments (with the input option --solexa-quals), but I have also tried the -v alignment mode (which I thought ignored quality values). We didn't have any issues assembling this data with MAQ, so I think I'm just missing something, being new to Bowtie. Any help anyone can provide would be greatly appreciated.

                    Thanks


                    • Can you please post the Bowtie version you're using, and the command you used to run it?

                      Thanks,
                      Ben


                      • Replying to RichEast:
                        I have seen this error when the number of bases does not equal the number of quality values in the fastq file. Assuming that isn't the problem, it most likely has something to do with bowtie expecting a range of quality values that is not present in your fastq file. Which version of the Illumina pipeline did this data come from?
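
One quick way to see which range of quality values a file actually contains is to scan the quality lines and report the smallest and largest ASCII codes. The sketch below is a rough heuristic, not part of bowtie: the names `quality_char_range` and `guess_offset` are hypothetical, it assumes plain four-line FASTQ records, and it relies on the common convention that +64 encodings never use characters below ASCII 59, while +33 encodings start at ASCII 33.

```python
import io


def quality_char_range(handle):
    """Return (min, max) ASCII codes over all quality lines of a FASTQ stream.

    Assumes plain four-line FASTQ records (no wrapped sequences).
    """
    lo, hi = None, None
    for line_no, line in enumerate(handle):
        if line_no % 4 != 3:  # the quality string is the 4th line of each record
            continue
        codes = [ord(c) for c in line.rstrip("\r\n")]
        if not codes:
            continue
        lo = min(codes) if lo is None else min(lo, min(codes))
        hi = max(codes) if hi is None else max(hi, max(codes))
    return lo, hi


def guess_offset(lo):
    """Heuristic only: codes below 59 cannot occur in a +64 encoding."""
    return 33 if lo < 59 else 64


# Two made-up records whose qualities look like a +64 encoding.
sample = io.StringIO("@r1\nACGT\n+\nhhhh\n@r2\nACGT\n+\nhhXB\n")
lo, hi = quality_char_range(sample)
print(lo, hi, guess_offset(lo))  # 66 104 64
```

A file of uniformly high-quality +33 reads could still be misclassified as +64, so treat the guess as a hint and confirm it against the pipeline version.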


                        • Replying to Ben Langmead:
                          We're using Bowtie version 0.12.3 with the following command line (run from a command prompt in Windows): "Bowtie -n 2 -q --solexa1.3-quals -S Pbindex -1QN_read1 -2QN_read2 QNalign.sam". The FASTQ files came from an Illumina GA II, pipeline 1.4. Thanks.

                          rich


                          • Replying to RichEast:
                            Could you please paste the first few lines of your bowtie input data here?
                            Xi Wang


                            • Replying to RichEast:
                              Hi Rich,

                              Another user just contacted me via email and described something similar. When I ran their reads through bowtie, I realized that part of the problem is that Bowtie is printing the wrong error message. In their case, the error message should have been something more like "Too many quality values for read..." because they had a fastq entry where the quality string was 2 characters longer than the sequence string. Do you notice any inconsistencies like that in your input?

                              I'll fix the error-message bug.

                              Thanks,
                              Ben


                              • Replying to Ben Langmead:
                                Ben,

                                That seems to be a likely problem. We took the first 20 or so paired reads and verified the sequence and quality value lengths, and that ran well, with the same command line. We'll go through the FASTQ files and try and find the quality string causing us problems. Thanks to everyone for the helpful suggestions.

                                rich
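
The search described here, going through a FASTQ file to find records whose quality string and sequence differ in length, can be sketched in a few lines of Python. This is a minimal example and makes some assumptions: records are plain four-line FASTQ (no wrapped sequences), and the function name `find_length_mismatches` and the sample reads are invented for illustration.

```python
import io


def find_length_mismatches(handle):
    """Yield (record_number, read_name, seq_len, qual_len) for each bad record.

    Assumes plain four-line FASTQ records (no wrapped sequences).
    """
    record = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\r\n")
        record += 1
        if len(seq) != len(qual):
            yield record, header.strip(), len(seq), len(qual)


sample = io.StringIO(
    "@r1\nACGT\n+\nhhhh\n"
    "@r2\nACGT\n+\nhhhhhh\n"  # quality string is 2 characters too long
)
for hit in find_length_mismatches(sample):
    print(hit)  # (2, '@r2', 4, 6)
```

For real data, pass an open file handle instead of the `io.StringIO` sample, and run the check on both files of a paired-end run. Stripping "\r\n" keeps the length comparison correct for files with Windows line endings.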
