  • BioTalk
    Member
    • Feb 2010
    • 43

    #16
    Originally posted by rglover
    What are the names of the database files that formatdb is creating? Could you list them here? You could also try putting "-o T" on the end of your formatdb. Other than that I'm not really sure!
    The files created by formatdb are:
    .nhr, .nin, .nsq, .nsd, .nsi


    • maubp
      Peter (Biopython etc)
      • Jul 2009
      • 1544

      #17
      Originally posted by BioTalk
      The files created by formatdb are:
      .nhr, .nin, .nsq, .nsd, .nsi
      What are the full names? This matters for what you tell BLAST: if the files are named example.nin etc., the database name is example, but if they are named example.fas.nin etc., the database name is example.fas instead.
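To make the naming rule concrete, here is a minimal Python sketch. It assumes the database name is simply the file name with the last extension removed, as described in the post above; blast_db_name is a hypothetical helper name, and the extension set is the one listed in this thread.

```python
from pathlib import Path

# Extensions from the list posted in this thread (nucleotide database files).
BLAST_DB_EXTENSIONS = {".nhr", ".nin", ".nsq", ".nsd", ".nsi"}


def blast_db_name(filename: str) -> str:
    """Strip only the final formatdb extension to get the database name."""
    path = Path(filename)
    if path.suffix not in BLAST_DB_EXTENSIONS:
        raise ValueError(f"{filename!r} is not one of the listed database files")
    return str(path.with_suffix(""))


print(blast_db_name("example.nin"))      # example
print(blast_db_name("example.fas.nin"))  # example.fas
```

Whatever this returns for your own files is what would go after the database option on the command line.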


      • rglover
        rg
        • Dec 2008
        • 51

        #18
        Hiya. Looking at the link you posted for where you downloaded your executables, you're definitely using BLAST+, so blastall (probably) won't work well.
        If you try formatting your database with:

        makeblastdb -in yourfasta.fasta -dbtype nucl

        that should format your database for use with the BLAST+ executables. I've managed to use formatdb/makeblastdb interchangeably between BLAST and BLAST+ in the past, but that was on Windows, so it might error on Linux.
        If you then start your search with the "blastn -db <yourfasta.fasta> -word_size" etc. convention, it might work.


        • BioTalk
          Member
          • Feb 2010
          • 43

          #19
          Thank you all very much! I am now able to generate the following type of BLAST output file, which is huge because the same information is repeated for every query. Does anyone know how to get the output in another format?

          BLASTN 2.2.21 [Jun-14-2009]


          Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
          Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
          "Gapped BLAST and PSI-BLAST: a new generation of protein database search
          programs", Nucleic Acids Res. 25:3389-3402.

          Query= 1-72342
          (20 letters)

          Database:/home//Desktop/mma.faa
          15,632 sequences; 339,921 total letters

          Searching..................................................done

          ***** No hits found ******


          BLASTN 2.2.21 [Jun-14-2009]


          Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
          Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
          "Gapped BLAST and PSI-BLAST: a new generation of protein database search
          programs", Nucleic Acids Res. 25:3389-3402.

          Query= 2-55421
          (19 letters)

          Database: /home//Desktop/mma.fa
          15,632 sequences; 339,921 total letters

          Searching..................................................done

          ***** No hits found ******


          BLASTN 2.2.21 [Jun-14-2009]


          Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
          Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
          "Gapped BLAST and PSI-BLAST: a new generation of protein database search
          programs", Nucleic Acids Res. 25:3389-3402.

          Query= 3-46574
          (21 letters)

          Database: /home/Desktop/mma.fa
          15,632 sequences; 339,921 total letters

          Searching..................................................done

          ***** No hits found ******


          BLASTN 2.2.21 [Jun-14-2009]


          Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
          Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
          "Gapped BLAST and PSI-BLAST: a new generation of protein database search
          programs", Nucleic Acids Res. 25:3389-3402.

          Query= 4-38013
          (17 letters)

          Database: /home//Desktop/mma.fa
          15,632 sequences; 339,921 total letters

          Searching..................................................done

          ***** No hits found ******


          • rglover
            rg
            • Dec 2008
            • 51

            #20
            If you have a look in the BLAST+ manual, there are some formatting guidelines for tabulating the output. Glad you got it working!
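For anyone following along: one of the output styles in BLAST+ is tabular output, selected with -outfmt 6. It prints one tab-separated line per hit and, as far as I know, nothing at all for queries without hits. Below is a minimal Python sketch for reading it, assuming the 12 default columns. The sample row is made up for illustration, and read_tabular is a hypothetical helper name.

```python
import csv
import io

# Default column order for BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def read_tabular(handle):
    """Yield one dict per hit line, skipping blank and '#' comment lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        yield dict(zip(COLUMNS, row))


# Made-up example row; a real file would be opened with open("hits.tsv").
sample = "query1\tsubject1\t100.00\t21\t0\t0\t1\t21\t1\t21\t1e-06\t39.2\n"
hits = list(read_tabular(io.StringIO(sample)))
print(hits[0]["qseqid"], hits[0]["sseqid"], hits[0]["evalue"])
```

All values stay as strings here; convert evalue or bitscore with float() if you want to filter on them.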


            • BioTalk
              Member
              • Feb 2010
              • 43

              #21
              Thank you! I will certainly have a look at the BLAST+ manual.


              • westerman
                Rick Westerman
                • Jun 2008
                • 1104

                #22
                Also, it looks like your query sequences are very short (20 bp). You will probably have to take this into consideration via non-default command-line parameters.


                • BioTalk
                  Member
                  • Feb 2010
                  • 43

                  #23
                  Originally posted by westerman
                  Also, it looks like your query sequences are very short (20 bp). You will probably have to take this into consideration via non-default command-line parameters.
                  Yes, my query sequences are shorter than 20 bp. What non-default options do I need to use?


                  • westerman
                    Rick Westerman
                    • Jun 2008
                    • 1104

                    #24
                    For short sequences with BLAST+, the commands 'blastn-short' or 'megablast' will be preferable to the regular commands. If those commands are not directly available, run 'blastn' with the command-line option '-task blastn-short' or '-task megablast'.

                    There may be other options that I am unaware of, since I do not do many short-sequence alignments. The most important point is simply to be aware that BLAST is generally used to align longer sequences and that at 20 bp you are getting close to the word sizes that BLAST uses. BLAST, like many tools, is not something to use without some thought.


                    • robs
                      Senior Member
                      • May 2010
                      • 116

                      #25
                      If you expect errors in your sequences or want to look for more distant relationships, you might want to lower the seed (word) length (default of 11 for blastn; try 6-8; parameter -W in legacy blastall, -word_size in BLAST+). BLAST(+) also filters low-complexity regions, and your short sequences might be filtered out before any alignment. You can turn off the filtering and see if it makes any difference (-F F in legacy blastall, -dust no in BLAST+ blastn).
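Putting the last few suggestions together for BLAST+, here is a small Python sketch that only builds the blastn command line. It uses the BLAST+ option names -task, -word_size and -dust. The file names, the word size of 7 (picked from the 6-8 range suggested above) and the helper name short_query_blastn_command are assumptions for illustration, not recommended settings.

```python
import shlex


def short_query_blastn_command(query, db, out, word_size=7, dust="no"):
    """Build a BLAST+ blastn argument list for short nucleotide queries."""
    return [
        "blastn",
        "-task", "blastn-short",       # task aimed at short queries
        "-query", query,
        "-db", db,
        "-out", out,
        "-word_size", str(word_size),  # shorter seed than the blastn default
        "-dust", dust,                 # "no" turns off low-complexity filtering
    ]


# Hypothetical file names; replace with your own.
cmd = short_query_blastn_command("queries.fa", "mydb", "hits.txt")
print(shlex.join(cmd))
```

To actually run the search, the list could be passed to subprocess.run(cmd, check=True), provided blastn is on your PATH.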


                      • BioTalk
                        Member
                        • Feb 2010
                        • 43

                        #26
                        Does anyone know how to get a BLAST output file with only the details of the aligned regions?

                        I am trying to compare two files of FASTA sequences, and I am getting a huge file containing both the matched and the unmatched queries.


                        • rglover
                          rg
                          • Dec 2008
                          • 51

                          #27
                          If you add these options you'll only get the alignments:
                          -num_descriptions 0 -num_alignments <however-many-you-want>

                          You'll still get an output for the sequences where no matches have been found though. You could also try using BioPerl to process the Blast results.


                          • BioTalk
                            Member
                            • Feb 2010
                            • 43

                            #28
                            Originally posted by rglover
                            If you add these options you'll only get the alignments:
                            -num_descriptions 0 -num_alignments <however-many-you-want>

                            You'll still get an output for the sequences where no matches have been found though. You could also try using BioPerl to process the Blast results.
                            I tried -num_descriptions 0 -num_alignments 1 -outfmt 0, but I am still getting all the matched and unmatched queries in the same file.


                            • rglover
                              rg
                              • Dec 2008
                              • 51

                              #29
                              Just to clarify - you're getting the one alignment that you want, but you're also getting the "No hits found" ones too?
                              If that's the case, you could use BioPerl to go through the file and then choose to only print out the ones that have hits to a new file.
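The same idea can be sketched without BioPerl in a few lines of plain Python. This is only a rough sketch: it assumes the BLAST+ pairwise layout in which each query's section starts with a line beginning "Query=" and a query without matches contains the text "No hits found". The function name sections_with_hits and the tiny sample report are made up for illustration.

```python
def sections_with_hits(report_text):
    """Return the 'Query=' sections of a pairwise report that contain hits."""
    sections, current = [], None
    for line in report_text.splitlines():
        if line.lstrip().startswith("Query="):
            # A new query starts: store the previous section, if any.
            if current is not None:
                sections.append(current)
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        sections.append(current)
    return [
        "\n".join(section).strip()
        for section in sections
        if not any("No hits found" in line for line in section)
    ]


# Tiny made-up report in the assumed layout.
SAMPLE = """\
BLASTN 2.2.23+

Query= 1-72342
Length=20

***** No hits found *****

Query= 3-46574
Length=21

>lcl|zma-miR159k MIMAT0013980 Zea mays miR159k
Query  1  TTTGGATTGAAGGGAGCTCTG  21
Sbjct  1  TTTGGATTGAAGGGAGCTCTG  21
"""

kept = sections_with_hits(SAMPLE)
print(len(kept))                # 1
print(kept[0].splitlines()[0])  # Query= 3-46574
```

For a real file, read the whole report into a string, call the function, and write "\n\n".join(kept) to a new file. Anything before the first "Query=" line (the program header and database summary) is dropped.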


                              • BioTalk
                                Member
                                • Feb 2010
                                • 43

                                #30
                                Originally posted by rglover
                                Just to clarify - you're getting the one alignment that you want, but you're also getting the "No hits found" ones too?
                                If that's the case, you could use BioPerl to go through the file and then choose to only print out the ones that have hits to a new file.
                                Yes, that's correct. But the output file has an irregular layout, which makes it difficult for me to extract only the aligned regions. Below is an example of the file.

                                Please let me know if anyone knows how to deal with this. Thank you!

                                BLASTN 2.2.23+


                                Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
                                Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
                                Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
                                protein database search programs", Nucleic Acids Res. 25:3389-3402.



                                Database: Desktop/RNA.fa
                                15,632 sequences; 339,921 total letters



                                Query= 1-72342
                                Length=20


                                ***** No hits found *****



                                Lambda K H
                                0.634 0.408 0.912

                                Gapped
                                Lambda K H
                                0.625 0.410 0.780

                                Effective search space used: 956935


                                Query= 2-55421
                                Length=19


                                ***** No hits found *****



                                Lambda K H
                                0.634 0.408 0.912

                                Gapped
                                Lambda K H
                                0.625 0.410 0.780

                                Effective search space used: 1066359


                                Query= 3-46574
                                Length=21
                                Score E
                                Sequences producing significant alignments: (Bits) Value



                                >lcl|zma-miR159k MIMAT0013980 Zea mays miR159k
                                Length=21

                                Score = 39.2 bits (42), Expect = 1e-06
                                Identities = 21/21 (100%), Gaps = 0/21 (0%)
                                Strand=Plus/Plus

                                Query 1 TTTGGATTGAAGGGAGCTCTG 21
                                |||||||||||||||||||||
                                Sbjct 1 TTTGGATTGAAGGGAGCTCTG 21


                                >lcl|
                                MIMAT0013979 Zea mays miR159j
                                Length=21

                                Score = 39.2 bits (42), Expect = 1e-06
                                Identities = 21/21 (100%), Gaps = 0/21 (0%)
                                Strand=Plus/Plus

                                Query 1 TTTGGATTGAAGGGAGCTCTG 21
                                |||||||||||||||||||||
                                Sbjct 1 TTTGGATTGAAGGGAGCTCTG 21


                                >lcl|zma-miR159f MIMAT0013975 Zea mays miR159f
                                Length=21

                                Score = 39.2 bits (42), Expect = 1e-06
                                Identities = 21/21 (100%), Gaps = 0/21 (0%)
                                Strand=Plus/Plus

                                Query 1 TTTGGATTGAAGGGAGCTCTG 21
                                |||||||||||||||||||||
                                Sbjct 1 TTTGGATTGAAGGGAGCTCTG 21


                                >lcl|tae-miR159b MIMAT0005344 Triticum aestivum miR159b
                                Length=21

                                Score = 39.2 bits (42), Expect = 1e-06
                                Identities = 21/21 (100%), Gaps = 0/21 (0%)
                                Strand=Plus/Plus

                                Query 1 TTTGGATTGAAGGGAGCTCTG 21
                                |||||||||||||||||||||
                                Sbjct 1 TTTGGATTGAAGGGAGCTCTG 21
