  • No alias file for nr database?

    Hey all,

    I've been searching for anyone else with this problem, but I can't quite find the answer. I've installed BLAST+ and added a local copy of the nr database. This is my command:

    blastx -query ./2500_SFB_109258_length_12402_cov_5.509837.fasta -db /usr/local/Programs/ncbi-blast-2.2.28+/db/nr -out ./Scaffold_of_interest_Blastx.xml -evalue 1e-5 -outfmt 5

    But I keep getting this error:

    BLAST Database error: No alias or index file found for protein database [/usr/local/Programs/ncbi-blast-2.2.28+/db/nr] in search path [/usr/local/Programs/ncbi-blast-2.2.28+/db:]

    I've added the database folder path to the .ncbirc file, which I have in the home directory, and I know it works because I've added the refseq_protein and cdd_delta databases and they work just fine. Oddly enough, when I specify the path to the nr database in the command above, I get the same error. All nr database files are unzipped, and in the same folder as refseq_protein and cdd_delta databases. I'm stumped!

  • #2
    Do you have a file called nr.pal in /usr/local/Programs/ncbi-blast-2.2.28+/db/ ?
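The check suggested above can be sketched as a small shell function. This is only an illustration: `check_blast_db` is a hypothetical helper, not part of BLAST+, and it assumes a protein database, where the alias file ends in `.pal` and a single-volume index ends in `.pin` (nucleotide databases use `.nal` and `.nin` instead).

```shell
# check_blast_db is a hypothetical helper, not a BLAST+ command.
# It reports whether a protein database prefix has an alias file (.pal)
# or a single-volume index file (.pin) next to it.
check_blast_db() {
    prefix=$1
    if [ -f "$prefix.pal" ]; then
        echo "alias file found: $prefix.pal"
    elif [ -f "$prefix.pin" ]; then
        echo "index file found: $prefix.pin"
    else
        echo "no alias or index file for: $prefix"
        return 1
    fi
}

# Example call with the path from the first post:
# check_blast_db /usr/local/Programs/ncbi-blast-2.2.28+/db/nr
```

A multi-volume database such as nr has files like nr.00.pin, nr.01.pin and so on, but no plain nr.pin, so without nr.pal the helper reports nothing found, which matches the error in the first post.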


    • #3
      No I don't! I see I have a .pal file for the refseq database, so that must be the issue. Where should this file be coming from? One of the zipped folders on the ftp site?


      • #4
        It might not have downloaded completely. I would try using update_blastdb.pl to redownload, with the --decompress flag so you don't need to unzip them all manually. The alias file should be in one of the nr archives (the last one?).

        perl update_blastdb.pl nr --decompress
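Before re-downloading everything, it may be worth checking which archives are actually damaged. The sketch below assumes the downloaded `.tar.gz` files are still in the directory, that each has a matching `.tar.gz.md5` file in the usual `md5sum` format, and that GNU `md5sum` is installed. `verify_md5` is a hypothetical helper name.

```shell
# verify_md5 is a hypothetical helper: it checks every archive in a
# directory against the .md5 file sitting next to it.
# Assumes GNU md5sum and checksum files in "hash  filename" format.
verify_md5() {
    dir=$1
    (
        cd "$dir" || exit 1
        status=0
        for sum in *.tar.gz.md5; do
            [ -e "$sum" ] || continue      # no checksum files present
            if md5sum -c "$sum" >/dev/null 2>&1; then
                echo "OK      ${sum%.md5}"
            else
                echo "FAILED  ${sum%.md5}"
                status=1
            fi
        done
        exit "$status"
    )
}

# Example call: verify_md5 /usr/local/Programs/ncbi-blast-2.2.28+/db
```

Any archive reported as FAILED (or missing) is a candidate for re-downloading on its own rather than fetching the whole database again.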


        • #5
          Hi everyone!
          My problem is somewhat related to the issue described here. I am using the refseq_protein database, downloaded from the NCBI FTP site, which consists of 9 folders in total with files .pni .pnd .pog and so on. When I use the command
          $blastp -query ~/IIa.orfs.hmm.faa.db -db ~/refseq_protein -evalue 1e-5 -num_threads 60 -max_target_seqs 5 -outfmt 5 -out IIa.orfs.hmm.blast.xml

          I am getting an error:
          >>BLAST Database error: No alias or index file found for protein database [.../db/refseq_protein] in search path [.../software/multi-metagenome/]

          Any ideas what is happening?
          I even tried the makeblastdb command to format my databases, but that doesn't work either.
          $ makeblastdb -in ~/refseq_protein.*.* -dbtype prot -out ~/db/refseq_protein.db
          >>Error: Too many positional arguments (1), the offending value: ~/db/refseq_protein.01.phr

          Need help!!!!



          • #6
            1) I suggest using full path names instead of '~'.

            2) To help troubleshoot cases of 'no file found' it is handy for us to see an 'ls' of the directory in question, just to make sure you haven't made a mistake such as specifying the wrong directory.

            3) Pre-formatted refseq_protein should be 81 files -- no folders involved.


            • #7
              That is what I did: full paths were specified everywhere (I just removed them from the question). And yes, there are 81 files in the folder db in the home directory...


              • #8
                Well, once again not seeing an 'ls' of your directory and not seeing the actual program line you are using (since you edited it), it becomes hard to troubleshoot the problem. Almost all of the time when someone posts about a file not being found it is because they are not using the correct path for the file despite what they think. In other words the file simply isn't there. I've done it about a zillion times myself.

                Going by your statement "there are 81 files in the folder db in home directory", your original blastp line is incorrect, since you are *not* using the folder 'db in home directory'. Instead you are just using your home directory.

                Please check your paths. If nothing else do an:

                ls -l ~/refseq_protein* | head --lines=2

                And post that.


                • #9
                  No problem.
                  So, following up on the previous post, here is the command with the full paths included:
                  $ blastp -query /home/bwawrik/software/multi-metagenome/ -db /home/bwawrik/db/refseq_protein.* -evalue 1e-5 -num_threads 60 -max_target_seqs 5 -outfmt 5 -out IIa.orfs.hmm.blast.xml

                  And when I used the command:
                  $ls -l ~/refseq_protein* | head --lines=2

                  I got an error:
                  >>ls: cannot access home/bwawrik/db/refseq_protein.*: No such file or directory

                  I just cannot understand: if it doesn't "see" the database files, how could it give this error when running makeblastdb:
                  $makeblastdb -in /home/bwawrik/db/refseq_protein.* -dbtype prot -out /home/bwawrik/db/refseq_protein.db
                  >>Error: Too many positional arguments (1), the offending value: /home/bwawrik/db/refseq_protein.01.phr

                  Because from this, it seems that it CAN actually read the file, but is simply not "happy" with it.


                  • #10
                    Try just using

                    -db /home/bwawrik/db/refseq_protein

                    the full path, but only the prefix of the database name


                    • #11
                      Don't use a star (*) in your -db name. It should be:

                      -db /home/bwawrik/db/refseq_protein

                      Otherwise you are telling blastp that there are 81 (or so) files to use as the DB. It wants the overall name, not the overall files. Your initial blastp line did not have the star and thus it seemed correct except for the pathing problem. Your current blastp is obviously incorrect.

                      As for your makeblastdb error ... you are doing it wrong. I was going to mention that but it is not relevant to why blastp is not working. Once again you are telling the program to use 81 files. The program is basically seeing:

                      makeblastdb -in /home/bwawrik/db/refseq_protein.00.phr /home/bwawrik/db/ /home/bwawrik/db/refseq_protein.00.pnd ... etc.

                      Which of course ruins the one (1) parameter that should be after '-in' and brings up the 'too many positional arguments' error.

                      But as I said that is neither here nor there for running blastp. Let's not be concerned with makeblastdb.
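The star expansion described above can be reproduced without BLAST at all. In this sketch, `show_args` is a hypothetical stand-in for any program, and the file names are made-up placeholders in a scratch directory. It prints the arguments exactly as the shell hands them over.

```shell
# show_args is a hypothetical stand-in for any program: it prints the
# arguments exactly as the shell passes them.
show_args() {
    echo "received $# arguments"
    for arg in "$@"; do
        echo "  $arg"
    done
}

# Scratch directory with three fake database files (placeholder names).
demo=$(mktemp -d)
touch "$demo/refseq_protein.00.phr" "$demo/refseq_protein.00.pin" "$demo/refseq_protein.00.psq"

# Unquoted star: the shell expands it before the program starts, so the
# program sees '-in' followed by three separate file names (4 arguments).
show_args -in "$demo"/refseq_protein.*

# No star: the program sees '-in' followed by one prefix (2 arguments).
show_args -in "$demo/refseq_protein"

rm -rf "$demo"
```

With 81 files in the real directory, the first form would hand the program 82 arguments, of which only the first file name is taken as the value of '-in'.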

                      Going on ... are you sure you ran that 'ls' that I gave you? I specified 'refseq_protein*' not the 'refseq_protein.*' (with a dot) that ls complained about.

                      Try the blastp without a star in the -db. And post the results of:

                      ls /home/bwawrik/db/refseq_protein* | head --lines=2


                      • #12
                        Ok, so far:
                        $ ls -l /home/bwawrik/db/refseq_protein* | head --lines=2
                        -rw-rw-r-- 1 bwawrik bwawrik 534122462 Dec 15 18:37 /home/bwawrik/db/refseq_protein.01.phr
                        -rw-rw-r-- 1 bwawrik bwawrik 23105152 Dec 15 18:37 /home/bwawrik/db/

                        and when using blastp without star:
                        BLAST Database error: No alias or index file found for protein database [/home/bwawrik/db/refseq_protein] in search path [/home/bwawrik/software/multi-metagenome/]


                        • #13
                          OK. Now we are getting somewhere -- at least I can be sure that the paths look correct. What I find strange is that your database files begin with *.01.* -- mine begin with *.00.*.

                          More importantly we need to make sure that the overall index file is in place. Mine is at the bottom of the listing so that if I do a 'tail --lines=2' instead of using 'head' I get:

                          Or using 'ls -l'

                          -rw-r--r-- 1 braub diagrid-apps 275 Dec 15 20:12 /group/diagrid/databases/ncbi/week-04-2014/refseq_protein.pal
                          What do you get? I am trying to see if the '*.pal' file is present.


                          • #14
                            I know why there is no .00 file: I accidentally deleted it.
                            I am downloading it now.
                            As for checking for '*.pal', we have a problem:
                            $ ls -l /home/bwawrik/db/refseq_protein* | tail --lines=2
                            -rw-r--r-- 1 bwawrik bwawrik 59 Jan 21 11:39 /home/bwawrik/db/refseq_protein.2.08.tar.gz.md5
                            -rw-r--r-- 1 bwawrik bwawrik 59 Jan 21 11:39 /home/bwawrik/db/refseq_protein.2.09.tar.gz.md5
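A missing volume like the deleted .00 one can be spotted with a short loop. This is a sketch under stated assumptions: `list_missing_volumes` is a hypothetical helper, and it assumes volume files are named `<prefix>.NN.phr` with two-digit numbers starting at 00. It can only find gaps below the highest volume present, not volumes missing from the end.

```shell
# list_missing_volumes is a hypothetical helper. It assumes volumes are
# named <prefix>.NN.phr with two-digit numbers starting at 00, and reports
# every number below the highest one present that has no .phr file.
list_missing_volumes() {
    prefix=$1
    last=-1
    for f in "$prefix".[0-9][0-9].phr; do
        [ -e "$f" ] || continue
        n=${f%.phr}        # drop the extension
        n=${n##*.}         # keep only the volume number, e.g. "08"
        n=${n#0}           # strip one leading zero: "08" -> "8"
        if [ "$n" -gt "$last" ]; then
            last=$n
        fi
    done
    i=0
    while [ "$i" -le "$last" ]; do
        vol=$(printf '%02d' "$i")
        if [ ! -e "$prefix.$vol.phr" ]; then
            echo "missing volume: $prefix.$vol"
        fi
        i=$((i + 1))
    done
}

# Example call: list_missing_volumes /home/bwawrik/db/refseq_protein
```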


                            • #15
                              Looks like your directory has extraneous files in it. They probably do not hurt. How about doing a

                              ls -l /home/bwawrik/db/refseq_protein*pal

                              Let's see if you have the overall index file.

