  • ssully
    Member
    • Aug 2010
    • 48

    BLAST+ blastdbcmd batch file formatting

    db definition lines look like:

    >DS170424 | organism=Trichomonas_vaginalis_G3 | version=2007-01-11 | length=883
    >DS170425 | organism=Trichomonas_vaginalis_G3 | version=2007-01-11 | length=883
    >DS170426 | organism=Trichomonas_vaginalis_G3 | version=2007-01-11 | length=883

    [db was created from fasta records using makeblastdb (with parse-seqids)]

    Lines of batch input file (test.txt) to pull out subsequences look like:
    DS113177 1-10 plus
    DS113178 1-10 plus
    DS113179 1-10 plus

    [whitespace = tab (have also tried space, commas, and semicolon)]

    command line query:
    blastdbcmd -db TvaginalisGenomic_TrichDB-1.3.fasta -dbtype nucl -entry_batch test.txt

    The result is a series of "OID not found" errors:
    Error: DS113177 1-10 plus : OID not found
    Error: DS113178 1-10 plus : OID not found
    Error: DS113179 1-10 plus : OID not found
    BLAST query/options error: Entry not found in database

    Commandline query works if the batch file contains a list of JUST the sequence IDs (no range or strand info). In this case it returns the entire sequence for that ID. Query also works if I specify one seqID, range, strand e.g.:

    blastdbcmd -db TvaginalisGenomic_TrichDB-1.3.fasta -dbtype nucl -entry DS113177 -range 1-10 -strand plus

    So, what am I doing wrong? It seems to be something about line formatting in the input file. No guidance on this in the NCBI BLAST+ user manual.
  • Torst
    Senior Member
    • Apr 2008
    • 275

    #2
    Originally posted by ssully:
    Commandline query works if the batch file contains a list of JUST the sequence IDs (no range or strand info). In this case it returns the entire sequence for that ID. So, what am I doing wrong? It seems to be something about line formatting in the input file. No guidance on this in the NCBI BLAST+ user manual.
    Maybe I'm missing something, but I think the -entry_batch option is only MEANT to take one ID per line. That does work for me, and for you too.

    What made you think it could handle extra range/strand info? It doesn't say it does in the docs. And how would it know which parameters to apply your extra data to?


    • ssully
      Member
      • Aug 2010
      • 48

      #3
      I would think pulling out subsequences by range and strand would be very common, such that columns two and three in an input file would be specified for range and strand. It didn't even occur to me that they would make the batch function so very limited as to ONLY work for sequence IDs.


      • Torst
        Senior Member
        • Apr 2008
        • 275

        #4
        It's been that way since the batch mode was implemented for the old BLAST suite (via the "fastacmd" command). I can see your point about batch vs cmdline differences in capability.

        It's not that limiting, as you can still do one at a time on the command line. So if you are able to create the 3 column batch file in "A B C" format, then you similarly should be able to create a batch file in "-entry A -range B -strand C" format and use a shell command to apply it:

        % while read -r LINE ; do blastdbcmd -db mydb $LINE ; done < batch.txt > output.fasta

        Problem solved.
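
A minimal Python sketch of the conversion described above, from the three-column "A B C" layout to the "-entry A -range B -strand C" layout. The function names and file names here are hypothetical, and it assumes every non-blank row has exactly three whitespace-separated fields (ID, range, strand):

```python
def to_flag_line(row: str) -> str:
    """Turn 'ID RANGE STRAND' into '-entry ID -range RANGE -strand STRAND'."""
    # split() with no argument handles tabs and spaces alike
    seq_id, seq_range, strand = row.split()
    return f"-entry {seq_id} -range {seq_range} -strand {strand}"


def convert_rows(rows):
    """Convert every non-blank row; blank lines are skipped."""
    return [to_flag_line(row) for row in rows if row.strip()]


def convert_file(src: str, dst: str) -> None:
    """Read a three-column file and write one flag line per row."""
    with open(src) as infile, open(dst, "w") as outfile:
        for flag_line in convert_rows(infile):
            outfile.write(flag_line + "\n")
```

For example, `convert_file("test.txt", "batch.txt")` would produce a batch.txt that the shell loop above can consume. A row with a missing or extra field raises a ValueError rather than being silently passed through.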


        • ssully
          Member
          • Aug 2010
          • 48

          #5
          Running this on a Windows command line, btw, so I wonder if the syntax would be different. I get "LINE was unexpected at the time" when I try to run that command on a file "temp.txt" I created with lines that look like:

          -entry DS113177 -range 1-10 -strand plus
          -entry DS113177 -range 558-1093 -strand plus
          -entry DS113177 -range 1415-3062 -strand plus

          so I replaced tabs with commas and tried this on the command line

          for /F "tokens=*,delims=," %G IN temp.txt DO blastdbcmd -db [mydb] %G %H

          error is now
          "temp.txt was unexpected at the time"
          Last edited by ssully; 08-16-2012, 01:23 PM.


          • Torst
            Senior Member
            • Apr 2008
            • 275

            #6
            Originally posted by ssully:
            Running this on a Windows command line, btw, so I wonder if the syntax would be different.
            I expect the syntax will be different! I am unable to assist with Windows/DOS batch files, sorry.
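
One way to sidestep the shell-syntax differences altogether is a small script that calls blastdbcmd once per line of the flag-format file. This is only a sketch under stated assumptions: Python is installed, blastdbcmd is on the PATH, and each line looks like "-entry DS113177 -range 1-10 -strand plus". The function names, `mydb`, and the file names are placeholders, not part of BLAST+:

```python
import shlex
import subprocess
from typing import List


def build_command(flag_line: str, db: str) -> List[str]:
    """Build the argument list for one blastdbcmd call from one flag line."""
    # shlex.split breaks the line into separate arguments and drops the newline
    return ["blastdbcmd", "-db", db] + shlex.split(flag_line)


def run_batch(batch_path: str, db: str, out_path: str) -> None:
    """Run blastdbcmd once per non-blank line, collecting FASTA in out_path."""
    with open(batch_path) as batch, open(out_path, "w") as out:
        for line in batch:
            if line.strip():
                # check=True stops at the first entry blastdbcmd rejects
                subprocess.run(build_command(line, db), stdout=out, check=True)
```

Calling `run_batch("temp.txt", "mydb", "output.fasta")` would then play the same role as the shell loop in post #4, with one blastdbcmd process per line.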
