  • blastall output: NCBI vs command line

    Sorry for the basic question. Are there any options in blastall that I can use to get output in the same format as BLAST online on the NCBI website?
    Specifically, I want the "Sequences producing significant alignments:" table with these columns: Accession, Description, Max score, Total score, Query coverage, E value, Max ident.

    But using blastall from the command line, e.g.
    blastall -p blastx -i input.fa -d /blast/db/nr -a 4 -b 5 -v 5 -e 1e-20 -o output.file
    I get only the Accession, Description, Score (bits) and E value columns.

    However, I also want the Query coverage and Max ident columns.
    I didn't find a solution in the blastall manual. Perhaps it depends on the -m parameter, but there are many options...
    Thanks in advance!

    UPD: -m 9 (tabular with comment lines (post-processed, sorted) view) produces almost what I want, but it gives only an accession ID without a description, and something like gi|66734174|gb|AAY53484.1| isn't very helpful.
    Last edited by ElMichael; 03-09-2011, 02:26 PM.

  • #2
    Originally posted by ElMichael
    Sorry for the basic question. Are there any options in blastall that I can use to get output in the same format as BLAST online on the NCBI website?
    The NCBI website now uses BLAST+ rather than 'legacy' BLAST, so one option is to switch from the 'legacy' blastall binary to the BLAST+ blastx binary. Note that with BLAST+ you can request lots of extra columns in the tabular output, which may cover what you want.


    • #3
      I haven't tried BLAST+ yet, but in the past we have used a combination of
      blastx -m 8 (tabular output)
      and blastx without the -m parameter (the default pairwise output, which includes the hit descriptions)
      to get coverage and protein hit names.
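As a rough sketch of the coverage half of that approach: the -m 8 table already carries the query start and end of every HSP, so a per-subject query coverage can be estimated once the query lengths are known. The Python below is an illustration only, not something posted in this thread. The names `fasta_lengths` and `query_coverage` are hypothetical, and it assumes the standard 12-column tabular layout and that "coverage" means the share of query positions covered by at least one HSP for a given subject.

```python
from collections import defaultdict


def fasta_lengths(fasta_lines):
    """Return {sequence_id: length}, using the first word of each header as the ID."""
    lengths = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def query_coverage(tabular_lines, query_lengths):
    """Return {(query_id, subject_id): percent of the query covered by its HSPs}."""
    spans = defaultdict(list)
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the comment lines written by -m 9
        fields = line.rstrip("\n").split("\t")
        query_id, subject_id = fields[0], fields[1]
        q_start, q_end = int(fields[6]), int(fields[7])
        # blastx reports q. start > q. end for minus-strand hits, so order them
        spans[(query_id, subject_id)].append(tuple(sorted((q_start, q_end))))

    coverage = {}
    for (query_id, subject_id), intervals in spans.items():
        covered = 0
        cur_start, cur_end = None, None
        # merge overlapping HSP intervals so shared positions are counted once
        for start, end in sorted(intervals):
            if cur_end is None or start > cur_end:
                if cur_end is not None:
                    covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        covered += cur_end - cur_start + 1
        coverage[(query_id, subject_id)] = 100.0 * covered / query_lengths[query_id]
    return coverage
```

For blastx the query coordinates are in nucleotides, so the lengths should come from the nucleotide input file (for example `fasta_lengths(open("input.fa"))`). The figure may not match the website's "Query coverage" column exactly.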



        • #5
          maubp, colindaven, thanks for your advice!
          I tried BLAST+, but unfortunately the list of supported format specifiers doesn't include the subject description (I wonder why?!) or query coverage (it could be calculated, but again, why?!).
          I think I'll have to use a combination of two blastx runs, as colindaven suggested.
          (Though I still hope there is some magic option unknown to me that produces the required format.)


          • #6
            Originally posted by ElMichael
            maubp, colindaven, thanks for your advice!
            I tried BLAST+, but unfortunately the list of supported format specifiers doesn't include the subject description (I wonder why?!) or query coverage (it could be calculated, but again, why?!).
            I'd like to be able to have query length and subject length as output columns (which would then make either percentage coverage easy to calculate).
            Originally posted by ElMichael
            I think I'll have to use a combination of two blastx runs, as colindaven suggested.
            (Though I still hope there is some magic option unknown to me that produces the required format.)
            You don't have to do that. Run BLAST+ once with ASN.1 output, then use blast_formatter to turn this into any of the output formats (text, HTML, XML, tabular).
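A minimal sketch of that two-step workflow, written as Python that only builds the command lines (nothing runs unless `dry_run=False`). The helper names and the file names `input.fa` and `hits.asn` are hypothetical; the flags shown (`-outfmt 11` for the ASN.1 archive, `blast_formatter -archive`) are the BLAST+ ones as far as I know, but check `blastx -help` for your installed version.

```python
import shlex
import subprocess


def archive_search_cmd(query, db, archive, evalue="1e-20", threads=4):
    """Build a blastx command that writes a BLAST archive (ASN.1, -outfmt 11)."""
    return [
        "blastx", "-query", query, "-db", db,
        "-evalue", str(evalue), "-num_threads", str(threads),
        "-outfmt", "11", "-out", archive,
    ]


def reformat_cmd(archive, outfmt, out_path):
    """Build a blast_formatter command that re-renders an existing archive."""
    return ["blast_formatter", "-archive", archive,
            "-outfmt", str(outfmt), "-out", out_path]


def run(cmd, dry_run=True):
    """Print the command by default; execute it only when dry_run is False."""
    if dry_run:
        print(shlex.join(cmd))
        return None
    return subprocess.run(cmd, check=True)


# One search, then as many renderings as needed from the same archive.
commands = [
    archive_search_cmd("input.fa", "/blast/db/nr", "hits.asn"),
    reformat_cmd("hits.asn", 0, "hits.txt"),      # pairwise text report
    reformat_cmd("hits.asn", 7, "hits.tsv"),      # tabular with comment lines
]
for cmd in commands:
    run(cmd)
```

The point of the design is that the expensive search happens once, and each extra format is only a cheap re-rendering. Newer BLAST+ releases also accept a column list in the tabular format string (for example `"6 qseqid sseqid pident qlen slen"`), which would go in as a single `-outfmt` argument; whether a given specifier is available depends on the version.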


            • #7
              This sounds like a job for Bio::SearchIO. However, you need to be comfortable with BioPerl already.


              • #8
                Originally posted by maubp
                You don't have to do that. Run BLAST+ once with ASN.1 output, then use blast_formatter to turn this into any of the output formats (text, HTML, XML, tabular).
                Thanks for the hint.

                kmcarr, that works terrifically! Exactly what I wanted. Thank you.


                • #9
                  Follow-up on BLAST+

                  Hello,
                  I have run into a similar case with BLAST, though I am not familiar with BLAST+. Anyway, I tried:
                  Code:
                  blastall -p blastx -i all-EST-cleaned.fasta -d my-db -m 9 -B 3 -b 10 -o blast-output.txt
                  and I got this result:
                  Code:
                  # Fields: Query id, Subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q6IBW4|CNDH2_HUMAN	34.69	49	32	0	360	214	258	306	1.0	32.7
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q5T655|CC147_HUMAN	20.73	82	65	0	260	15	39	120	1.3	32.3
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q9BRQ6|CHCH6_HUMAN	29.87	77	43	2	420	223	63	139	2.9	31.2
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q9Y3L3|3BP1_HUMAN	35.48	62	39	2	414	232	7	62	2.9	31.2
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q9LEM8|NAC2_CHLRE	38.46	39	24	0	387	271	1313	1351	3.8	30.8
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|P33424|POLN_HEVPA	39.34	61	37	2	423	241	1034	1090	5.0	30.4
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q81862|POLN_HEVCH	39.34	61	37	2	423	241	1034	1090	5.0	30.4
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q9UKP4|ATS7_HUMAN	39.47	38	23	0	423	310	1022	1059	5.0	30.4
                  1EB_RP_001_2009-03-27_0116=1EB_RP_001_A07_26MAR2009_032.seq	sp|Q2PC93|SSPO_CHICK	39.47	38	22	1	286	176	4073	4110	8.4	29.6
                  Now,
                  1) how can I append the annotation to the end of each subject entry, and
                  2) if not using 1), how can I reformat the subject entries as HTML links to NCBI?
                  Thanks!
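Both questions can be handled by post-processing the tabular file. The sketch below is only one possible approach and makes several assumptions: that the FASTA file used to build my-db is still available and its header IDs match the "Subject id" column, that the accession is the second `|`-separated field (or the fourth for `gi|...|gb|...|` style IDs), and that the `https://www.ncbi.nlm.nih.gov/protein/` URL pattern resolves the accession. All function names are hypothetical.

```python
import html


def load_descriptions(fasta_lines):
    """Map each FASTA header ID (first word) to the rest of its header line."""
    descriptions = {}
    for line in fasta_lines:
        if line.startswith(">"):
            seq_id, _, description = line[1:].strip().partition(" ")
            descriptions[seq_id] = description
    return descriptions


def annotate(tabular_lines, descriptions):
    """Yield each hit line with the subject description appended as a new column."""
    for line in tabular_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            yield line  # pass comment and blank lines through unchanged
            continue
        subject_id = line.split("\t")[1]
        yield line + "\t" + descriptions.get(subject_id, "")


def accession_from_subject_id(subject_id):
    """Extract the accession from IDs like sp|Q6IBW4|CNDH2_HUMAN."""
    parts = [p for p in subject_id.split("|") if p]
    if len(parts) >= 4 and parts[0] == "gi":
        return parts[3]  # gi|number|db|accession|
    if len(parts) >= 2:
        return parts[1]  # db|accession|name
    return subject_id


def subject_link(subject_id, base_url="https://www.ncbi.nlm.nih.gov/protein/"):
    """Render a subject ID as an HTML link (base_url is an assumed URL pattern)."""
    accession = accession_from_subject_id(subject_id)
    return '<a href="{}{}">{}</a>'.format(
        base_url, html.escape(accession, quote=True), html.escape(subject_id)
    )
```

For example, `subject_link("sp|Q6IBW4|CNDH2_HUMAN")` links the text to the `Q6IBW4` entry, and `annotate(open("blast-output.txt"), load_descriptions(open("my-db.fasta")))` would yield the annotated lines, where `my-db.fasta` is a placeholder for whatever file the database was built from.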
