  • Blast2GO (b2g4pipe) output format

    Hello everyone,

    I'm using the command line version of Blast2GO and I was hoping that someone could help me with the output format.

    In the GUI version of Blast2GO you can get your output in a number of different formats (Genespring, etc.), but b2g4pipe seems to output the .annot format exclusively.

    Now I know I can just open my output file in the GUI version and save it in another format, but I have hundreds of small annotations that I need in Genespring format, so that method would be a bit of a pain. So my questions are:

    1) Is there a way to get b2g4pipe to output Genespring format annotations (I'm told that there isn't in the current version but I'd be happy to discover this was wrong)?
    2) Assuming there isn't, does anyone know of a command line script to convert .annot format to Genespring format?

    Thanks so much for your help
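
    For question 2, a minimal Python sketch of a batch converter might look like the following. It assumes the .annot file is tab-separated with the sequence ID in the first column, the GO or EC ID in the second, and an optional description in the third. The output layout (one row per sequence, IDs joined with ";") is only an assumption for illustration, not the confirmed Genespring format; compare it against a file exported from the GUI and adjust `rows_to_table` accordingly. The function names are hypothetical.

```python
import csv
import io
from pathlib import Path


def annot_to_rows(annot_text):
    """Group annotation IDs by sequence ID, keeping first-seen order.

    Assumes tab-separated lines: sequence ID, GO/EC ID, optional description.
    """
    grouped = {}
    for line in annot_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # ignore lines without an annotation ID
        seq_id, term = fields[0].strip(), fields[1].strip()
        terms = grouped.setdefault(seq_id, [])
        if term not in terms:  # drop repeated IDs for the same sequence
            terms.append(term)
    return grouped


def rows_to_table(grouped, sep=";"):
    """Write one tab-separated row per sequence (assumed target layout)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for seq_id, terms in grouped.items():
        writer.writerow([seq_id, sep.join(terms)])
    return out.getvalue()


def convert_file(annot_path, out_path):
    """Convert a single .annot file on disk."""
    text = Path(annot_path).read_text()
    Path(out_path).write_text(rows_to_table(annot_to_rows(text)))
```

    To handle hundreds of files, `convert_file` could be called in a loop over something like `Path("annots").glob("*.annot")`, where the directory name is a placeholder.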

  • #2
    I think b2g4pipe can produce a normal Blast2GO .dat file, but they warn against this as the files are much larger. You could try doing that, then using the Blast2GO GUI to convert this to their "Genespring format".

    • #3
      I often output in *.dat format but, yes, after a certain point (>200,000 sequences?) this bogs down.

      I am starting to turn away from B2Go. The single CPU mode of b2g4pipe is limiting and is often the slow part of the overall pipeline.

      • #4
        Originally posted by westerman:
        I often output in *.dat format but, yes, after a certain point (>200,000 sequences?) this bogs down.

        I am starting to turn away from B2Go. The single CPU mode of b2g4pipe is limiting and is often the slow part of the overall pipeline.
        Interesting. Can you recommend a good alternative?

        • #5
          Originally posted by westerman:
          I am starting to turn away from B2Go. The single CPU mode of b2g4pipe is limiting and is often the slow part of the overall pipeline.
          I don't recall finding b2g4pipe CPU-limited; rather, it was the (local) database queries. Maybe you've got a quicker local Blast2GO database server than us? In any case, for us b2g4pipe was much quicker than running the BLAST searches against the NR database (or InterProScan).

          • #6
            I've been trying to track down the slowdown for a couple of months although not very extensively. Usually I just want the results and generally do not have the time/resources to do much exploration into performance concerns. Note that I am talking about very large projects -- hundreds of thousands of contigs -- and generating a DAT file. Thus it becomes unwieldy to run test scenarios.

            b2g4pipe, unless I am mistaken, is a single-CPU program. Thus it will be processing one contig at a time. It will have to process all contigs before it generates a file. My sysadmin swears that our local database server is not overloaded. And indeed I can run multiple instances of b2g4pipe without any complaints on his end. Thus I suspect b2g4pipe, either in its code or in how it handles network traffic, to be the slow part.

            Since I can do them in parallel, I do the BLAST searches outside of b2g4pipe and then feed the BLAST XML file into b2g4pipe. My BLAST searches take less time than b2g4pipe. I do not run InterProScan on a regular basis. That part seems to be even slower.
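
            One way to set up that kind of parallel run is to split the contig FASTA file into chunks before searching. The sketch below is only an illustration of the idea, not the workflow actually used here: `split_fasta` is a hypothetical helper that distributes records round-robin so the chunks end up roughly equal in size. Each chunk could then be searched separately, and the resulting XML files fed to separate b2g4pipe instances.

```python
def split_fasta(fasta_text, n_chunks):
    """Distribute FASTA records round-robin into n_chunks strings."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    # Collect each record (header line plus its sequence lines) as one string.
    records = []
    current = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if current:
                records.append("\n".join(current) + "\n")
            current = [line]
        elif current and line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")

    # Deal the records out one at a time so chunk sizes stay balanced.
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return ["".join(chunk) for chunk in chunks]
```

            Round-robin assignment keeps the number of contigs per chunk even, although chunks can still differ in total sequence length if contig sizes vary a lot.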
