  • Romualdo
    Junior Member
    • Nov 2014
    • 6

    Blast2GO Beginner's Question

    Hi everybody, I'm new to NGS. I used the STAR/Cufflinks/Cuffcompare workflow and now I need to annotate my transcripts. I decided to use Blast2GO because it seems more intuitive.
    But it is not totally intuitive for a total beginner like me, and I don't know which kind of BLAST I should run. The basic version of the program lets me use these methods:

    QBlast@NCBI. NCBI offers a public service that allows searching molecular sequence
    databases with the BLAST algorithm. The main advantages of making use of this service
    are its versatility and that no database maintenance is required. Therefore by selecting
    this option at Blast2GO no additional installations have to be done.

    Remote BLAST. Blast2GO will download the latest BLAST+ executable from NCBI and
    will use it to query NR or other databases remotely.

    Local BLAST against own database. It is possible to use the BLAST+ executable to query a
    local/own database.

    WWW-BLAST. Alternatively, BLAST can be done locally against a custom database. For
    this, you need to place a copy of your FASTA formatted custom DB plus a WWW-BLAST
    installation on a local BLAST server and indicate their location to Blast2GO.

    My FASTA file has 16,450 sequences, and I want to use the full NCBI nr database. I have an i7-3770 computer with 8 GB RAM.

    So the question is: with these resources, what is the safest and easiest way to BLAST?
  • cement_head
    Senior Member
    • Mar 2012
    • 264

    #2
    Do this locally. Download the nr database and use BLASTX against it.

    BUT, it is MUCH faster if you use mpiBLAST on a cluster.

    The command for mpiBLAST (with the correct flags for B2GO) is something like:

    mpiblast -p blastx -d nr -i input.fa -v 20 -b 20 -I T -e 0.001 -m 7 -o output.xml


    • Romualdo
      Junior Member
      • Nov 2014
      • 6

      #3
      Originally posted by cement_head
      Do this locally. Download the nr database and use BLASTX against it.

      BUT, it is MUCH faster if you use mpiBLAST on a cluster.

      The command for mpiBLAST (with the correct flags for B2GO) is something like:

      mpiblast -p blastx -d nr -i input.fa -v 20 -b 20 -I T -e 0.001 -m 7 -o output.xml
      Is mpiBLAST much faster even on this single PC?


      • sarvidsson
        Senior Member
        • Jan 2015
        • 137

        #4
        No, MPI-BLAST only makes sense on a cluster (if you have access to one). Local multithreading BLAST will be faster than local MPI multiprocessing, and more memory-efficient.

        You might run into trouble with only 8 GB RAM if you want to BLAST the complete nr locally. Give it a try, but you may get out-of-memory problems. The i7-3770 has 4 cores with hyperthreading, so be prepared for several days of BLASTing...
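
For reference, a minimal sketch of what a multithreaded local run could look like with BLAST+ `blastx`, using roughly the same settings as the mpiBLAST command earlier in the thread (20 hits, e-value 0.001, XML output). The function name `build_blastx_command` and the file names are hypothetical; the code only assembles the command line and does not run it. It assumes a formatted `nr` database is reachable by that name.

```python
import os
import shlex


def build_blastx_command(query, db="nr", out="output.xml", threads=None):
    """Return an argument list for a multithreaded BLAST+ blastx run."""
    if threads is None:
        # Default to every logical CPU the machine reports.
        threads = os.cpu_count() or 1
    return [
        "blastx",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "5",             # BLAST XML output
        "-evalue", "0.001",
        "-max_target_seqs", "20",   # keep at most 20 hits per query
        "-num_threads", str(threads),
    ]


if __name__ == "__main__":
    # An i7-3770 exposes 8 logical CPUs (4 cores with hyperthreading).
    print(shlex.join(build_blastx_command("input.fa", threads=8)))
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run`. Whether 8 threads actually help on 4 physical cores is something to measure rather than assume.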


        • cement_head
          Senior Member
          • Mar 2012
          • 264

          #5
          Originally posted by sarvidsson
          No, MPI-BLAST only makes sense on a cluster (if you have access to one). Local multithreading BLAST will be faster than local MPI multiprocessing, and more memory-efficient.

          You might run into trouble with only 8 GB RAM if you want to BLAST the complete nr locally. Give it a try, but you may get out-of-memory problems. The i7-3770 has 4 cores with hyperthreading, so be prepared for several days of BLASTing...
          Yes, correct - only if you have (access to) a cluster.


          • Will Nelson
            Member
            • Nov 2010
            • 16

            #6
            Blasting against nr is not easy. Even with 4 threads, to blast 16,000 sequences will take around 4,000 minutes, or 66 days. Performing GO assignment is also not easy. Importing 16,000 blastx results into the free version of Blast2GO, and then doing GO assignments, will take many days.


            • Will Nelson
              Member
              • Nov 2010
              • 16

              #7
              Sorry, my math was all wrong on my last post. Let me try again.

              In reality, it takes at least 5 minutes for blastx to align one transcript to nr. For 16,000 sequences, with 4 threads, that is (16,000 x 5)/4 = 20,000 minutes, or about 13.9 days. Then if you want to get GOs by importing into the Blast2GO free version, that takes several more days at least.
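
The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the poster's back-of-the-envelope model: the 5 minutes per sequence is the poster's figure, and treating 4 threads as a clean 4x speed-up is an assumption that real BLAST runs will not meet exactly. The function name `blast_runtime_days` is made up for the illustration.

```python
def blast_runtime_days(n_sequences, minutes_per_sequence, parallel_jobs):
    """Estimate wall-clock days, assuming perfect scaling across jobs."""
    total_minutes = n_sequences * minutes_per_sequence / parallel_jobs
    return total_minutes / (60 * 24)  # 1,440 minutes per day


if __name__ == "__main__":
    days = blast_runtime_days(16_000, 5, 4)
    print(f"{days:.1f} days")  # 20,000 minutes -> 13.9 days
```

Changing the per-sequence time is the quickest way to see how sensitive the total is: timing a small batch on the actual machine gives a much better input than a guess.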


              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #8
                @Will Nelson: curious if you have a roughly equivalent spec computer as the OP. Did you actually time a search?


                • Romualdo
                  Junior Member
                  • Nov 2014
                  • 6

                  #9
                  Well, I'm having a hard time with this. Remote BLAST in Blast2GO Basic just takes too long for each sequence, so I got more speed by running BLAST+ blastx locally and importing the output XML into Blast2GO for the subsequent steps. But Will Nelson is right, it is impractical to do this on this computer. Our lab is about to buy a server with 128 GB RAM; until then I want to get more experience with this, so I made a 100-sequence sample.

                  I get this repeatedly when running blastx:

                  CFastaReader: Bad gap size at line ***
                  CFastaReader: Problem parsing gap mods at line ***

                  Here "***" stands for line numbers; those lines are the sequence ID lines, which use this format:

                  >?_GroupUn999_2_939_+

                  What in this format is generating that error?
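
A small test set like the 100-sequence sample mentioned in this post can be cut from a larger FASTA file with a short script. This is a minimal sketch that simply keeps the first `n` records; the function name `take_first_records` is hypothetical, and a random sample may be more representative than the first records in the file.

```python
def take_first_records(lines, n):
    """Yield the lines belonging to the first n FASTA records."""
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n:
                break  # header of record n+1: stop here
        if seen:
            # Lines before the first header are skipped.
            yield line


if __name__ == "__main__":
    demo = [">a\n", "ACGT\n", ">b\n", "GG\n", "CC\n", ">c\n", "TT\n"]
    print("".join(take_first_records(demo, 2)), end="")
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly and the result written out with `writelines`.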


                  • GenoMax
                    Senior Member
                    • Feb 2008
                    • 7142

                    #10
                    What format are your sequences in?

                    That error seems to indicate that there may be a problem with your fasta file. Can you try to replace the "?_" at the beginning of the header? Looks like that may be causing a problem.
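
One way to try the suggestion above is to rewrite the header lines before running blastx. BLAST+'s FASTA reader appears to treat a line beginning with `>?` as a gap specification rather than a sequence header, which would fit the "Bad gap size" messages, but that reading is an assumption here. The sketch below strips any leading non-alphanumeric characters from each ID; the function name `clean_header` is hypothetical.

```python
import re

# Leading run of characters that are neither letters, digits, nor line ends.
_LEADING_JUNK = re.compile(r"^[^A-Za-z0-9\r\n]+")


def clean_header(line):
    """Remove leading symbols (such as '?_') from a FASTA header line."""
    if not line.startswith(">"):
        return line  # sequence lines pass through unchanged
    return ">" + _LEADING_JUNK.sub("", line[1:])


if __name__ == "__main__":
    print(clean_header(">?_GroupUn999_2_939_+"))
```

Applied line by line to the whole file, this leaves the sequences untouched and changes only the IDs, so the original names can still be recovered by adding the prefix back.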


                    • cement_head
                      Senior Member
                      • Mar 2012
                      • 264

                      #11
                      Originally posted by Romualdo
                      Well, I'm having a hard time with this. Remote BLAST in Blast2GO Basic just takes too long for each sequence, so I got more speed by running BLAST+ blastx locally and importing the output XML into Blast2GO for the subsequent steps. But Will Nelson is right, it is impractical to do this on this computer. Our lab is about to buy a server with 128 GB RAM; until then I want to get more experience with this, so I made a 100-sequence sample.

                      I get this repeatedly when running blastx:

                      CFastaReader: Bad gap size at line ***
                      CFastaReader: Problem parsing gap mods at line ***

                      Here "***" stands for line numbers; those lines are the sequence ID lines, which use this format:

                      >?_GroupUn999_2_939_+

                      What in this format is generating that error?
                      Make absolutely sure that you buy ECC RAM. Anything less and you will have major problems.
