Hi everyone!
I'm having trouble running BLAST locally against a big database, and I hope someone can help me with this.
I'm running blastp through Biopython's NcbiblastpCommandline function, and I get a Bio.Application.ApplicationError: "Command 'blastp -out ../Results/blastDBs/all_genomes_prot_db_blast_results.xml -outfmt 5 -query ../faa_files/all_genomes_prot.fasta -db ../Results/blastDBs/all_genomes_prot_db -evalue 0.001' returned non-zero exit status 137, 'Killed'".
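For reference, the Biopython call looks roughly like this (a minimal sketch reconstructed from the command string in the error; the variable names are mine):

    from Bio.Blast.Applications import NcbiblastpCommandline

    # Same arguments as the failing command above
    blastp_cline = NcbiblastpCommandline(
        query="../faa_files/all_genomes_prot.fasta",
        db="../Results/blastDBs/all_genomes_prot_db",
        evalue=0.001,
        outfmt=5,  # XML output
        out="../Results/blastDBs/all_genomes_prot_db_blast_results.xml",
    )
    stdout, stderr = blastp_cline()  # raises ApplicationError on a non-zero exit status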
The same code worked with a smaller database and query (25 genomes against the same 25), but now I want to run it on 74 genomes, which takes a long time (about 40 h before the error). After searching for the error, it seems exit status 137 means the process was killed, typically because the system ran out of memory.
I thought it could be because blastp is not writing to the XML file while the search runs: it only writes the results at the end, so all of them stay allocated in memory until then. I'm running it on a server over a VPN connection, and this only happens there. When I run the same code on my PC, I can see the size of the XML file increasing while BLAST is running. The BLAST versions were different (2.2.26+ on the server and 2.2.28+ on my PC), so I updated the server to 2.2.30+, but it still behaves the same way. It's not Python or Biopython either: if I run blastp directly from the command line, the same thing happens (the XML file stays at 0 bytes while BLAST is running).
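This is roughly how I checked the file size during a run (a sketch; I actually just watched the file from another terminal, but this reproduces the observation):

    import os
    import subprocess
    import time

    out_path = "../Results/blastDBs/all_genomes_prot_db_blast_results.xml"
    cmd = [
        "blastp",
        "-query", "../faa_files/all_genomes_prot.fasta",
        "-db", "../Results/blastDBs/all_genomes_prot_db",
        "-evalue", "0.001",
        "-outfmt", "5",
        "-out", out_path,
    ]

    proc = subprocess.Popen(cmd)
    while proc.poll() is None:  # print the output size while blastp runs
        size = os.path.getsize(out_path) if os.path.exists(out_path) else 0
        print(f"output size: {size} bytes")
        time.sleep(60)
    # On the server this prints 0 bytes for the whole run; on my PC it grows.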
Does anyone have a clue why this happens? Why does the size of the XML file not increase on the server when it does on my PC? Could that be what makes the system run out of memory and kill the job? And does anyone know of a BLAST option that makes it write to the output file during execution instead of keeping all the data in memory until the end?
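If there's no such option, the workaround I'm considering is splitting the query FASTA into chunks and running blastp once per chunk, so each run only has to hold a fraction of the results in memory (a rough sketch using Biopython; the chunk size and file names are arbitrary):

    from Bio import SeqIO
    from Bio.Blast.Applications import NcbiblastpCommandline

    records = list(SeqIO.parse("../faa_files/all_genomes_prot.fasta", "fasta"))
    chunk_size = 5000  # arbitrary; tune so each run fits in memory

    for n, start in enumerate(range(0, len(records), chunk_size)):
        chunk_file = f"query_chunk_{n}.fasta"
        SeqIO.write(records[start:start + chunk_size], chunk_file, "fasta")
        cline = NcbiblastpCommandline(
            query=chunk_file,
            db="../Results/blastDBs/all_genomes_prot_db",
            evalue=0.001,
            outfmt=5,
            out=f"blast_results_chunk_{n}.xml",
        )
        cline()  # each chunk's results go to their own XML file

Would something like that be a reasonable way around the memory problem, or is there a cleaner option?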
Please help me!