  • GenoMax
    replied
    Originally posted by vinaydu
    @Genomax

    Thanks for replying. It's a single server and no nodes have been set up (I guess it should have been installed with SGE (Sun Grid Engine)).

    I am using the legacy blastall and running the blastp program.

    Could you please tell me how to make a virtual disk, or point me to a link or a good tutorial?


    Thanks

    If this is a single server you are not going to gain much by using SGE. There are no physical nodes to distribute the jobs across.

    You are using legacy blastall, but is that from a new release of the BLAST package (currently at v.2.2.29), i.e. being run via legacy_blast.pl?

    Exact process for creating a RAM disk using tmpfs will vary with the specific flavor of unix. Here is an example for Fedora 17: http://superuser.com/questions/51600...disk-fedora-17
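
As a rough illustration of the tmpfs approach mentioned above, here is a minimal Python sketch that only assembles the mount command. The mount point `/mnt/ramdisk` and the 20 GB size are assumptions for illustration (the thread puts nr at about 15 GB). Actually running the command needs root, and the details may differ between distributions.

```python
import shlex
from typing import List


def tmpfs_mount_command(mount_point: str, size_gb: int) -> List[str]:
    """Build (but do not run) a tmpfs mount command for a RAM disk."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    # General Linux form: mount -t tmpfs -o size=<N>g tmpfs <directory>
    return ["mount", "-t", "tmpfs", "-o", f"size={size_gb}g", "tmpfs", mount_point]


# Hypothetical values: a 20 GB RAM disk at /mnt/ramdisk.
cmd = tmpfs_mount_command("/mnt/ramdisk", 20)
print(shlex.join(cmd))
```

The mount point directory has to exist before mounting, and anything stored on a tmpfs is lost on unmount or reboot, so the database would need to be copied in again each time.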



  • vinaydu
    replied
    @Genomax

    Thanks for replying. It's a single server and no nodes have been set up (I guess it should have been installed with SGE (Sun Grid Engine)).

    I am using the legacy blastall and running the blastp program.

    Could you please tell me how to make a virtual disk, or point me to a link or a good tutorial?


    Thanks



  • GenoMax
    replied
    Any well written program will use memory as needed. Having extra memory does not help per se (unless you start using a part of the RAM as a virtual drive to hold data, see below).

    You never made it clear if this is a cluster or a single server. If you really have 768 GB of RAM on a single server and you are the only user actively using this system you could think about creating a virtual disk (if you do not have admin access you will need to ask the admins to see if they would be willing) and copy your db index into that space for the fastest possible access.

    If you are using "blastall", that seems to indicate that you are not using a newer version of the BLAST suite. Is that the case? Depending on which BLAST search (n/p) you are running, you should optimize the search parameters according to what you are trying to get from this search. The BLAST command line manual is a must-read reference: http://www.ncbi.nlm.nih.gov/books/NBK1763/
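
The idea of copying the db index into a RAM-backed space could look something like the sketch below. It assumes the database is a set of files sharing one prefix (for a protein database, files such as `nr.00.phr`, `nr.00.pin`, `nr.00.psq`, `nr.pal`); the function name and any paths passed to it are hypothetical.

```python
import shutil
from pathlib import Path


def stage_blast_db(db_prefix: str, ramdisk_dir: str) -> str:
    """Copy every file named <db_prefix>.* into ramdisk_dir and return the new prefix."""
    src = Path(db_prefix)
    dest_dir = Path(ramdisk_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Collect the regular files that share the database prefix.
    files = sorted(p for p in src.parent.glob(src.name + ".*") if p.is_file())
    if not files:
        raise FileNotFoundError(f"no database files found for prefix {db_prefix}")

    for f in files:
        shutil.copy2(f, dest_dir / f.name)

    # This prefix is what would be passed to BLAST as the database name.
    return str(dest_dir / src.name)
```

For example, `stage_blast_db("/data/blast/nr", "/mnt/ramdisk")` would return `/mnt/ramdisk/nr`, which could then be given to BLAST as the database instead of the on-disk copy (both paths are made up here).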



  • vinaydu
    replied
    I am using 48 CPUs via the -a option of blastall. So are you saying that this is fine?
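
For reference, the search described in this thread (blastp against nr, 48 processors via -a, XML output) might be assembled roughly as follows, together with what should be the equivalent BLAST+ call. The file names are hypothetical and the thread does not show the exact command that was run, so treat this as a sketch only.

```python
from typing import List


def blastall_command(query: str, db: str, out: str, threads: int = 48) -> List[str]:
    """Legacy blastall invocation for a blastp search with XML output."""
    return [
        "blastall",
        "-p", "blastp",       # program
        "-d", db,             # database
        "-i", query,          # query FASTA file
        "-o", out,            # output file
        "-a", str(threads),   # number of processors to use
        "-m", "7",            # alignment view 7 = XML
    ]


def blastp_plus_command(query: str, db: str, out: str, threads: int = 48) -> List[str]:
    """The same search expressed with the BLAST+ blastp program."""
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-out", out,
        "-num_threads", str(threads),
        "-outfmt", "5",       # output format 5 = XML
    ]


# Hypothetical file names, printed as shell-style command lines.
print(" ".join(blastall_command("chunk.fasta", "nr", "chunk.xml")))
print(" ".join(blastp_plus_command("chunk.fasta", "nr", "chunk.xml")))
```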



  • sphil
    replied
    Why should it use all the memory? It can only handle as many reads at the same time as you have CPU cores available. Memory use should then roughly correlate with that...



  • vinaydu
    replied
    Thanks sphil for your interest.

    It is true that the output files generated are ~3.5 GB each.

    But I am more concerned about physical memory utilization. BLAST is running fine. The only thing is that the process is not using all the memory.



  • sphil
    replied
    What comes to mind is that each of your 10k sequences might end up aligning to a huge number of sequences in the NR database. That would explain why the output file is also very large (maybe compare...). Imagine that every sequence gets between 50 and 100 hits: you end up with something like 500k to 1000k hits to process, which takes quite a lot of time. Try setting the "report max hits" flags...
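
The arithmetic behind that estimate, plus one way to cap the reported hits, can be sketched as below. With 10k queries and 50 to 100 hits each, the total is 500k to 1,000k. The cap of 20 is an arbitrary value for illustration, and the flags shown are the legacy blastall options for limiting the report (-v for one-line descriptions, -b for alignments), assuming that is what "report max hits" refers to.

```python
from typing import List, Tuple


def estimate_alignments(n_queries: int, min_hits: int, max_hits: int) -> Tuple[int, int]:
    """Lower and upper bound on reported hits across all queries."""
    return n_queries * min_hits, n_queries * max_hits


def hit_limit_flags(max_hits: int) -> List[str]:
    """Legacy blastall flags that cap what is reported per query."""
    # -v: number of one-line descriptions, -b: number of alignments to show
    return ["-v", str(max_hits), "-b", str(max_hits)]


low, high = estimate_alignments(10_000, 50, 100)
print(low, high)            # 500000 1000000
print(hit_limit_flags(20))  # ['-v', '20', '-b', '20']
```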



  • vinaydu
    replied
    Hi Sphil,

    Yes, it takes 2-3 days to BLAST a FASTA file containing 10K sequences and generate XML output. I have been blasting these files for the last 10-12 days, and they are still not complete.

    My other concern is why BLAST is not using the memory when it is available.



  • sphil
    replied
    Can you be more specific on "it is slow"? Does it take days, weeks or just a few hours?



  • vinaydu
    replied
    Hi sphil,

    I have 9706 sequences in a file.

    grep ">" fastafile | wc -l

    Actually the initial FASTA file was very big, so I needed to split it into smaller files.
    Even after splitting, BLAST is still very slow.

    Also, for some of the FASTA files BLAST terminates (with a bash message), which is why I thought I should split them. But even after splitting, it still terminates for some files and BLAST is very slow. I am forcing XML output.

    Any suggestions?
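
The counting and splitting steps described in this post could be done along these lines. This is a minimal sketch, not the script used in the thread: the function names and the chunk file naming scheme are made up for illustration.

```python
from typing import Iterable, Iterator, List


def read_fasta_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one FASTA record (header plus sequence lines) at a time."""
    record: List[str] = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        record.append(line)
    if record:
        yield "".join(record)


def count_fasta_records(path: str) -> int:
    """Count header lines, i.e. lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def split_fasta(path: str, records_per_chunk: int, out_prefix: str) -> List[str]:
    """Write consecutive chunks of at most records_per_chunk records each."""
    if records_per_chunk <= 0:
        raise ValueError("records_per_chunk must be positive")
    chunk_paths: List[str] = []
    out = None
    try:
        with open(path) as handle:
            for i, record in enumerate(read_fasta_records(handle)):
                if i % records_per_chunk == 0:
                    if out is not None:
                        out.close()
                    chunk_path = f"{out_prefix}.{len(chunk_paths):04d}.fasta"
                    out = open(chunk_path, "w")
                    chunk_paths.append(chunk_path)
                out.write(record)
    finally:
        if out is not None:
            out.close()
    return chunk_paths
```

Unlike `grep ">"`, which matches a `>` anywhere on a line, `count_fasta_records` only counts lines that begin with `>`, so the two can differ if a header contains a second `>` character.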



  • sphil
    replied
    How many sequences are in your input set?



  • vinaydu
    replied
    @Genomax,

    I am BLASTing against the nr database, which is presently ~15 GB.

    I am using blastall.

    I am running it on a large server, which has this much (750 GB) physical memory.

    Please advise...



  • GenoMax
    replied
    Originally posted by vinaydu
    Please reply, members. My BLAST job is taking very long. Please help.

    Please provide some additional information about the size of the query and the database you are blasting against. What version of BLAST are you using? Are you running it on a cluster or a really large server? (700+ GB of free memory seems suspect, since single servers with that much memory are not common.)



  • vinaydu
    replied
    Thanks amit. Let me try it. I will post the result...



  • amitbik
    replied
    Try running your BLAST job with mpirun. In some cases it worked out for me.

    mpirun [option] command
