  • We have 8,500 USD. Need a BLAST-capable system

    Hey guys,

    We're a startup bioinformatics lab. We have a "workstation" running 2x Xeon E5-2650 v2 CPUs and 128 GB RAM, so essentially it's a downstream analysis machine. We're working with transcriptomes of roughly 30-70k genes.

    So far we have been doing as much on the cloud as we can. Locally we have only cleaned our raw reads; assembly has been done on the iPlant or Galaxy public servers. We do want to add the capability to assemble locally.

    Right now our limiting step is BLASTX. By splitting the assembled FASTA file into 10 smaller files, we can run all 10 at once and have the complete set of sequences finished in roughly 15 days.
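For readers who want to reproduce the splitting step described above, here is a minimal sketch in plain Python. The function names (`read_fasta_records`, `split_fasta`) and the output file naming are assumptions made for illustration, not the poster's actual script.

```python
from pathlib import Path


def read_fasta_records(path):
    """Yield each FASTA record (header line plus its sequence lines) as one string."""
    record = []
    with open(path) as handle:
        for line in handle:
            # A new header closes the record collected so far.
            if line.startswith(">") and record:
                yield "".join(record)
                record = []
            record.append(line)
    if record:
        yield "".join(record)


def split_fasta(path, n_chunks, out_dir):
    """Distribute records round-robin into n_chunks files and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = [out_dir / f"chunk_{i:02d}.fasta" for i in range(n_chunks)]
    handles = [open(p, "w") for p in chunk_paths]
    try:
        for index, record in enumerate(read_fasta_records(path)):
            handles[index % n_chunks].write(record)
    finally:
        for handle in handles:
            handle.close()
    return chunk_paths
```

Round-robin assignment is used here rather than cutting the file into contiguous blocks, so that long and short transcripts are spread more evenly and the chunks are more likely to finish at similar times.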

    Now:

    We have obtained around 8,500 dollars to increase our current computational power. I'm thinking of buying some off-lease, refurbished blade systems from eBay or another retailer like Xbyte. If we go this route, a quick eBay search brought me to setups similar to these few items:

    In your professional opinion, how much of an upgrade would these systems be if we built a cluster from them? Would we be able to significantly cut down our BLAST time if we dedicate more cores and break up the FASTA file into smaller segments?
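As a rough way to frame the question above, the following sketch estimates the best-case wall time for an embarrassingly parallel job when extra nodes are added. It assumes perfect load balancing and no I/O or network bottleneck, and the relative node speeds in the usage note are made-up values, not benchmarks of any real hardware.

```python
def estimate_days(baseline_days, relative_speeds):
    """Best-case wall time when the work is spread over several nodes.

    baseline_days: time the whole job takes on the current workstation alone.
    relative_speeds: throughput of each node relative to that workstation
                     (1.0 means "as fast as the current machine").
    """
    total_speed = sum(relative_speeds)
    if total_speed <= 0:
        raise ValueError("at least one node must have positive speed")
    return baseline_days / total_speed
```

For example, `estimate_days(15, [1.0, 0.5, 0.5, 0.5, 0.5])` models the current workstation plus four hypothetical blades that are each half as fast, and gives 5.0 days. Real runs will be slower than this bound, so it is only useful for comparing purchase options.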

    Thank you for your feedback!

  • #2
    Bump. Can anyone provide input on this? If anyone uses a cluster for BLAST, could you share its physical specs: number of processors, processor model, RAM, etc.?

    Thank you!


    • #3
      Clusters can quickly demand a significant amount of administrative time compared to a large-memory workstation (whose overhead is also non-zero, but much more manageable).

      Have you considered applying for free cycles on the XSEDE systems? These are NSF-funded resources provided to (non-profit) researchers who need computational cycles but don't have them available at their institution.


      • #4
        One thing to keep in mind is that these blade servers share a NIC across the set of blades in an enclosure. So if you are planning to build a cluster, you need to take network bandwidth into consideration. Unless you have high-performance storage, doing NFS disk mounts across a cluster will lead to degraded performance (you could read the BLAST db into memory if you have enough of it).

        10G Ethernet is not inexpensive, and you would need a 10G-capable switch on the backend to tie the blade enclosures together (unless you are only going to get one of these enclosures). Heat and noise are also considerations if you do not have access to a dedicated server room to host these in. Managing off-warranty hardware is tricky: parts are not cheap and may not be available in some cases. So consider that before jumping in.

        It sounds like you are looking to do brute-force parallelization (i.e. no real mpiBLAST), so that part is straightforward.
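To make the bandwidth concern concrete, here is a back-of-the-envelope sketch of how long it takes to move a database over a link that several blades share. The database size and blade count in the usage note are assumed values for illustration; the thread does not state either, and protocol overhead and disk speed are ignored.

```python
def transfer_seconds(size_gb, link_gbps, blades_sharing=1):
    """Ideal time to move size_gb gigabytes over a link shared by several blades.

    size_gb: data volume in gigabytes (decimal, 1 GB = 8 gigabits).
    link_gbps: raw link speed in gigabits per second.
    blades_sharing: number of blades pulling data through the same NIC at once.
    """
    if size_gb < 0 or link_gbps <= 0 or blades_sharing < 1:
        raise ValueError("size must be >= 0, link speed > 0, blades >= 1")
    per_blade_gbps = link_gbps / blades_sharing
    return size_gb * 8 / per_blade_gbps
```

With an assumed 50 GB database, `transfer_seconds(50, 1)` gives 400.0 seconds for a single machine on 1G Ethernet, while `transfer_seconds(50, 10, 16)` gives 640.0 seconds per blade when 16 blades read through one shared 10G link at the same time.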
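A minimal sketch of what brute-force parallelization can look like: one independent `blastx` process per FASTA chunk, with a cap on how many run at once. The helper names, the output file naming, the e-value and the thread count are assumptions for illustration; the `blastx` options used (`-query`, `-db`, `-out`, `-outfmt`, `-evalue`, `-num_threads`) are standard BLAST+ flags, and `blastx` is assumed to be on the PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_blastx_command(chunk, db, threads=2, evalue="1e-5"):
    """Return the blastx argument list for one FASTA chunk."""
    chunk = Path(chunk)
    # Write tabular output next to the chunk, e.g. chunk_00.blastx.tsv
    out = chunk.parent / (chunk.stem + ".blastx.tsv")
    return [
        "blastx",
        "-query", str(chunk),
        "-db", str(db),
        "-out", str(out),
        "-outfmt", "6",
        "-evalue", str(evalue),
        "-num_threads", str(threads),
    ]


def run_all(chunks, db, max_jobs=10, threads=2):
    """Run one blastx process per chunk, at most max_jobs at a time.

    Returns the exit code of each process in chunk order.
    """
    commands = [build_blastx_command(c, db, threads) for c in chunks]
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Threads are sufficient here because each worker only waits on an external process. On a cluster, the same command list could instead be handed to a job scheduler, one command per job.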


        • #5
          In reply to GenoMax (#4):
          Yeah, we are looking into purchasing 10G cabling and a switch. We have a lab room that houses an MRI and confocal equipment. The room is cooled, with dedicated 220 V power plus backup generators. A few computers/laptops already run in that room, so the MRI shouldn't interfere with the equipment.

          For the BLAST DB setup, we are thinking of creating a copy of the DB on each blade's HDD, so we don't run into any issues with NFS mounts. And yes, we're trying to accomplish brute-force parallelization.

          jpummil, Thanks for that. I'm applying right now for some computational time.
          Last edited by Blaze9; 09-08-2014, 10:14 AM.


          • #6
            You would want to check if the 10G is fiber or copper on these enclosures.


            • #7
              Originally posted by Blaze9:

              jpummil, Thanks for that. I'm applying right now for some computational time.
              Just request a start-up allocation on the initial account. The request period for a substantially larger allocation is coming up, Sept 15 to Oct 15, for a block of time that would begin on Jan 1, 2015.

              If you need assistance, you might well have an XSEDE Campus Champion on your campus whose task is to assist new XSEDE users through the process of selecting the right machine(s), writing a proposal for a larger block of time, etc.

              List of CC's here:
