  • Importance of number of cores when buying big RAM server

    Hello,

    We are planning to add a 1 TB RAM node to the cluster at my institution (currently 10 nodes with 64-256 GB RAM each). We are going this high because we will be doing de novo assembly of large plant genomes.

    We just got a quote for 1 TB of RAM and 48 cores, which is a bit above our budget... but the company says that we can considerably reduce the cost if we keep the same amount of RAM but with fewer cores.

    My question is: what role does the number of cores play in bioinformatics analyses (SOAPdenovo, Velvet, Trinity, etc.), and how would reducing the number of cores affect them?

    Cheers,

  • #2
    Forgot to say, we could reduce the number of cores to, say, 10, 20 or 30.

    • #3
      Most of the time you will find that processes are I/O bound, i.e. cores sit waiting for data to arrive. There is only so much bandwidth PCI-E/SAS will provide (even if you are using SSDs). If dropping the number of cores saves a significant amount of money, you could put some of that towards more RAM for some of the other nodes, so that you have one or two with 512 GB (something in between 256 GB and 1 TB).
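One rough way to check whether a job is I/O bound on your existing nodes is to compare the CPU time it consumed with the wall-clock time multiplied by the number of threads it was given. The sketch below is in Python; the function names and the 0.5 threshold are illustrative assumptions, not part of any tool mentioned in this thread. The input figures could come from GNU `time -v` or from your scheduler's accounting.

```python
def cpu_efficiency(cpu_seconds, wall_seconds, threads):
    """Fraction of the available core-time that the job actually used.

    cpu_seconds  -- user + system CPU time reported for the job
    wall_seconds -- elapsed (wall-clock) time
    threads      -- number of threads/cores the job was given
    """
    if wall_seconds <= 0 or threads <= 0:
        raise ValueError("wall_seconds and threads must be positive")
    return cpu_seconds / (wall_seconds * threads)


def looks_io_bound(cpu_seconds, wall_seconds, threads, threshold=0.5):
    # The threshold is an arbitrary illustrative cut-off, not a standard value.
    return cpu_efficiency(cpu_seconds, wall_seconds, threads) < threshold


if __name__ == "__main__":
    # Hypothetical numbers: 16 threads, 1 hour wall time, 4 hours of CPU time.
    eff = cpu_efficiency(cpu_seconds=4 * 3600, wall_seconds=3600, threads=16)
    print(f"CPU efficiency: {eff:.2f}")
```

A low value is a hint rather than proof: it can also mean the program is simply not well threaded, so it is worth looking at disk activity during the run as well.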

      • #4
        Agree with above - the only time I find a ton of cores really useful is stuff like permutation tests.

        Related to the OP's question, does anyone have experience with Intel (e.g. Xeon) vs AMD CPUs? I'm curious if anyone in the genomics sphere strongly prefers one over the other, or if it matters much at all...

        • #5
          While Intel lollygagged with their Itanium architecture back in 2003, AMD introduced the x86-64 architecture (a 64-bit version of the x86 instruction set).
          AMD was way ahead of Intel 10 years ago.
          Intel quickly responded and now makes better "brute strength" high-end chips. AMD has resumed its niche as the "value" competitor.

          Benchmarks here:
          PassMark Software - CPU Benchmarks - Over 1 million CPUs and 1,000 models benchmarked and compared in graph form, updated daily!

          On those benchmarks, the best Intel chip scores >2X the best AMD.

          • #6
            I'd say that, unfortunately, Intel is really the only player in bioinformatics right now, for high-memory nodes. When you have a ton of memory, the CPUs become a minority of the cost and you really want high single-threaded performance to use the memory-time as efficiently as possible. For low-memory high-throughput nodes with well-threaded applications, the equation is a bit different, but Intel still gives more performance per watt.

            As for the number of cores - 30 sounds good; you can normally save a lot of money by going from 4 sockets to 2 sockets (if you can get a 2-socket system that supports 1TB) or by dropping the frequency slightly. Some assemblers (like SPAdes) are not threaded very well, but some (like Megahit) are. And a lot of ancillary high-memory programs that you might run - error-correction, for example, or the mapping phase of scaffolding - are very well threaded and scale linearly with number of cores. So, if you invest in 1TB of memory, I would say don't cripple it with only 10 cores, because having 30 cores - if you run well-threaded programs - would effectively be as good as having 3 nodes with 1TB and 10 cores.
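To put rough numbers on the 10 vs 30 core trade-off, Amdahl's law is a common back-of-the-envelope model. Using it here is an assumption of this sketch, not something the assemblers report, and the parallel fractions below are made-up illustrations rather than measurements of SPAdes or Megahit.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over 1 core when `parallel_fraction` of the runtime scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative parallel fractions only; real values must be measured.
    for p in (0.50, 0.90, 0.99, 1.00):
        s10 = amdahl_speedup(p, 10)
        s30 = amdahl_speedup(p, 30)
        print(f"parallel={p:.2f}  10 cores: {s10:5.2f}x  30 cores: {s30:5.2f}x  "
              f"gain from 10->30: {s30 / s10:4.2f}x")
```

Only the perfectly threaded case (parallel fraction 1.0) gives the full 3x gain from tripling the cores; the gain shrinks as the serial share of the run grows, which is why the mix of programs you run matters.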

            • #7
              Thanks everybody for your useful replies; this is the kind of information I wasn't able to find elsewhere.

              Brian, does it sound too bad if we drop it to 20 cores and 768 GB RAM? This would allow us to upgrade one of our other small nodes to 256 GB or hopefully 512 GB with 10 cores. If everything we run were well threaded I wouldn't have this question, but the node would be used by several groups with different needs, so maybe 2 good nodes would work better than a single super node.

              Your advice is very welcome.

              • #8
                768 GB sounds like a strange number, which suggests 3 memory channels (modern Intel CPUs should have 4) or uneven loading. Either might lead to performance issues, so you might want to check and make sure that there are no detriments associated with that configuration.

                I'd rather have 16 cores and 1 TB than 20 cores and 768 GB, if you can do that. Also bear in mind that there are MPI-enabled assemblers that can use memory across nodes. But getting back to your question... what you can ultimately accomplish is often dictated by the memory of your single highest-memory node, but I can't really predict whether 768 GB + 512 GB or 1 TB alone would be better for your workloads.
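Whether 768 GB is an even load depends on the socket, channel and slot counts of the specific board, which the quote should state. As a sketch, the check below tests whether a total capacity can be reached with identical DIMMs and the same number of DIMMs on every channel. The 2-socket, 4-channel, 3-slots-per-channel layout in the example is an assumption for illustration, not the quoted system, and the function name is hypothetical.

```python
def balanced_layouts(total_gb, sockets, channels_per_socket,
                     slots_per_channel, dimm_sizes_gb=(8, 16, 32, 64)):
    """Return (dimm_size_gb, dimms_per_channel) pairs that reach total_gb
    with identical DIMMs and the same DIMM count on every channel."""
    channels = sockets * channels_per_socket
    layouts = []
    for size in dimm_sizes_gb:
        for per_channel in range(1, slots_per_channel + 1):
            if size * per_channel * channels == total_gb:
                layouts.append((size, per_channel))
    return layouts


if __name__ == "__main__":
    # Assumed layout: 2 sockets x 4 channels x 3 slots per channel.
    for total in (512, 768, 1024):
        print(total, "GB ->", balanced_layouts(total, 2, 4, 3))
```

An empty list means no balanced population exists under those assumptions. This only checks symmetry; whether a given population runs at full memory speed is platform-specific and should be confirmed with the vendor.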

                • #9
                  Thanks Brian, performance issues due to this configuration are a good point; we will check that.

                  One last question. One of the differences between the 48-core and 20-core quotes we are getting is the size of the RAM modules. My field is genetics so I don't really know the guts of how this works, so: to reach, say, 512 GB of RAM, does it make a difference to have 16 modules of 32 GB each rather than 32 modules of 16 GB each?

                  Many thanks

                  • #10
                    Originally posted by AMstt
                    Thanks Brian, performance issues due to this configuration are a good point; we will check that.

                    One last question. One of the differences between the 48-core and 20-core quotes we are getting is the size of the RAM modules. My field is genetics so I don't really know the guts of how this works, so: to reach, say, 512 GB of RAM, does it make a difference to have 16 modules of 32 GB each rather than 32 modules of 16 GB each?

                    Many thanks

                    If you are anticipating upgrading the memory in the future, then leave some slots free; otherwise you can fill them up.
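The slot arithmetic behind that advice can be sketched as follows. The 32-slot board is an assumed figure for illustration (check the actual quote), the helper names are hypothetical, and the sketch assumes any later upgrade adds DIMMs of the same size.

```python
def slots_used(total_gb, dimm_gb):
    """Number of identical DIMMs of size dimm_gb needed to reach total_gb."""
    if total_gb % dimm_gb != 0:
        raise ValueError("total_gb is not a multiple of dimm_gb")
    return total_gb // dimm_gb


def upgrade_headroom(total_gb, dimm_gb, total_slots):
    """Return (free_slots, max_gb_without_replacing_dimms)."""
    used = slots_used(total_gb, dimm_gb)
    if used > total_slots:
        raise ValueError("configuration needs more slots than the board has")
    free = total_slots - used
    return free, total_gb + free * dimm_gb


if __name__ == "__main__":
    TOTAL_SLOTS = 32  # assumed board size, not taken from the quote
    for dimm_gb in (32, 16):
        free, max_gb = upgrade_headroom(512, dimm_gb, TOTAL_SLOTS)
        print(f"512 GB as {dimm_gb} GB modules: {free} free slots, "
              f"up to {max_gb} GB without replacing modules")
```

Under these assumptions, 16 modules of 32 GB leave half the slots free, while 32 modules of 16 GB fill the board, so a later upgrade would mean replacing modules rather than adding them.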
