I'm curious how many folks out there have used Amazon's EC2 compute service to run 2nd-gen sequencing analyses "in the cloud", and in particular which compute offerings you chose.
For those who haven't looked at this, Amazon has a variety of instance types, rented by the hour, which differ in the amount of RAM, local storage & compute power. I'm trying to get a handle on how to map these onto real-world compute problems: working with vertebrate genomes using BWA, Bowtie, BFAST, TopHat, dindel, etc.
Below is a quick table I've assembled from Amazon's own specs.
I've left out the "Micro" instance as I'm still trying to figure out what it does, and have also left out the 32-bit options.
Code:
Family           Name                   GB RAM   EC2  GB Disk  Bits  I/O
Standard         Large                   7.500   4.0      850    64  high
Standard         Extra Large            15.000   8.0    1,690    64  high
High-Memory      Extra Large            17.100   6.5      420    64  moderate
High-Memory      Double Extra Large     34.200  13.0      850    64  high
High-Memory      Quadruple Extra Large  68.400  26.0    1,690    64  high
High-CPU         Extra Large             7.000  20.0    1,690    64  high
Cluster Compute  Quadruple Extra Large  23.000  33.5    1,690    64  very high
EC2 here means Amazon's "EC2 Compute Unit", their attempt to estimate performance; apparently 1 unit is equivalent to a 1.7 GHz Xeon processor, with generally 2 cores per processor.
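To compare the classes on something like memory per unit of compute, the table's figures can simply be divided out. A minimal sketch, assuming the rows are typed in by hand from the table above (the names `INSTANCES` and `ram_per_unit` are illustrative only, not anything Amazon provides):

```python
# Rows copied from the table above: (family, name, GB RAM, EC2 units, GB disk).
INSTANCES = [
    ("Standard", "Large", 7.5, 4.0, 850),
    ("Standard", "Extra Large", 15.0, 8.0, 1690),
    ("High-Memory", "Extra Large", 17.1, 6.5, 420),
    ("High-Memory", "Double Extra Large", 34.2, 13.0, 850),
    ("High-Memory", "Quadruple Extra Large", 68.4, 26.0, 1690),
    ("High-CPU", "Extra Large", 7.0, 20.0, 1690),
    ("Cluster Compute", "Quadruple Extra Large", 23.0, 33.5, 1690),
]


def ram_per_unit(instances):
    """Return {(family, name): GB of RAM per EC2 compute unit}, rounded to 2 places."""
    return {
        (fam, name): round(ram / units, 2)
        for fam, name, ram, units, _disk in instances
    }


if __name__ == "__main__":
    # List the instance classes from least to most RAM per compute unit.
    ranked = sorted(ram_per_unit(INSTANCES).items(), key=lambda kv: kv[1])
    for (fam, name), ratio in ranked:
        print(f"{fam:<17}{name:<23}{ratio:>5.2f} GB RAM per EC2 unit")
```

On these figures the three High-Memory sizes all come out near 2.6 GB per unit and High-CPU at 0.35, which may be a more useful first cut for memory-hungry aligners than the raw RAM column alone.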
Pricing is on a different matrix I'm still digesting (there are 3 pricing models for the above classes of machines), but basically within a family each step up in size costs you double: Extra Large is twice Large, and so on up to Quadruple.
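Taking that doubling rule at face value, the relative cost per compute unit within a family can be worked out without knowing any actual prices. A rough sketch for the High-Memory family, assuming a placeholder base price of 1.0 for the smallest size (this is a relative figure, not a real Amazon rate, and `relative_costs` is a made-up helper name):

```python
# High-Memory sizes in increasing order: (name, GB RAM, EC2 units) from the table.
HIGH_MEMORY = [
    ("Extra Large", 17.1, 6.5),
    ("Double Extra Large", 34.2, 13.0),
    ("Quadruple Extra Large", 68.4, 26.0),
]


def relative_costs(sizes, base_price=1.0):
    """Apply the 'each step up doubles the price' rule.

    base_price is a placeholder for the smallest size's hourly rate,
    NOT a real Amazon price. Returns a list of
    (name, price, price per EC2 unit, price per GB RAM).
    """
    rows = []
    for step, (name, ram_gb, units) in enumerate(sizes):
        price = base_price * 2 ** step
        rows.append((name, price, price / units, price / ram_gb))
    return rows


if __name__ == "__main__":
    for name, price, per_unit, per_gb in relative_costs(HIGH_MEMORY):
        print(f"{name:<23}{price:>5.1f}x  {per_unit:.4f}/unit  {per_gb:.4f}/GB")
```

Since RAM and compute units also double at each step in the table, the cost per unit comes out the same for every size in the family under this rule; if that holds, the choice of size is about how much RAM a single job needs rather than about price efficiency.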
So, which is a good choice? And when running multithreaded applications, should I go with 1 thread per core or 1 per EC2 unit?
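For the thread question, one possible starting point (not a settled answer) is to size the thread count to the cores the operating system actually reports on the instance, on the assumption that the EC2 figure is a performance rating rather than something the scheduler can see. A minimal sketch, assuming `bwa aln` with its `-t` thread option; the file names and the helper names `thread_count` and `bwa_aln_command` are hypothetical:

```python
import os


def thread_count(reserve=0):
    """Threads to request: cores visible to the OS minus any held back, at least 1."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, cores - reserve)


def bwa_aln_command(ref_prefix, reads_fastq, threads):
    """Build an argument list for `bwa aln`; -t sets the number of threads."""
    return ["bwa", "aln", "-t", str(threads), ref_prefix, reads_fastq]


if __name__ == "__main__":
    # Hypothetical inputs; print the command rather than running it.
    cmd = bwa_aln_command("genome.fa", "reads.fq", thread_count())
    print(" ".join(cmd))
```

Timing the same job at a few different thread counts on the instance itself would be the real test of whether one thread per core is the right setting.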