Originally posted by mebes
View Post
We will use external HDDs for now because we have only one machine (Illumina HiSeq 1500). We are building our bioinformatics department now, so any advice from more experienced scientists in the field is always welcome.
How did you go about purchasing your workstations?
Which companies should we contact?
Are you happy with yours?
Thanks in advance!
-
Originally posted by gringer View Post
As the blog post said the title was a "tongue in cheek attempt" to get one's attention.
I would not recommend building something like this unless you have access to a proper server room infrastructure.
Originally posted by gringer View Post
It's a good idea to consider whether the ongoing cost of high-performance storage is worth it. I prefer the idea of bioinformatics computers as another piece of laboratory equipment. You shouldn't expect them to be working every second of the day, and you should be prepared for failure (e.g. repeating experiments if there's a failure before you can submit your read data to SRA).
If you only have a couple of machines (and don't run them round the clock) then you are absolutely right.
-
Originally posted by GenoMax View Post
This is cheap storage, not fast storage and certainly not highly-available storage. It carries a far higher operational and administrative burden than storage arrays traditionally sold into the enterprise.
After the last blog post explaining all of the sensible reasons for why you should never build a backblaze pod it's time now to talk about why we did decide to build one.
We are using the backblaze pod plus NAS appliance software from www.openfiler.com to build a “last resort” storage pool for scientific data that is not valuable enough to spend lots of money on a more traditional storage solution yet large enough in terabyte terms to represent a significant time-risk should an event occur that would require all this data be re-downloaded again via the internet.
We see this $12,000 appliance as a simple hedge against interrupting ongoing research activities. Totally worth it.
Last edited by gringer; 11-28-2013, 12:31 PM.
-
Unless one is working with irreplaceable samples it probably does not make sense to store data long term for individual labs (for core facilities it is a business decision based on the SLA). You can submit a copy to SRA/EBI and have them store it long term.
Mebes: I have assumed so far that you are an individual lab looking to purchase this hardware. If you are a core then you should never put all your eggs in one basket. You would want to have identical systems as backup if you expect to process tens of flowcells a month.
Last edited by GenoMax; 11-28-2013, 10:12 AM.
-
If you want a personal option for storage, you could have a go at the backblaze 3.0 pod:
You can order the 4U pod from 45Drives, to which you then add your own SATA hard drives:
If you want the most reliable system, the hot-swappable drives can be set up as 3 banks of RAID6 (dual-parity) combined into either a single logical volume, or three separate volumes. With 45 4TB drives, that will give you 144TB of storage space:
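As a rough check on that capacity figure, here is a minimal sketch of the arithmetic. The function name and the `spares_per_bank` parameter are illustrative assumptions, not something from the post, and the calculation ignores filesystem overhead and the TB/TiB difference. Plain RAID6 over three 15-drive banks works out to 156 TB before overhead; a figure of 144 TB would correspond to one further drive per bank being kept out of the data pool (for example as a hot spare), which is only a guess about the intended layout.

```python
def usable_tb(total_drives, drive_tb, banks, parity_per_bank=2, spares_per_bank=0):
    """Usable capacity in TB for equal-sized banks of dual-parity drives.

    parity_per_bank=2 models RAID6; spares_per_bank is a hypothetical
    extra reservation (e.g. hot spares) and is an assumption of this sketch.
    """
    if total_drives % banks != 0:
        raise ValueError("drives must divide evenly across banks")
    drives_per_bank = total_drives // banks
    data_drives = drives_per_bank - parity_per_bank - spares_per_bank
    if data_drives <= 0:
        raise ValueError("no drives left for data")
    return banks * data_drives * drive_tb


raw_tb = 45 * 4                                      # 180 TB of raw disk
raid6_tb = usable_tb(45, 4, banks=3)                 # 3 banks x 13 data drives x 4 TB = 156
with_spare_tb = usable_tb(45, 4, banks=3, spares_per_bank=1)  # 3 x 12 x 4 = 144

print(raw_tb, raid6_tb, with_spare_tb)
```

Either way, roughly a fifth of the raw 180 TB goes to redundancy, which is worth budgeting for when comparing against the per-run sizes discussed in this thread.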
Disclaimer: I have not yet convinced any of my clients to install one of these systems at their workplace; I just really like the look of the system.
Last edited by gringer; 11-27-2013, 10:44 AM.
-
The 6TB of disk space will do for two or three full HiSeq runs; after that you will be out of space. Given the price of 512 GB of memory, your system is already expensive, so I would add more disk space: at least 2 TB per run. Also, where and how are you planning to back up your data? A good backup can mitigate HDD concerns.
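To make the sizing concrete, here is a small sketch of the same back-of-the-envelope calculation. The function names and the 20% headroom default are assumptions made for illustration; the 6 TB and 2 TB-per-run figures are the ones quoted in this thread. Sizes are kept as whole gigabytes to avoid floating-point rounding.

```python
def runs_that_fit(capacity_gb, per_run_gb):
    """Number of complete runs that fit on the given disk capacity."""
    return capacity_gb // per_run_gb


def capacity_needed_gb(runs, per_run_gb, headroom_pct=20):
    """Disk space to hold `runs` runs plus some spare room.

    headroom_pct is a hypothetical safety margin, not a figure from the thread.
    """
    return runs * per_run_gb * (100 + headroom_pct) // 100


print(runs_that_fit(6000, 2000))       # 6 TB at 2 TB per run -> 3
print(capacity_needed_gb(10, 2000))    # ten runs with 20% headroom -> 24000 GB
```

The point of the second function is that the working disk should be sized from the number of runs you want on hand at once, not from the size of a single run.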
-
Thank you for answering, and yes, I was referring to raw data folder sizes.
Any other suggestions or warnings from anyone?
Thanks in advance!
-
Since mebes referred to converting bcl files to fastq it is perhaps safe to infer that mebes is referring to raw data folder sizes.
We mostly do rapid runs on multiple HiSeq 2500s, but on a HiSeq 2000, 2 x 100 bp runs generate no more than 600 GB of data (no images, before conversion to fastq) per flowcell.
Since the 1500 runs a single flowcell, 800 GB may be possible for a 2 x 150 bp run. Once mebes responds we will know.
Last edited by GenoMax; 11-26-2013, 09:45 AM.
-
HiSeq will produce raw FASTQ files in the range of 100-800GB (total for one lane / run). The image files are in the order of 1-5TB.
On the computer selection aspect, you'll get a better price/performance ratio by choosing a computer with lots of processing cores rather than chasing raw clock speed. Two 4-core processors will be cheaper than one 8-core processor, and a 3.2GHz processor will be cheaper than a 4.7GHz processor.
Last edited by gringer; 11-26-2013, 09:19 AM.
-
Go for it. For the listed applications this should be fully adequate.
The size of the files seems a little odd. Are you saving images from your runs (that is the likely explanation)? There is really no need to do that any longer. You will save a ton of space by not doing that (unless a HiSeq 1500 can't do real-time image analysis; I am not familiar with the 1500).
-
Buying multicore pc for RNASeq de novo assembly
Hello everyone,
I would like your opinion on the hardware specs of a multicore PC that my lab wants to buy for analysis of RNASeq data.
The analysis will include conversion from .bcl to FASTQ and then to .bed files, for human samples. Our data was produced on an Illumina HiSeq 1500 and the size of the files for this run ranges from 100GB to 800GB.
I have searched other topics and articles, and my conclusion is to go for:
16 cores with a high clock speed (MHz),
512GB of RAM,
6TB of HDD.
Linux as the operating system, specifically CentOS.
Thanks!