First thing I'd do is profile the performance characteristics of the app. Is it read- or write-intensive? Sequential or random access?
A large cache (like on a NetApp box) can make up for a variety of sins if it's read intensive.
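One quick way to get a first read on the read/write mix on a Linux box is to snapshot the per-device counters in /proc/diskstats before and after a representative SAS run. As a minimal sketch (the sample counter values below are made up for illustration), parsing the sectors-read and sectors-written fields looks like this:

```python
# Sketch: estimate the read vs. write mix from /proc/diskstats counters.
# Field layout after major/minor/device-name: reads completed, reads merged,
# sectors read, ms reading, writes completed, writes merged, sectors written,
# ms writing, ... (see the kernel's iostats documentation for the full list).

def rw_mix(diskstats_line):
    """Return (device, sectors_read, sectors_written) for one diskstats line."""
    fields = diskstats_line.split()
    device = fields[2]
    sectors_read = int(fields[5])     # 4th stat field: sectors read
    sectors_written = int(fields[9])  # 8th stat field: sectors written
    return device, sectors_read, sectors_written

# Hypothetical sample line, not real measurements:
line = "8 0 sda 124800 310 4211000 9800 88210 5120 2944000 41000 0 30000 51000"
device, rd, wr = rw_mix(line)
print(f"{device}: {100 * rd / (rd + wr):.0f}% of sectors transferred were reads")
```

In practice you'd diff two snapshots taken around the workload (the counters are cumulative since boot), or just run `iostat -x` from the sysstat package during the job, which reports the same counters as rates.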
-----Original Message-----
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Raj Patel
Sent: Friday, March 11, 2011 1:43 PM
To: toasters@mathworks.com
Subject: SAN for SAS
Hi,
A bit of a generic SAN question (not necessarily NetApp specific).
I've got a team of 40 people who use a statistical analysis package (SAS) to crunch massive time-series data sets.
They claim their biggest gripe is disk contention - not one person hitting the same data, but 40 people doing so at once. So they process these data sets locally on high-spec PCs with several disks (one for the OS, one for scratch, one for reads and one for writes).
I think they'd be much better off using shared storage (i.e. a SAN) in a datacenter: the workloads would at least be spread across multiple spindles, and they'd only need to copy or replicate the data within the datacenter rather than schlep it up and down the WAN, which is what they currently do to get it to their distributed team's PCs.
Are there any useful guides or comparisons on best practice for designing HPC environments on shared infrastructure?
Other than knowing what SAS does, I'm not sure about its HPC capabilities (i.e. distributed computing, clustering, etc.), so I'll need to follow that angle up too.
Thanks in advance, Raj.