High Availability Storage

During 2009 and 2010, HPC Center staff designed, built, and tested a new infrastructure to provide a highly available storage system that meets the high-performance requirements of the HPC Center cluster and its user applications.

The architecture chosen by the team of Charles Taylor, Craig Prescott, and Jon Akers provides full redundancy to ensure that no component is a single point of failure. The system was completed a little over a year ago and has been in production ever since.

The storage supports a 230 TB high-performance parallel Lustre file system that serves the more than 7,000 compute cores of the HPC Center cluster.

The HPC Center staff presented the architecture of the system at the Lustre User Group conference in Austin, TX, in April 2012 and at SuperComputing 2012 in Salt Lake City, UT, in November 2012. The details of the design are explained in the presentation given at these conferences.