New Backup Software Improves Processing, Reliability at Data Management Facility

 
Published: 31 May 2004

Real-time data from all three of the ARM Climate Research Facility sites (North Slope of Alaska, Southern Great Plains, and Tropical Western Pacific) are collected and processed at the ARM Climate Research Facility Data Management Facility (DMF) each day. Processing involves the application of algorithms for performing simple averaging routines, qualitative comparisons, or more complicated experimental calculations. With continual advances in computer technology, keeping up with the volume and pace of incoming data is a daunting challenge. And because the remote sites do not provide backups, reliable backups of these data at the DMF are critical. In addition, many value-added datasets are constantly in development and represent a significant scientific investment. The reliable backup of these sometimes large datasets is as important as that of the original data. In May, substantial progress was made on upgrades to the data backup software in the DMF.

After a review of the requirements and design, NetVault backup software from BakBone™ was selected to manage backups in the DMF, and the SpectraLogic T120 SAIT tape library was selected as the upcoming replacement for the current library. Immediately upon upgrading to the NetVault software using the existing tape library, DMF staff realized a ten-fold increase in performance and can now reliably maintain backups of 1 terabyte of data. When the T120 tape library is deployed, further performance increases are expected, and the library will provide an online capacity of more than 15 terabytes.

In a program like ARM that relies on continuous data streams for long-term comparisons and analyses, the prospect of losing even one day's worth of data is a bleak thought. The sleek new T120 tape library combined with the NetVault software is estimated to provide at least five years of scalability and performance, with a maximum capacity exceeding 60 terabytes of storage. The DMF now has the improved and much-needed capability to provide regular, reliable backups until the data are sent to the ARM Archive for distribution to the science community.