Keeping the Data Flowing

 
Published: 9 February 2015

The ARM data system, residing at Oak Ridge National Laboratory, installed new hardware in FY2014.
Experiences with merging of Archive and DMF systems and data flow

In late 2013, the ARM Data Center at Oak Ridge National Laboratory (ORNL) procured “next generation” computer hardware to support the ARM Facility. This consolidation of hardware enables interchangeability of devices and capacity between the processes of the Data Management Facility (DMF) and the ARM Data Archive (also referred to as the ARM Data Center) as the ARM data rates continue to grow over the next several years. New hardware was purchased and installed during late FY2013 and early FY2014, with several processes migrating to the new hardware during FY2014.

Expanding Horizons and Network Structures

With the goal of sharing and consolidating hardware resources between the DMF and ARM Data Archive, hardware for the DMF at Pacific Northwest National Laboratory (PNNL) in Washington state moved to ORNL in Tennessee in early 2013 to improve data flow between these segments of the ARM data system. Merging the Archive and DMF required major changes in the network connectivity between these segments. To limit the interruption of ARM data flow, the DMF was moved from PNNL to ORNL with a minimum of system configuration changes. The team developed a plan that radically consolidated and relocated services to simplify what was being moved. The team also deployed a system to ORNL ahead of the move to support critical services whose interruption would otherwise have had a much larger impact on ARM productivity. As a result, the DMF began operating at ORNL on an isolated segment of the network outside of the ORNL network domain in early 2014. It used its own network security hardware and software, which prevented the DMF and ARM Data Archive from defining their own set of network rules and efficiently sharing network storage devices.

Later in the year, the ARM data network became the first research enclave, or private network, at ORNL, expanding to include a network structure that could also be used by the DMF. During this expansion, security policies were consolidated and all network hardware was upgraded from 1 gigabit per second to 10 gigabits per second. Upgrading the network speed and removing the external network devices that previously sat between the Archive and DMF changed data transfer rates between these systems from several hours per terabyte to several terabytes per hour.
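As a rough check on those figures, the short Python sketch below computes the ideal time to move one terabyte over a link. It assumes the upgrade was from 1 to 10 gigabits per second (standard Ethernet rates), decimal units (1 TB = 10^12 bytes), and no protocol or storage overhead. The function name transfer_hours and the efficiency parameter are illustrative only and do not come from the ARM data system.

```python
def transfer_hours(size_terabytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Ideal hours needed to move size_terabytes over a link of link_gbps.

    efficiency is a hypothetical fraction (0 < efficiency <= 1) of the nominal
    link rate that is actually achieved end to end.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = size_terabytes * 1e12 * 8              # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for gbps in (1, 10):
        hours = transfer_hours(1.0, gbps)
        print(f"{gbps:>2} Gb/s: {hours:.2f} hours per TB, {1 / hours:.2f} TB per hour")
```

Under these idealized assumptions, one terabyte takes about 2.2 hours at 1 gigabit per second, while a 10 gigabit link moves about 4.5 terabytes per hour. Real transfers are slower than the nominal rate, and the removal of intermediate network devices also contributed to the improvement described above.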

Privacy has its Perks with Future Benefits

Formation of a “private ARM network” on the ORNL computer network allows the ARM Data Archive and DMF systems to continuously share storage devices and system administration processes, and to define ARM-specific rules for persistent network connections. With this structure, the DMF and ARM Data Archive have access to a shared online copy of some of the archived data. The scope and specific contents of the online repository are determined jointly by the DMF and ARM Data Archive processes for the control, maintenance, and review of the data. The DMF uses this data copy for software development and value-added processing, while the Data Quality Office uses it to evaluate historical trends. The ARM Data Archive uses this same copy to distribute data to users.

During FY2015, a new shared file server will be implemented as “temporary work space” to accommodate inherent variations in data flow inside both DMF and ARM Data Archive processes. Further integration of these connections will include flow control that is based on the readiness of downstream data systems rather than driven by the output of upstream systems. This type of flow control will improve the stability of the data flow and will allow each process to scale to a varying number of parallel systems. The more stable flow and flexible allocation of capacity will provide the data system reliability needed to meet the large growth in data rates anticipated over the next few years.
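Flow control driven by downstream readiness is often called backpressure. The Python sketch below illustrates the general idea only and is not the ARM implementation: a bounded queue stands in for the temporary work space, the upstream loop blocks whenever that buffer is full, and a configurable number of parallel downstream workers drain it. The names run_pipeline, buffer_size, and workers are hypothetical.

```python
import queue
import threading


def run_pipeline(items, buffer_size=2, workers=2):
    """Push items through a bounded buffer to parallel downstream workers.

    Items must not be None, because None is used as the shutdown signal.
    """
    buffer = queue.Queue(maxsize=buffer_size)   # bounded "temporary work space"
    processed = []
    lock = threading.Lock()

    def downstream():
        while True:
            item = buffer.get()                 # worker signals readiness by taking an item
            if item is None:                    # sentinel: no more data for this worker
                break
            with lock:
                processed.append(item)          # stand-in for real processing

    threads = [threading.Thread(target=downstream) for _ in range(workers)]
    for t in threads:
        t.start()

    for item in items:
        buffer.put(item)                        # blocks while the buffer is full

    for _ in threads:
        buffer.put(None)                        # one sentinel per worker
    for t in threads:
        t.join()
    return processed


if __name__ == "__main__":
    done = run_pipeline(range(10), buffer_size=2, workers=3)
    print(f"processed {len(done)} items")
```

Because the upstream loop can never get more than buffer_size items ahead of the workers, a slow downstream stage slows the producer instead of letting data pile up. Changing workers adjusts the number of parallel systems without touching the upstream code.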

# # #

The ARM Climate Research Facility is a national scientific user facility funded through the U.S. Department of Energy’s Office of Science. The ARM Facility is operated by nine Department of Energy national laboratories.