Session 1
1.1 Project Talk: Automatic I/O Scheduling algorithm selection for parallel file systems "Automatic I/O Scheduling algorithm selection for parallel file systems - Integration and first results"
Ramon Nou (BSC) / Francieli Zanon (INRIA)
We introduced the ability to make intra-workload I/O scheduling changes in AGIOS (presented at the 3rd JLESC Workshop), guided by two complementary methods: a Multi-Armed Bandit approach, which uses probability-guided exploration when the workload is unknown, and a Markov-chain approach, which uses pattern matching to predict the best scheduler for the next period. Results with the bandit approach show high performance compared with standard OrangeFS. AGIOS is developed by UFRGS/INRIA,
and BSC applied the bandit and pattern-matching methods to automatic kernel I/O scheduler changes. Next steps include incorporating pattern matching into the system.
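The bandit-guided selection described above can be illustrated with a minimal epsilon-greedy sketch, where each "arm" is a candidate scheduling algorithm and the reward is the throughput observed over the last period. The scheduler names and the epsilon value below are placeholders, not AGIOS's actual configuration.

```python
import random

# Hypothetical scheduler names for illustration; AGIOS's real set differs.
SCHEDULERS = ["timeorder", "aIOLi", "SJF", "noop"]

class EpsilonGreedyBandit:
    """Minimal epsilon-greedy multi-armed bandit: with probability
    epsilon it explores a random scheduler, otherwise it exploits the
    one with the best mean reward (e.g. observed throughput)."""

    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}    # times each arm was tried
        self.values = {a: 0.0 for a in self.arms}  # running mean reward

    def select(self):
        if random.random() < self.epsilon:
            return random.choice(self.arms)        # explore
        return max(self.arms, key=lambda a: self.values[a])  # exploit

    def update(self, arm, reward):
        # Incremental mean update for the arm that was just used.
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n
```

At the end of each scheduling period, the measured throughput is fed back with `update()`, and `select()` chooses the scheduler for the next period.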
1.2 Project Talk: Toward taming large and complex data flows in data-centric supercomputing "Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers"
Francois Tessier (ANL)
Reading and writing data efficiently from storage systems is critical for high-performance data-centric applications. These I/O systems are increasingly characterized by complex topologies and deeper memory hierarchies. Effective parallel I/O solutions are needed to scale applications on current and future supercomputers. Data aggregation is an efficient approach in which selected processes aggregate data from a set of neighbors and write the aggregated data to storage; this optimizes bandwidth use while reducing contention. In this work, we take the network topology into account when mapping aggregators, and we propose an optimized buffering system to reduce the aggregation cost. We validate our approach using micro-benchmarks and the I/O kernel of a large-scale cosmology simulation, and show improvements of up to 15x for I/O operations compared to a standard implementation of MPI I/O.
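The aggregator-placement idea can be sketched with a toy cost model: pick, within each group of processes, the member that minimizes the total hop count to its peers plus the hop count toward the storage-facing node. The 2-D mesh and Manhattan distance below are simplifying assumptions for illustration; the actual work targets real interconnect topologies.

```python
def hop_distance(a, b):
    """Manhattan hop count on an assumed 2-D mesh (illustrative only;
    real machines expose richer topologies, e.g. multi-dimensional tori)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def choose_aggregator(group, io_node):
    """Pick the group member minimizing total hops to its peers plus
    hops to the I/O node -- a toy version of topology-aware mapping."""
    def cost(candidate):
        to_peers = sum(hop_distance(candidate, p) for p in group)
        to_storage = hop_distance(candidate, io_node)
        return to_peers + to_storage
    return min(group, key=cost)
```

A centrally placed process that is also close to the I/O node wins, which is exactly the intuition behind mapping aggregators by topology rather than by rank order.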
1.3 Individual Talk: Mochi: composable lightweight data services for HPC
Philip Carns (ANL)
Parallel file systems have formed the basis for data management in HPC for decades, but specialized data services are increasingly prevalent in areas such as library management, fault tolerance, code coupling, burst buffer management, and in situ analytics. These specialized data services are effective because they can more readily provide semantics and functionality tailored to the task at hand than a one-size-fits-all storage system. Adoption of specialized data services is limited by their flexibility and portability, however. Our goal in Mochi is to develop a reusable collection of HPC micro-services that can be composed and customized to rapidly construct new domain-specific or even application-specific data services. We present updates from services under development that highlight this functionality and explore how to lower the barrier to entry for new HPC data services.
1.4 Individual Talk: NVRAM POSIX-like Filesystem with I/O hints support
Alberto Miranda/Ramon Nou (BSC)
As work in progress within the NEXTGenIO European project, we are developing a POSIX-like filesystem that transparently acts as a collaborative burst buffer across all the NVDIMMs (or other storage technologies) available on the compute nodes. It will support user I/O hints specifying data distribution and data lifecycle. We are looking for collaborations on applications and testing environments.
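To make the hint idea concrete, a per-file hint record might carry a distribution policy and a lifecycle policy. Everything below (names, enum values, defaults) is a hypothetical sketch, not the NEXTGenIO interface.

```python
from dataclasses import dataclass
from enum import Enum

class Distribution(Enum):
    LOCAL = "local"          # keep data on the writing node's NVDIMM
    SCATTER = "scatter"      # stripe data across all nodes' NVDIMMs
    REPLICATE = "replicate"  # mirror data on several nodes

class Lifecycle(Enum):
    SCRATCH = "scratch"      # temporary; may be discarded at job end
    FLUSH = "flush"          # must eventually reach the backend filesystem

@dataclass
class IOHint:
    """Hypothetical per-file hint record; the real interface may differ.
    It only illustrates the kind of metadata such hints would convey."""
    path: str
    distribution: Distribution = Distribution.LOCAL
    lifecycle: Lifecycle = Lifecycle.SCRATCH
```

An application could then tag a checkpoint file as `SCATTER` + `FLUSH` while leaving intermediate files as node-local scratch.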
1.5 Individual Talk: An Argument for On-Demand and HPC Convergence
Kate Keahey (ANL)
There is currently a divergence in the scientific community. On one hand, we have HPC resources, typically managed by batch schedulers, which offer very powerful capabilities but do not satisfy user QoS requirements such as on-demand availability. On the other hand, we have relatively small clusters, typically operated by experimental facilities, that provide controlled availability for users but are under-utilized and do not have the resources to scale. We propose an experiment that combines those resources in an approach based mostly on commodity software technologies and explores the consequences of such a merger. We evaluate our approach experimentally by examining two years' worth of traces from the experimental cluster at the Advanced Photon Source at ANL and batch traces from the Laboratory Computing Resource Center (LCRC) at ANL, and by enacting selected scenarios on hundreds of nodes. Our results demonstrate significant benefits in cost, utilization, and availability.
1.6 Individual Talk: Significantly Improving Lossy Compression for Scientific HPC Data based on Multidimensional Prediction and Error-controlled Quantization
Sheng Di (ANL)
Today's HPC applications are producing extremely large amounts of data, such that data storage and its performance are becoming a serious problem for scientific research. In this work, we design a new error-controlled lossy compression algorithm for large-scale, high-entropy scientific data sets. Our key contribution is significantly improving the prediction hitting rate (or accuracy) for each data point based on its nearby data values along multiple dimensions. On the one hand, we carefully derive a series of multi-layer prediction formulas in the context of data compression. One serious challenge is that, to guarantee the error bounds, the prediction during compression has to be performed on the decompressed values, which may in turn degrade the prediction accuracy. We therefore explore the best layer for the prediction by considering the impact of decompression errors on prediction accuracy. On the other hand, we propose an adaptive error-controlled quantization encoder, which can further improve the prediction hitting rate substantially. The data size can be reduced significantly after variable-length encoding because of the fairly uneven code distribution produced by our quantization encoder. We evaluate the new compressor using production scientific data sets and compare it to many other state-of-the-art compressors, including GZIP, FPZIP, ZFP, SZ, and ISABELA. Experiments show that our compressor is the best in class, especially on compression factors (or bit rates) and compression errors (including RMSE, NRMSE, and PSNR). Our solution is better than the second-best solution, ZFP, with nearly a 2.3x increase in compression factor and a 5.4x reduction in normalized root mean squared error on average, under reasonable error bounds and user-desired bit rates.
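The core prediction-plus-quantization pipeline can be sketched in one dimension with a previous-value predictor. Crucially, the predictor below runs on the *decompressed* values during compression, which is what guarantees the absolute error bound `eb`. The real compressor uses multi-layer multidimensional predictors and variable-length encoding of the codes; this sketch is only a minimal, assumed simplification of that scheme.

```python
def compress(data, eb):
    """1-D sketch of prediction + error-controlled quantization.
    Each point is predicted from its decompressed predecessor, and the
    residual is quantized into an integer code of bin width 2*eb."""
    codes, prev = [], 0.0
    for x in data:
        pred = prev                        # single-layer (previous-value) predictor
        code = round((x - pred) / (2 * eb))
        prev = pred + code * 2 * eb        # decompressed value, used for next prediction
        codes.append(code)
    return codes

def decompress(codes, eb):
    """Invert the quantization by replaying the same predictor."""
    out, prev = [], 0.0
    for code in codes:
        prev = prev + code * 2 * eb
        out.append(prev)
    return out
```

Because compressor and decompressor replay identical predictions, every reconstructed value stays within `eb` of the original; the integer codes cluster around zero on smooth data, which is what makes the subsequent variable-length encoding effective.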