Our research team is responsible for developing an advanced system software stack. We are conducting research and development for the K computer, its successor the post-K computer, and future systems, taking into account continuity of the user environment and usability. The software stack under development consists of the following:
- OS kernel
We are developing a lightweight kernel, called McKernel, for multi-core parallel computers. Applications that run on Linux run on McKernel without recompilation. McKernel runs on Intel's latest Xeon Phi processor, Knights Landing (KNL).
- MPI communication library
We have implemented MPICH, an implementation of the MPI standard developed mainly by Argonne National Laboratory, on the K and post-K computers. In particular, we are developing a mechanism that makes efficient use of the post-K computer's communication hardware.
- File I/O library
We are developing two file I/O libraries: DTF, which realizes real-time job-to-job file I/O, and FTAR, which parallelizes file I/O in the tar format.
Development of McKernel: a Linux-compatible lightweight kernel
Extreme degrees of parallelism and deep memory hierarchies in high-end computing require a novel runtime environment so that large-scale bulk-synchronous parallel applications run efficiently. Historically, such a runtime environment has been achieved by deploying lightweight kernels, though they provide only a restricted subset of the POSIX API. However, the increasing prevalence of more complex application constructs, such as in-situ analysis and workflow composition, dictates the need for the rich APIs of POSIX and Linux. To satisfy these seemingly contradictory requirements, we are designing and implementing hybrid kernels, in which Linux and a lightweight kernel run side by side on compute nodes. We are developing a lightweight kernel called McKernel; the interface between it and the Linux kernel is called IHK.
McKernel is booted from the Linux kernel without rebooting the hardware. It retains a binary-compatible interface with Linux, i.e., the same Linux binary runs on McKernel. However, it implements only a small set of performance-sensitive system calls, such as those for memory, process, and thread management; the rest are delegated to Linux. One of the significant results, a noiseless environment, is shown in the figure below. The Fixed Work Quanta benchmark, provided by Sandia National Laboratories, reports how much the execution time of a fixed workload deviates from run to run. The result shows that the execution time of the fixed workload is almost constant on McKernel, whereas a large deviation is observed on Linux.
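The idea behind a fixed-work-quanta measurement can be illustrated with a minimal sketch. This is not the actual Fixed Work Quanta benchmark code, and it is not the setup used to produce the result above; it only assumes that the benchmark repeatedly times an identical workload and looks at the spread of the timings. The function names `fixed_work`, `run_fwq`, and `summarize`, and the sample and iteration counts, are hypothetical choices made for illustration.

```python
import statistics
import time


def fixed_work(iterations: int) -> int:
    """Run a fixed amount of integer work; the loop count never changes."""
    acc = 0
    for i in range(iterations):
        acc += i
    return acc


def run_fwq(samples: int, iterations: int) -> list[int]:
    """Time the same work quantum `samples` times; return durations in ns."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        fixed_work(iterations)
        durations.append(time.perf_counter_ns() - start)
    return durations


def summarize(durations: list[int]) -> dict[str, float]:
    """Summarize how far the samples stray from the fastest one."""
    fastest = min(durations)
    slowest = max(durations)
    return {
        "min_ns": float(fastest),
        "max_ns": float(slowest),
        "mean_ns": statistics.fmean(durations),
        "stdev_ns": statistics.pstdev(durations),
        # Guard against a zero reading from a coarse timer.
        "max_slowdown": slowest / fastest if fastest else float("inf"),
    }


if __name__ == "__main__":
    # Hypothetical parameters; a real measurement would pin the process
    # to one core and use far more samples.
    stats = summarize(run_fwq(samples=200, iterations=50_000))
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

Because the work per sample is identical, any sample that takes noticeably longer than the fastest one was presumably delayed by something other than the workload, such as interrupts or other OS activity. A `max_slowdown` close to 1 and a small `stdev_ns` would correspond to the nearly constant execution time described above.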
We plan to make McKernel generally available on Oakforest-PACS, which is operated by the Joint Center for Advanced High Performance Computing, run jointly by the University of Tsukuba and the University of Tokyo.