High-Performance Heterogeneity/ Energy-Aware Communication for Multi-Petaflop HPC Systems

Akshay Venkatesh

2017, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Increasing machine throughput by increasing frequency alone is limited by power constraints. This has forced high-performance computing (HPC) systems to evolve into complex machines with very high core counts and node counts, often augmented with accelerators and co-processors that have their own memory and network subsystems. Two architectural traits distinguish today’s multi-petaflop HPC clusters from previous generations and mark a paradigm shift: the presence of many-core processors, and high-speed network devices with specialized hardware features. A large proportion of machines on the current TOP500 list of supercomputers exhibit these characteristics. These changes have come about with the two-fold goal of minimizing both the execution time and the energy expenditure of scientific applications that use these systems. To reduce the time of compute phases, the high degrees of parallelism available on many-core processors are exploited. Of these processors, NVIDIA general-purpose graphics processing units (GPGPUs) and Intel Many Integrated Core (MIC) processors are popular owing to their low power footprint and their potential to deliver over a teraflop/second of throughput. The complexity of these processors lies in the fact that they are often (if not always) available as PCIe devices with their own memory and network subsystems. This renders the nodes of the system heterogeneous from processor, memory, and network perspectives. To accommodate this heterogeneity and to improve processor utilization in general, Network Interface Controllers (NICs) are being designed with a range of novel capabilities that reduce communication time and increase overlap possibilities. These include the ability to directly access the memory of these PCIe devices through Remote Direct Memory Access (RDMA), even when they are located on remote nodes, as well as multicast features.
In addition, NICs are being designed with the capability of accepting a list of basic data movement and dependency satisfaction primitives that can be used to design communication routines that do not require CPU intervention for progression. Lastly, NICs and accelerators such as NVIDIA GPUs are being co-designed to allow the GPU to operate autonomously and issue network operations nearly independently of the CPU. This bears the potential to realize efficient control plane decoupling and achieve an overall increase in compute-resource utilization. While reducing execution time is important, saving system energy is equally vital. One of the main contributors to the system’s energy consumption is the CPU, because polling schemes are commonly employed to optimize communication latency. To reduce the energy footprint of processors, power knobs such as Dynamic Voltage and Frequency Scaling (DVFS) and interrupt-driven execution modes are often used. These are especially important during communication phases, which, as recent studies have indicated, often result in energy expenditure far larger than that of compute phases. In view of these developments, communication runtimes must take into consideration the heterogeneity that abounds in modern HPC systems and leverage advancements in network design to achieve low latency and high throughput and to increase computation/communication overlap possibilities. At the same time, they must also aim to reduce the energy expended by CPUs during communication phases with minimal or no impact on overall performance. As the Message Passing Interface (MPI) is the de facto standard for communication calls in scientific applications, this dissertation proposes high-performance designs that address heterogeneity and energy challenges for MPI communication routines on modern HPC systems.
In particular, this dissertation attempts to address heterogeneity challenges in communication algorithms for dense collective operations through novel heterogeneity-aware collective designs. The proposed designs account for the cost differences in communication paths, leverage delegation mechanisms, adapt classic algorithms, and use novel heuristics to achieve large benefits in collectives (such as MPI_Alltoall and MPI_Allgather) at scale. The dissertation also takes advantage of special-capability network subsystems that offer peer-to-peer access to other PCIe devices, RDMA and multicast capabilities, offload mechanisms, and the ability to service network requests without explicit CPU intervention. These are used to design point-to-point and collective operations that overcome the challenges posed by heterogeneous memory subsystems and yield high throughput, high overlap, and reduced coupling-induced synchronization overheads. Finally, the dissertation proposes rules for application-oblivious energy savings during 2-sided and 1-sided MPI routines, derived from intimate knowledge of the underlying protocols used to realize communication routines. As a result, applications automatically save CPU and memory energy during communication phases with minimal performance degradation and without needing application code changes. The proposed designs for MPI have been integrated into the MVAPICH2 MPI library and are already widely deployed on HPC clusters.
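The abstract's contrast between polling and interrupt-driven waiting can be illustrated with a small, self-contained sketch. It is not the dissertation's design and uses no MPI or MVAPICH2 code: the function `wait_for_message`, the timer thread standing in for the network, and the 0.2-second arrival delay are all hypothetical choices made for illustration. The sketch only shows why a polling receiver keeps the CPU busy for the whole wait while a blocking receiver does not.

```python
import threading
import time


def wait_for_message(mode, delay_s=0.2):
    """Wait for a simulated message and return the CPU seconds used while waiting.

    mode    -- "poll" spins on a flag; "block" sleeps until signalled.
    delay_s -- hypothetical arrival delay (an assumption, not a measured value).
    """
    if mode not in ("poll", "block"):
        raise ValueError("mode must be 'poll' or 'block'")

    arrived = threading.Event()
    # Stand-in for the network: a timer thread marks the message as arrived.
    timer = threading.Timer(delay_s, arrived.set)

    cpu_start = time.process_time()  # CPU time of this process, not wall time
    timer.start()
    if mode == "poll":
        # Polling: lowest reaction latency, but the core stays busy throughout.
        while not arrived.is_set():
            pass
    else:
        # Blocking: the thread is descheduled until the event is set,
        # loosely analogous to an interrupt-driven wait.
        arrived.wait()
    cpu_used = time.process_time() - cpu_start

    timer.join()
    return cpu_used


if __name__ == "__main__":
    print("poll  CPU seconds:", wait_for_message("poll"))
    print("block CPU seconds:", wait_for_message("block"))
```

Running the script prints the CPU time consumed in each mode; the polling wait should consume roughly the full arrival delay in CPU time, while the blocking wait should consume close to none. CPU time is used here only as a rough proxy for energy, which is itself an assumption of the sketch.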
Dhabaleswar Panda (Advisor)
221 p.

Recommended Citations

  • Akshay Venkatesh. (2017). High-Performance Heterogeneity/ Energy-Aware Communication for Multi-Petaflop HPC Systems [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1483522226773739

    APA Style (7th edition)

  • Akshay Venkatesh. High-Performance Heterogeneity/ Energy-Aware Communication for Multi-Petaflop HPC Systems. 2017. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1483522226773739.

    MLA Style (8th edition)

  • Akshay Venkatesh. "High-Performance Heterogeneity/ Energy-Aware Communication for Multi-Petaflop HPC Systems." Doctoral dissertation, Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1483522226773739

    Chicago Manual of Style (17th edition)