Files
thesis-sreeram-potluri.pdf (3.09 MB)
Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects
Author Info
Potluri, Sreeram
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1397797221
Year and Degree
2014, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Abstract
Accelerators (such as NVIDIA GPUs) and coprocessors (such as Intel MIC/Xeon Phi) are fueling the growth of next-generation ultra-scale systems that have high compute density and high performance per watt. However, these many-core architectures cause systems to be heterogeneous by introducing multiple levels of parallelism and varying computation/communication costs at each level. Application developers also use a hierarchy of programming models to extract maximum performance from these heterogeneous systems. Models such as CUDA, OpenCL, LEO, and others are used to express parallelism across accelerator or coprocessor cores, while higher-level programming models such as MPI or OpenSHMEM are used to express parallelism across a cluster. The presence of multiple programming models, their runtimes, and the varying communication performance at different levels of the system hierarchy has hindered applications from achieving peak performance on these systems. Modern interconnects such as InfiniBand enable asynchronous communication progress through RDMA, freeing up the cores to do useful computation. MPI and PGAS models offer one-sided communication primitives that extract maximum performance, minimize process synchronization overheads, and enable better overlap of computation and communication on high-performance networks. However, there is limited literature available to guide scientists in taking advantage of these one-sided communication semantics in high-end applications, especially on heterogeneous clusters. In our work, we present an enhanced model, MVAPICH2-GPU, that uses MPI for data movement from both CPU and GPU memories in a unified manner. We also extend the OpenSHMEM PGAS model to support such unified communication. These models considerably simplify data movement in MPI and OpenSHMEM applications running on GPU clusters.
We propose designs in MPI and OpenSHMEM runtimes to optimize data movement on GPU clusters, using state-of-the-art GPU technologies such as CUDA IPC and GPUDirect RDMA. Further, we introduce PRISM, a proxy-based multi-channel framework that enables an optimized MPI library for communication on clusters with Intel Xeon Phi coprocessors. We evaluate our designs using micro-benchmarks, application kernels, and end-applications. We present the redesign of a petascale seismic modeling code to demonstrate the use of one-sided semantics in end-applications and their impact on performance. Finally, we demonstrate the benefits of using one-sided semantics on heterogeneous clusters.
Committee
Dhabaleswar K. Panda (Advisor)
Ponnuswamy Sadayappan (Committee Member)
Radu Teodorescu (Committee Member)
Karen Tomko (Committee Member)
Pages
209 p.
Subject Headings
Computer Science
Keywords
Heterogeneous Clusters; GPU; MIC; Many-core Architectures; MPI; PGAS; One-sided; Communication Runtimes; InfiniBand; RDMA; Overlap; HPC Applications
Recommended Citations
APA Style (7th edition)
Potluri, S. (2014). Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1397797221
MLA Style (8th edition)
Potluri, Sreeram. Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects. 2014. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1397797221.
Chicago Manual of Style (17th edition)
Potluri, Sreeram. "Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects." Doctoral dissertation, Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1397797221
Document number:
osu1397797221
Download Count:
1,414
Copyright Info
© 2014, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.