Automatic Transformation and Optimization of Applications on GPUs and GPU clusters

2011, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.

Modern accelerators and multi-core architectures offer significant computing power at a very modest cost. With this trend, an important research issue at the software end is how to make the best use of these computing devices, and how to enable high performance without requiring users to put much effort into learning the architecture and the programming model.

Our goal is to address this problem by developing automatic code generation systems, particularly for GPUs and GPU clusters. We believe that by focusing on specific application classes, the task of automatic code generation can be significantly simplified. Thus, we built code generation and optimization systems for two classes of applications: data-intensive applications with generalized reductions, and tensor contraction functions. First, we focused on a class of data-intensive applications whose processing structure is that of generalized reductions.

In the code generation systems we have built, the user input is an algorithm written in a high-level language, specifically C or MATLAB. Program analysis and code generation are performed to generate code for a single GPU or a GPU cluster. The three specific systems we have built are GREENRIDE, a code generation system that provides GPU support for C programs; GMAT-DM, which translates MATLAB code into GPU-executable programs; and AUTO-GC, which provides GPU support for clusters by incorporating code generation for FREERIDE, a middleware supporting parallel execution for data mining.
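
As an illustration of the generalized reduction structure mentioned above, the following is a minimal sketch in C, not code from GREENRIDE, GMAT-DM, AUTO-GC, or FREERIDE. It assumes a simple histogram-style reduction: each element is processed independently and updates a small reduction object with an associative and commutative operation, so chunks can be reduced privately and combined afterwards. The names `ReductionObject`, `process_element`, `reduce_chunk`, `combine`, and `generalized_reduction`, and the constant `NUM_BINS`, are hypothetical, and the workers are simulated sequentially.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BINS 4  /* hypothetical size of the reduction object */

/* Reduction object: one accumulator per bin (illustrative only). */
typedef struct {
    long count[NUM_BINS];
} ReductionObject;

/* Process one element: choose a bin and update it. The update (+=) is
   associative and commutative, so elements may be processed in any order. */
static void process_element(ReductionObject *r, int value) {
    int bin = ((value % NUM_BINS) + NUM_BINS) % NUM_BINS;
    r->count[bin] += 1;
}

/* Local reduction: one worker reduces a contiguous chunk of the data
   into its own private reduction object. */
static void reduce_chunk(ReductionObject *r, const int *data, size_t n) {
    for (size_t i = 0; i < n; i++) {
        process_element(r, data[i]);
    }
}

/* Global combination: merge one private reduction object into another. */
static void combine(ReductionObject *dst, const ReductionObject *src) {
    for (int b = 0; b < NUM_BINS; b++) {
        dst->count[b] += src->count[b];
    }
}

/* Splits the input into num_chunks pieces, reduces each piece privately,
   and combines the results. The chunks are handled one after another here;
   a parallel version would assign them to threads or thread blocks. */
ReductionObject generalized_reduction(const int *data, size_t n,
                                      size_t num_chunks) {
    assert(num_chunks > 0);
    ReductionObject result = {{0}};
    size_t chunk = (n + num_chunks - 1) / num_chunks;  /* ceiling division */
    for (size_t c = 0; c < num_chunks; c++) {
        size_t begin = c * chunk;
        if (begin >= n) {
            break;
        }
        size_t len = (n - begin < chunk) ? (n - begin) : chunk;
        ReductionObject local = {{0}};
        reduce_chunk(&local, data + begin, len);
        combine(&result, &local);
    }
    return result;
}
```

Because the update is order-independent, the result does not depend on how many chunks are used, which is the property that makes this class of computation amenable to automatic parallelization.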

For tensor contractions, we evaluated the automatically generated code on different GPUs and investigated algorithm optimizations for each card. This led to an auto-tuning framework that selects algorithms and parameters according to a cost model and thresholds extracted from simple micro-benchmarks. We also developed a loop transformation system for multi-level memory hierarchies. By focusing on the dominating factors of the computation, we were able to remove a large portion of the extra data movement between levels of the memory hierarchy.
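
For readers unfamiliar with the term, a tensor contraction sums products of two tensors over one or more shared indices. The sketch below is an illustrative, unoptimized C loop nest for one assumed contraction, C[a][b][c] = sum over d of A[a][d][b] * B[d][c], with row-major flat arrays. It is not the generated code or the loop transformation described in the dissertation; the function name `contract` and the index layout are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Computes C[a][b][c] = sum over d of A[a][d][b] * B[d][c].
   Assumed row-major layouts:
     A has shape na x nd x nb, B has shape nd x nc, C has shape na x nb x nc. */
void contract(const double *A, const double *B, double *C,
              size_t na, size_t nb, size_t nc, size_t nd) {
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t a = 0; a < na; a++) {
        for (size_t b = 0; b < nb; b++) {
            for (size_t c = 0; c < nc; c++) {
                double sum = 0.0;
                /* d is the contracted index: it appears in A and B only. */
                for (size_t d = 0; d < nd; d++) {
                    sum += A[(a * nd + d) * nb + b] * B[d * nc + c];
                }
                C[(a * nb + b) * nc + c] = sum;
            }
        }
    }
}
```

In a loop nest like this, the loop order and any tiling determine how often elements of A and B are re-read from slower memory, which is the kind of data movement that loop transformations aim to reduce.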

In the future, we plan to extend our work in the following directions. The code generation system for data-intensive applications with reduction patterns could be applied to, and optimized for, other classes of applications. The integer programming model could also be used for other architectures, including future accelerators. We would like to consider heterogeneous systems for the loop transformation approach. The auto-tuning framework will be extended to include more parameters, enabling greater performance gains.

Gagan Agrawal (Advisor)
Qin Feng (Committee Member)
Atanas Rountev (Committee Member)
Steven Bybic (Committee Member)
197 p.

Recommended Citations

  • Ma, W. (2011). Automatic Transformation and Optimization of Applications on GPUs and GPU clusters [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1300972089

    APA Style (7th edition)

  • Ma, Wenjing. Automatic Transformation and Optimization of Applications on GPUs and GPU clusters. 2011. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1300972089.

    MLA Style (8th edition)

  • Ma, Wenjing. "Automatic Transformation and Optimization of Applications on GPUs and GPU clusters." Doctoral dissertation, Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1300972089

    Chicago Manual of Style (17th edition)