Optimization of Stencil Computations on GPUs

2018, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Stencil computations form the compute-intensive core of many scientific application domains, such as image processing for CT and MRI, computational electromagnetics, seismic processing, and climate modeling. A stencil computation involves an element-wise update of an output domain based on a fixed set of neighboring points from the input domain. Such stencil computations are either time-iterated, or require successive application of multiple stencil operators on the input domains. Stencil optimization on multi- and many-core architectures has been an active research topic for the past two decades. Stencil computations traditionally have low arithmetic intensity, with only a few floating-point operations performed relative to the data transferred per output point, and are therefore memory bandwidth-bound. Since the data movement cost consistently dominates the computational cost in modern architectures, most of these research efforts focus on reducing the data movement in stencils to tackle the bandwidth bottleneck. Consequently, several tiling techniques have been proposed over the years to exploit spatial and temporal reuse across a sequence of stencils, or across multiple time steps for time-iterated stencils. With the ever-increasing use of GPUs for general-purpose computing, application developers have started exploring the acceleration of data-parallel stencils on GPUs. GPUs have lower data movement costs than multi-core CPU architectures, and hence are an attractive target for accelerating memory bandwidth-bound stencil computations. At the same time, GPUs are compute-intensive with a significantly higher number of registers per thread, and are therefore suitable for accelerating stencil computations with high arithmetic intensity as well. The arithmetic intensity of a stencil is proportional to its order, which is the number of input elements read from the center along each dimension.
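To make the terms above concrete, the following is a minimal sketch of a time-iterated, symmetric 1D stencil of order k, written in plain Python for readability. It is an illustration only and is not taken from the dissertation or from STENCILGEN; the function names, the weight layout, and the choice to copy boundary points unchanged are all assumptions.

```python
def apply_stencil_1d(inp, weights):
    """Apply a symmetric 1D stencil of order k = len(weights) - 1.

    weights[0] scales the center point; weights[d] scales both
    neighbors at distance d. Points within k of either end are
    copied unchanged (an assumed boundary treatment).
    """
    k = len(weights) - 1
    n = len(inp)
    out = list(inp)
    for i in range(k, n - k):
        acc = weights[0] * inp[i]
        for d in range(1, k + 1):
            # Each output point reads k neighbors on each side of the center.
            acc += weights[d] * (inp[i - d] + inp[i + d])
        out[i] = acc
    return out


def time_iterate(inp, weights, steps):
    """Time-iterated stencil: the output of one step feeds the next."""
    cur = list(inp)
    for _ in range(steps):
        cur = apply_stencil_1d(cur, weights)
    return cur


def flops_per_point(k):
    """Floating-point operations per output point in apply_stencil_1d:
    one multiply for the center, then for each distance d one add
    (neighbor pair), one multiply, and one add (accumulate)."""
    return 1 + 3 * k
```

In this sketch the operation count per output point grows linearly with the order k, while a naive implementation still reads one input array and writes one output array per step, which is one way to see why higher-order stencils have higher arithmetic intensity.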
In many scientific applications, high-order stencils provide better computational accuracy with less data movement than their low-order counterparts. However, the main performance bottleneck for high-order stencils on GPUs is high register pressure, which causes excessive register spills or a steep drop in achieved parallelism, resulting in performance loss. This dissertation proposes novel GPU-centric optimization strategies that address the performance bottlenecks for stencils with different arithmetic intensities: tiling and fusion heuristics for bandwidth-bound stencils with low arithmetic intensity, and register optimizations for high-order stencils with high arithmetic intensity. The proposed optimizations have been implemented in a DSL-based stencil optimization framework, STENCILGEN, that can automatically generate high-performance CUDA code from an input DSL specification of the stencil computation. The efficacy of the proposed optimizations is demonstrated via empirical evaluation on a variety of 2D and 3D stencil kernels extracted from PDE solvers, image processing pipelines, and proxy DOE applications.
P. Sadayappan, Dr. (Advisor)
Atanas Rountev, Dr. (Committee Member)
Gagan Agrawal, Dr. (Committee Member)
201 p.

Recommended Citations

  • Rawat, P. S. (2018). Optimization of Stencil Computations on GPUs [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1523037713249436

    APA Style (7th edition)

  • Rawat, Prashant. Optimization of Stencil Computations on GPUs. 2018. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1523037713249436.

    MLA Style (8th edition)

  • Rawat, Prashant. "Optimization of Stencil Computations on GPUs." Doctoral dissertation, Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1523037713249436

    Chicago Manual of Style (17th edition)