Files
dissertation.pdf (1.16 MB)
Performance Optimization of Stencil Computations on Modern SIMD Architectures
Author Info
Henretty, Thomas Steel
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=osu1408937226
Abstract Details
Year and Degree
2014, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Abstract
Performance of scientific computing codes on modern high-performance computing (HPC) systems has, in some cases, not achieved a significant percentage of the system’s peak performance. Three of the fundamental causes of this lack of efficiency are (1) less than optimal utilization of the short-vector SIMD units found in nearly all modern HPC systems, (2) less than optimal utilization of the memory hierarchy, and (3) less than optimal utilization of all computing cores available in a system. Codes that are able to overcome one or more of these limitations are generally very complex, and their implementation requires both an expert programmer and a substantial amount of time. In this work, a class of scientific computing codes known as stencil computations is examined and shown to exhibit a fundamental algorithmic limitation that interferes with the generation of optimal SIMD code. A data layout transformation (DLT) to overcome this limitation is described, and comprehensive results for cache-resident problem sizes are presented. It is shown that this DLT can significantly increase the performance of stencil computations on modern SIMD architectures.
While substantial performance gains can be realized using the DLT for small problem sizes, larger problem sizes require the application of spatial and temporal loop tiling techniques to relieve pressure on the memory subsystem and exploit all available multicore parallelism. Two closely related tiling techniques, nested and hybrid split tiling, are developed and shown to exhibit high performance across a variety of modern multicore SIMD architectures and stencil benchmarks.
Combining SIMD, memory hierarchy, and parallelism optimizations for stencil computations leads to code that is very complex and difficult for scientists and even seasoned programmers to implement. Further, these optimizations are difficult to integrate into a general-purpose compiler, as there is no existing framework for reliably identifying and representing stencil computations in a general-purpose language such as C. These problems are resolved with the creation of the Stencil Domain Specific Language (SDSL). This language uses data structures and concepts specific to stencil computations to enable the retention of fundamental information about the stencil throughout the compilation process. Preserving the details of a stencil computation enables the automated generation of complex, highly optimized code for multiple parallel vector architectures from a simple specification in SDSL.
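The idea of a layout transformation for stencils can be illustrated with a small sketch. It is not taken from the dissertation: it assumes a 1D 3-point averaging stencil, a transpose-style layout, and a hypothetical vector width `v`. All function names are invented for illustration, and plain Python loops stand in for SIMD instructions. In the natural layout, the inputs `a[i-1]`, `a[i]`, and `a[i+1]` sit at different offsets within a vector. In the transposed layout, the neighbours of every lane sit in the same lane of the adjacent vectors.

```python
def stencil_reference(a):
    """3-point averaging stencil in the natural layout; endpoints are copied."""
    b = list(a)
    for i in range(1, len(a) - 1):
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return b


def to_lifted(a, v):
    """View a as a v x m row-major matrix and store its transpose (m x v).

    Element a[r*m + c] moves to position c*v + r, so "vector" c holds
    a[c], a[m + c], a[2*m + c], ... in its v lanes.
    """
    assert len(a) % v == 0, "length must be a multiple of the vector width"
    m = len(a) // v
    return [a[r * m + c] for c in range(m) for r in range(v)]


def from_lifted(d, v):
    """Inverse of to_lifted: restore the natural element order."""
    m = len(d) // v
    return [d[c * v + r] for r in range(v) for c in range(m)]


def stencil_lifted(d, v):
    """Apply the same stencil directly on the transposed layout."""
    m = len(d) // v
    assert m >= 2, "sketch assumes at least two vectors"
    out = list(d)
    # Interior vectors: lane r of vector c needs lane r of vectors c-1 and
    # c+1, so the inner loop models one aligned, shuffle-free SIMD operation.
    for c in range(1, m - 1):
        for r in range(v):
            out[c * v + r] = (d[(c - 1) * v + r] + d[c * v + r]
                              + d[(c + 1) * v + r]) / 3.0
    # Boundary vectors: one neighbour lives in a different lane of the
    # vector at the opposite end, so these need cross-lane access.
    for r in range(v):
        if r > 0:  # vector 0: left neighbour is the last vector, lane r-1
            out[r] = (d[(m - 1) * v + (r - 1)] + d[r] + d[v + r]) / 3.0
        if r < v - 1:  # last vector: right neighbour is vector 0, lane r+1
            out[(m - 1) * v + r] = (d[(m - 2) * v + r] + d[(m - 1) * v + r]
                                    + d[r + 1]) / 3.0
    return out
```

For a 16-element input and `v = 4`, `from_lifted(stencil_lifted(to_lifted(a, 4), 4), 4)` gives the same values as `stencil_reference(a)`. Only the first and last vectors require cross-lane handling.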
Committee
P Sadayappan, PhD (Advisor)
Atanas Rountev, PhD (Committee Member)
Radu Teodorescu, PhD (Committee Member)
Pages
176 p.
Subject Headings
Computer Science
Keywords
stencil; SIMD; PDE; DLT; SDSL; domain specific language; stream alignment conflict; split tiling; nested split tiling; hybrid split tiling
Recommended Citations
APA Style (7th edition)
Henretty, T. S. (2014). Performance Optimization of Stencil Computations on Modern SIMD Architectures [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1408937226
MLA Style (8th edition)
Henretty, Thomas. Performance Optimization of Stencil Computations on Modern SIMD Architectures. 2014. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1408937226.
Chicago Manual of Style (17th edition)
Henretty, Thomas. "Performance Optimization of Stencil Computations on Modern SIMD Architectures." Doctoral dissertation, Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1408937226
Document number: osu1408937226
Download Count: 2,585
Copyright Info
© 2014, some rights reserved.
Performance Optimization of Stencil Computations on Modern SIMD Architectures by Thomas Steel Henretty is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Based on a work at etd.ohiolink.edu.
This open access ETD is published by The Ohio State University and OhioLINK.