Skip to Main Content
 

Global Search Box

 
 
 
 

Files

ETD Abstract Container

Abstract Header

Automatic Parallelization of Loops with Data Dependent Control Flow and Array Access Patterns

Ravishankar, Mahesh

Abstract Details

2014, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
With the era of increasing clock speeds coming to an end, parallel computing architectures have now become main-stream. Due to the wide range of architectures available today that can be used to exploit parallelism, ranging from multicore CPUs, to GPGPUs, to distributed memory machines; adapting applications for efficient execution on all these architectures poses a significant challenge. Scientific computing applications exhibit significant coarse-grained parallelism. Domain experts have exploited this to target distributed memory machines through the use of Message Passing Interface (MPI) libraries. Many such applications have been shown to scale to hundreds of thousands of processors. While powerful, programming in MPI is tedious and error prone, with significant portion of the parallelized application dedicated to managing communication and synchronization between the processes. Developing compiler techniques that can automatically generate parallel distributed memory versions of such codes is challenging due to the presence of data-dependent control flow and data access patterns. In this thesis we develop compiler algorithms for automatic parallelization of a class of computations common to many scientific applications. Under the inspector/executor paradigm, the generated code partitions and executes the computation in load-balanced manner, while reducing the communication costs. This approach is further enhanced by developing a framework capable of expressing both affine and non-affine parts of the code. This enables the use of polyhedral compilation tools to analyze parts of the computation which can be completely characterized statically. The effectiveness of the developed approach is demonstrated on several benchmarks and real-world applications. Image processing applications on the other hand exhibit significant fine-grained parallelism and are well suited for architectures with Single-Instruction Multiple Data (SIMD) processing units like the vector processing units on CPUs or special graphics processing units like the NVIDIA Kepler GPU kepler. Such architectures can be targeted either through hardware specific intrinsics (like SSE/AVX to target vector units), or through language support for writing device-specific code (like CUDA to target NVIDIA GPUs). State of the art compilers like ICC, Pluto, etc. can automatically implement architecture specific optimizations for programs written in low-level languages like C/C++. While effective, these approaches still require significant effort from the application developers. Programs written this way are less portable and difficult to maintain. Domain Specific Languages offer a convenient abstraction, allowing programmers to specify the computation at a high-level, while still allowing the compiler to use the semantics of operations to generate efficient code to target multiple architectures. In this thesis, we develop Forma, a DSL that provides a convenient syntax to specify many common image processing pipelines in a succinct manner. The compiler backend can generate code to target both multicore CPUs with SIMD units and NVIDIA GPUs, while making use of device specific features like texture units on GPUs. These ease of programming in Forma is demonstrated by porting complex image processing pipelines like Laplacian Pyramids and Exposure Fusion. The performance of the generated code is compared against a state of the art DSL for image processing, Halide.
P Sadayappan, Prof. (Advisor)
Atanas Rountev, Prof. (Committee Member)
Gagan Agrawal, Prof. (Committee Member)
197 p.

Recommended Citations

Citations

  • Ravishankar, M. (2014). Automatic Parallelization of Loops with Data Dependent Control Flow and Array Access Patterns [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1400085733

    APA Style (7th edition)

  • Ravishankar, Mahesh. Automatic Parallelization of Loops with Data Dependent Control Flow and Array Access Patterns. 2014. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1400085733.

    MLA Style (8th edition)

  • Ravishankar, Mahesh. "Automatic Parallelization of Loops with Data Dependent Control Flow and Array Access Patterns." Doctoral dissertation, Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1400085733

    Chicago Manual of Style (17th edition)