Optimizing locality and parallelism through program reorganization

Krishnamoorthy, Sriram

2008, Doctor of Philosophy, Ohio State University, Computer and Information Science.
Developing scalable application codes requires understanding and exploiting the locality and parallelism in the computation. This is typically achieved through optimizations by the programmer to match the application characteristics to the architectural features exposed by the parallel programming model. Partitioned address space programming models such as MPI impose a process-centric view of the parallel system, increasing the complexity of parallel programming. Typical global address space models provide a shared memory view that greatly simplifies programming, but the simplified models abstract away locality information, precluding optimized implementations. In this work, we present techniques to reorganize program execution to optimize locality and parallelism, with little effort from the programmer. For regular loop-based programs operating on dense multi-dimensional arrays, we propose an automatic parallelization technique that attempts to determine a parallel schedule in which all processes can start execution in parallel. When the concurrent tiled iteration space inhibits such execution, we present techniques to re-enable it. This is an alternative to incurring the pipelined startup overhead in schedules generated by prevalent approaches. For less structured programs, we propose a programming model that exposes multiple levels of abstraction to the programmer. These abstractions enable quick prototyping coupled with incremental optimizations. The data abstraction provides a global view of distributed data organized as blocks. A block is a subset of data stored contiguously in a single process's address space. The computation is specified as a collection of tasks operating on the data blocks, with parallelism and dependence being specified between them. When the blocking of the data does not match the required access pattern in the computation, the data needs to be reblocked to improve spatial locality. We develop efficient data layout transformation mechanisms for blocked multi-dimensional arrays. We also present mechanisms for automatic management of load balance, disk I/O, and inter-process communication on computations expressed as sets of independent tasks on blocked data stored on disk.
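To make the idea of reblocking concrete, the following is a minimal single-process sketch in Python. It is not the mechanism developed in the dissertation: the function names (`to_blocks`, `from_blocks`, `reblock`) and the dictionary-of-blocks representation are assumptions introduced here for illustration, and the naive approach of rebuilding the whole array ignores the distribution, communication, and disk I/O concerns the abstract describes. It shows only how the same data can be stored contiguously under two different blockings.

```python
def to_blocks(matrix, block_rows, block_cols):
    """Split a 2-D list into blocks; each block is one flat, contiguous list.

    Returns a dict mapping (block_row_index, block_col_index) -> flat list
    holding that block's elements in row-major order.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    blocks = {}
    for r0 in range(0, n_rows, block_rows):
        for c0 in range(0, n_cols, block_cols):
            blocks[(r0 // block_rows, c0 // block_cols)] = [
                matrix[i][j]
                for i in range(r0, min(r0 + block_rows, n_rows))
                for j in range(c0, min(c0 + block_cols, n_cols))
            ]
    return blocks


def from_blocks(blocks, n_rows, n_cols, block_rows, block_cols):
    """Reassemble the full 2-D list from a dict of flat blocks."""
    matrix = [[None] * n_cols for _ in range(n_rows)]
    for (bi, bj), data in blocks.items():
        r0, c0 = bi * block_rows, bj * block_cols
        width = min(block_cols, n_cols - c0)  # edge blocks may be narrower
        for k, value in enumerate(data):
            matrix[r0 + k // width][c0 + k % width] = value
    return matrix


def reblock(blocks, shape, old_block, new_block):
    """Naive reblocking (illustrative only): rebuild the array, then re-split it."""
    n_rows, n_cols = shape
    full = from_blocks(blocks, n_rows, n_cols, *old_block)
    return to_blocks(full, *new_block)


if __name__ == "__main__":
    # A 4x4 array stored as row panels (1x4 blocks), re-stored as column panels (4x1).
    matrix = [[4 * i + j for j in range(4)] for i in range(4)]
    row_panels = to_blocks(matrix, 1, 4)
    col_panels = reblock(row_panels, (4, 4), (1, 4), (4, 1))
    print(row_panels[(0, 0)])
    print(col_panels[(0, 0)])
```

In this sketch, a task that reads a whole column touches four separate blocks under the row-panel layout but exactly one block under the column-panel layout, which is the kind of mismatch between blocking and access pattern that motivates reblocking.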
P. Sadayappan (Advisor)

Recommended Citations

  • Krishnamoorthy, S. (2008). Optimizing locality and parallelism through program reorganization [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1197913392

    APA Style (7th edition)

  • Krishnamoorthy, Sriram. Optimizing locality and parallelism through program reorganization. 2008. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1197913392.

    MLA Style (8th edition)

  • Krishnamoorthy, Sriram. "Optimizing locality and parallelism through program reorganization." Doctoral dissertation, Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1197913392

    Chicago Manual of Style (17th edition)