Compile-time and Run-time Optimizations for Enhancing Locality and Parallelism on Multi-core and Many-core Systems

Baskaran, Muthu Manikandan

2009, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.

Current trends in computer architecture show the emergence of multiple processor cores on a chip. Modern multiple-core architectures, which include general-purpose multi-core architectures (from Intel, AMD, IBM, and Sun) and specialized parallel architectures such as the Cell Broadband Engine and Graphics Processing Units (GPUs), have very high computation power per chip. A significant challenge in these systems is the effective, load-balanced utilization of the processor cores. The memory subsystem has always been a performance bottleneck in computer systems, and it is even more so with the emergence of processor subsystems with multiple on-chip cores. Effectively managing the on-chip and off-chip memories and enhancing data reuse to maximize memory performance is another significant challenge in modern multiple-core architectures.

Our work addresses these challenges in multi-core and many-core systems through various compile-time and run-time optimization techniques. We provide effective automatic compiler support for managing on-chip and off-chip memory accesses, with the compiler deciding which elements to move into and out of on-chip memory, when and how to move them, and how to efficiently access the elements brought into on-chip memory. We develop an effective tiling approach for mapping computation in regular programs onto many-core systems like GPUs. We develop an automatic approach for compiler-assisted dynamic scheduling of computation to enhance load balancing for parallel tiled execution on multi-core systems.
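
As a rough illustration of the kind of transformation described above, the following hand-written sketch tiles a matrix multiplication and stages each tile into a small local buffer before computing on it, loosely mimicking an explicit move into on-chip memory. It is not output from the dissertation's compiler framework; the function name `tiled_matmul`, the `tile` parameter, and the buffer names are hypothetical and chosen only for this example.

```python
def tiled_matmul(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists), one tile at a time.

    Illustrative assumption: a_buf and b_buf stand in for on-chip
    (scratchpad) memory; a real system would use hardware-specific storage.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # Clamp tile bounds so sizes that do not divide n still work.
                i_end = min(ii + tile, n)
                j_end = min(jj + tile, n)
                k_end = min(kk + tile, n)
                # "Move in": copy the tiles of a and b into local buffers.
                a_buf = [row[kk:k_end] for row in a[ii:i_end]]
                b_buf = [row[jj:j_end] for row in b[kk:k_end]]
                # Compute using only the staged copies, so each copied
                # element is reused across the inner loops.
                for i, a_row in enumerate(a_buf):
                    c_row = c[ii + i]
                    for k, a_ik in enumerate(a_row):
                        b_row = b_buf[k]
                        for j, b_kj in enumerate(b_row):
                            c_row[jj + j] += a_ik * b_kj
    return c
```

The tile size controls how much data must fit in the local buffers at once; in practice it would be chosen from the capacity of the target's on-chip memory, which this sketch does not model.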

Several issues specific to the target architecture need attention to maximize application performance. First, the levels of parallelism available and the appropriate granularity of parallelism for the target architecture have to be considered when mapping the computation. Second, the memory access model may be inherent to the architecture, so optimizations have to be developed for that specific model. We develop compile-time transformation approaches to address GPU-specific performance factors related to parallelism and data locality, and develop an end-to-end compiler framework for GPUs.

Sadayappan Ponnuswamy, Dr. (Advisor)
Dhabaleswar Panda, Dr. (Committee Member)
Atanas Rountev, Dr. (Committee Member)
Jaganathan Ramanujam, Dr. (Committee Member)
145 p.

Recommended Citations

  • Baskaran, M. M. (2009). Compile-time and Run-time Optimizations for Enhancing Locality and Parallelism on Multi-core and Many-core Systems [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1253557044

    APA Style (7th edition)

  • Baskaran, Muthu Manikandan. Compile-time and Run-time Optimizations for Enhancing Locality and Parallelism on Multi-core and Many-core Systems. 2009. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1253557044.

    MLA Style (8th edition)

  • Baskaran, Muthu Manikandan. "Compile-time and Run-time Optimizations for Enhancing Locality and Parallelism on Multi-core and Many-core Systems." Doctoral dissertation, Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1253557044

    Chicago Manual of Style (17th edition)