Implementation and Performance Analysis of Many-body Quantum Chemical Methods on the Intel Xeon Phi Coprocessor and NVIDIA GPU Accelerator

2016, Master of Science, Ohio State University, Computer Science and Engineering.
CCSD(T), a coupled cluster (CC) method, is one of the most accurate methods in computational chemistry that remains applicable to reasonably large molecules. An efficient parallel CCSD(T) implementation would therefore have a significant impact on the application of high-accuracy methods. The Intel Xeon Phi coprocessor and the NVIDIA GPU are among the most important coprocessors/accelerators, offering powerful parallel computing capability through their massively parallel many-core architectures. In this work, CCSD(T) code is implemented on both the Intel Xeon Phi coprocessor and the NVIDIA GPU. The CCSD(T) method performs a sequence of tensor contractions. For efficiency, we allocate the result tensor only on the Xeon Phi or GPU and keep it there to receive the results of the successive tensor contractions performed on the device. The input tensors are offloaded from the host to the coprocessor/accelerator for each tensor contraction. After all the tensor contractions are finished, the final result is accumulated on the coprocessor/accelerator, which avoids a large data transfer from the coprocessor/accelerator back to the host. The tensor contractions are performed using BLAS DGEMM on the coprocessor/accelerator, and the result is then post-processed in a six-dimensional loop. For the Intel Xeon Phi implementation, OpenMP is used to bind threads to physical processing units on the coprocessor, and the OpenMP thread affinity is tuned to obtain the best performance. For the GPU, an algorithm is designed to map the six-dimensional post-processing loop to CUDA threads, and gridDim and blockDim are tuned for best performance. Overall speedups of 4x and 9x to 13x are obtained for the Intel Xeon Phi and GPU implementations, respectively.
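
The abstract does not spell out how the six-dimensional post-processing loop is mapped to CUDA threads. One common approach, shown here only as an illustrative sketch and not as the thesis's algorithm, is to flatten the six loop indices into a single linear thread id and recover them with repeated division. The Python below simulates that index arithmetic on the host. The function names, the loop extents, and the one-dimensional grid/block layout are all assumptions made for illustration.

```python
from itertools import product


def unflatten(tid, dims):
    """Decompose a flat thread id into one index per loop (last index fastest)."""
    idx = []
    for extent in reversed(dims):
        tid, r = divmod(tid, extent)
        idx.append(r)
    return tuple(reversed(idx))


def grid_size(total, block_dim):
    """Number of blocks needed to cover `total` iterations (ceiling division)."""
    return (total + block_dim - 1) // block_dim


def simulate_launch(dims, block_dim):
    """Emulate a 1-D kernel launch and list the loop indices each thread handles."""
    total = 1
    for extent in dims:
        total *= extent
    visited = []
    for block in range(grid_size(total, block_dim)):   # plays the role of blockIdx.x
        for thread in range(block_dim):                # plays the role of threadIdx.x
            tid = block * block_dim + thread
            if tid >= total:                           # guard for the partial last block
                continue
            visited.append(unflatten(tid, dims))
    return visited


# Hypothetical extents for the six loop indices (not taken from the thesis).
dims = (2, 3, 2, 2, 3, 2)
indices = simulate_launch(dims, block_dim=32)
```

With this layout every iteration of the nested loop is visited exactly once, and the block size can be varied independently of the loop extents, which is the kind of freedom that tuning gridDim and blockDim relies on.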
P. Sadayappan (Advisor)
Louis-Noel Pouchet (Committee Member)
54 p.

Recommended Citations

  • Shi, B. (2016). Implementation and Performance Analysis of Many-body Quantum Chemical Methods on the Intel Xeon Phi Coprocessor and NVIDIA GPU Accelerator [Master's thesis, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1462793739

    APA Style (7th edition)

  • Shi, Bobo. Implementation and Performance Analysis of Many-body Quantum Chemical Methods on the Intel Xeon Phi Coprocessor and NVIDIA GPU Accelerator. 2016. Ohio State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1462793739.

    MLA Style (8th edition)

  • Shi, Bobo. "Implementation and Performance Analysis of Many-body Quantum Chemical Methods on the Intel Xeon Phi Coprocessor and NVIDIA GPU Accelerator." Master's thesis, Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1462793739

    Chicago Manual of Style (17th edition)