RDMA-based Plugin Design and Profiler for Apache and Enterprise Hadoop Distributed File system

Bhat, Adithya

2015, Master of Science, Ohio State University, Computer Science and Engineering.
International Data Corporation states that the amount of global digital data is doubling every year. This trend necessitates an increase in the computational power required to process the data and extract meaningful information. To address this need, HPC clusters equipped with high-speed InfiniBand interconnects are being deployed, and as a result many Big Data applications are now run on HPC clusters. The RDMA capabilities of InfiniBand have been shown to improve the performance of Big Data applications such as Hadoop, HBase, and Spark on HPC systems. Since Hadoop is open-source software, many vendors offer their own distributions with their own optimizations or added functionality. This restricts easy portability of any enhancement across Hadoop distributions. In this thesis, we present an RDMA-based plugin design for Apache and Enterprise Hadoop distributions. We take an existing RDMA-enhanced design for HDFS write that is tightly integrated into Apache Hadoop and propose a new RDMA-based plugin design. The proposed design uses the parameters provided by Hadoop to load client and server RDMA modules. The plugin is applicable to the Apache, Hortonworks, and Cloudera Hadoop distributions. We also develop an HDFS profiler using Java instrumentation. The profiler identifies performance benefits and bottlenecks in HDFS, works across Hadoop distributions, and does not require any modification of the Hadoop source code. In our experimental evaluation, the plugin delivers the performance expected of the RDMA-enhanced design, an improvement of up to 3.7x in TestDFSIO write, on all of these distributions. We also demonstrate that our RDMA-based plugin can achieve up to 4.6x improvement over the Mellanox R4H (RDMA for HDFS) plugin. Using the HDFS profiler, we show that RDMA-enhanced HDFS designs improve the performance of Hadoop applications that perform frequent HDFS write operations.
Dhabaleswar Panda (Advisor)
Feng Qin (Committee Member)
65 p.
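
The abstract describes loading client and server RDMA modules through parameters that Hadoop already provides. The sketch below illustrates only the general idea of a configuration-selected plugin in plain Java, and it is not the thesis's implementation. Every name in it (PluginLoaderSketch, BlockWriter, SocketBlockWriter, RdmaBlockWriter, and the key example.hdfs.writer.class) is hypothetical, java.util.Properties stands in for Hadoop's configuration, and no real RDMA or HDFS call is made.

```java
import java.util.Properties;

public class PluginLoaderSketch {

    /** Hypothetical interface standing in for an HDFS write path. */
    public interface BlockWriter {
        String transport();
    }

    /** Illustrative default implementation (socket-based path). */
    public static class SocketBlockWriter implements BlockWriter {
        public String transport() { return "socket"; }
    }

    /** Stand-in for an RDMA-backed module; it performs no real RDMA. */
    public static class RdmaBlockWriter implements BlockWriter {
        public String transport() { return "rdma"; }
    }

    /** Hypothetical configuration key; not a real Hadoop parameter. */
    static final String WRITER_KEY = "example.hdfs.writer.class";

    /**
     * Instantiates the writer class named in the configuration and falls
     * back to the socket writer when the key is absent.
     */
    static BlockWriter loadWriter(Properties conf) throws ReflectiveOperationException {
        String className = conf.getProperty(WRITER_KEY, SocketBlockWriter.class.getName());
        Class<? extends BlockWriter> cls =
                Class.forName(className).asSubclass(BlockWriter.class);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        // No key set: the default implementation is used.
        System.out.println(loadWriter(conf).transport());

        // Naming another class in the configuration swaps the module
        // without changing the calling code.
        conf.setProperty(WRITER_KEY, RdmaBlockWriter.class.getName());
        System.out.println(loadWriter(conf).transport());
    }
}
```

Because the implementation is chosen by class name at run time, the calling code stays the same whichever module is configured, which is the property that lets a plugin of this kind be carried across distributions.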

Recommended Citations

  • Bhat, A. (2015). RDMA-based Plugin Design and Profiler for Apache and Enterprise Hadoop Distributed File system [Master's thesis, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1440188090

    APA Style (7th edition)

  • Bhat, Adithya. RDMA-based Plugin Design and Profiler for Apache and Enterprise Hadoop Distributed File system. 2015. Ohio State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1440188090.

    MLA Style (8th edition)

  • Bhat, Adithya. "RDMA-based Plugin Design and Profiler for Apache and Enterprise Hadoop Distributed File system." Master's thesis, Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1440188090

    Chicago Manual of Style (17th edition)