Parallel Processing Systems for Data and Computation Efficiency with Applications to Graph Computing and Machine Learning

Abstract Details

2019, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Graph computing and machine learning play crucial roles in the Big Data era. Unfortunately, modern computer systems face increasing performance challenges when running these applications, which undermine the benefits of the available computational resources and emphasize the need for efficient software/hardware co-design. Furthermore, emerging applications such as smart homes, smart cities, and autonomous vehicles are becoming increasingly important, driving growing interest in deploying graph and machine learning workloads on a wide range of computer systems. This dissertation proposes software/hardware co-design solutions that improve data efficiency, computation efficiency, and concurrency for big data and machine learning applications. Because the appropriate design depends on the characteristics of both the application and the target computer system, dedicated designs are required to fully utilize the available computational potential. This dissertation therefore concentrates on efficiently executing graph and machine learning applications on different computer systems, ranging from a single machine and distributed server clusters to resource-constrained edge devices. To demonstrate the effectiveness of these techniques, the dissertation presents two graph processing systems with efficient data access and concurrency support, and then proposes adaptive parallel execution of neural networks on heterogeneous edge devices.

To improve data and computation efficiency for graph applications, this dissertation proposes organizing a graph as a set of edge-sets and achieving better data locality by consolidating sparse edge-sets through a multi-modal graph organization. The framework supports out-of-core execution by streaming edge-sets from disk and, on several large-scale linked datasets, exhibits up to a 10x performance improvement on a single machine as a result of these innovations. Another important contribution of this study is a comparison of vertex-centric and edge-centric engines on modern architectures; this work conducts thorough experiments and improves our understanding of the differences among these graph computation models.

To improve concurrency, this dissertation proposes C-Graph (Concurrent Graph), an edge-set based graph traversal framework with improved scalability for large-scale graphs in highly concurrent distributed environments. In contrast to most prior work, which focuses on accelerating a single graph processing task, we consider the industrial setting in which multiple graph processing tasks run concurrently, such as a group of queries issued simultaneously against the same graph. The framework achieves both high concurrency and high efficiency for concurrent k-hop reachability queries by maintaining global vertex states and exploiting shared access between edge-sets and among queries to facilitate graph traversals. We extend the framework to other types of applications through a simple message passing mechanism with synchronous and asynchronous communication. We experimentally show that the framework obtains 20x~70x speedups over baselines with up to 300+ concurrent queries.
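The abstract contains no code; the following is a minimal, hypothetical sketch of the edge-set idea described above: a graph stored as a collection of edge-sets and a level-synchronous, k-hop reachability traversal over them. The partitioning rule, function names, and per-query state shown here are illustrative assumptions, not the dissertation's actual implementation.

    # Hypothetical sketch: edge-set organization and k-hop reachability.
    # The partitioning scheme and all names are illustrative assumptions.
    from collections import defaultdict

    def build_edge_sets(edges, num_sets):
        """Partition edges into edge-sets by source vertex (one possible scheme)."""
        edge_sets = defaultdict(list)
        for src, dst in edges:
            edge_sets[src % num_sets].append((src, dst))
        return edge_sets

    def k_hop_reachable(edge_sets, source, k):
        """Level-synchronous BFS limited to k hops over the edge-set layout."""
        visited = {source}           # per-query vertex state
        frontier = {source}
        for _ in range(k):
            next_frontier = set()
            # Each edge-set is scanned once per level; concurrent queries could
            # share this scan, amortizing data access across queries.
            for edge_set in edge_sets.values():
                for src, dst in edge_set:
                    if src in frontier and dst not in visited:
                        visited.add(dst)
                        next_frontier.add(dst)
            frontier = next_frontier
            if not frontier:
                break
        return visited

    edges = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5)]
    edge_sets = build_edge_sets(edges, num_sets=2)
    print(k_hop_reachable(edge_sets, source=0, k=2))  # vertices within 2 hops of 0

In this toy form, the benefit of consolidating edges into sets is that a scan over one edge-set can serve many queries at once, which is the shared-access idea the paragraph above attributes to C-Graph.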
To utilize the available computational resources of edge devices in Internet of Things (IoT) environments, this study proposes a runtime-adaptive convolutional neural network (CNN) acceleration framework optimized for heterogeneous IoT environments. The framework leverages spatial partitioning through fusion of the convolution layers and dynamically selects the optimal degree of parallelism according to the availability of computational resources as well as network conditions. Our evaluation shows that the framework outperforms state-of-the-art approaches by improving inference speed and reducing communication cost while running on wirelessly connected Raspberry Pi 3 devices. Experiments show up to 1.9x~3.7x speedups using 8 devices for three popular CNN models.
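As a rough illustration of the spatial-partitioning idea only (not the framework's actual algorithm), the sketch below splits the output rows of a single convolution across hypothetical devices, giving each device just the input rows, plus halo, that its tile needs; the layer-fusion and runtime parallelism-selection steps described above are omitted.

    # Hypothetical sketch of spatial partitioning for distributed CNN inference.
    # The tiling scheme and device assignment are illustrative assumptions.
    import numpy as np

    def conv2d(x, w):
        """Plain 'valid' 2-D convolution (single channel), the per-device work."""
        kh, kw = w.shape
        oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
        return out

    def partitioned_conv2d(x, w, num_devices):
        """Split output rows across devices; each device gets only the input
        rows (plus halo) needed for its tile, and the tiles are concatenated."""
        kh, _ = w.shape
        out_h = x.shape[0] - kh + 1
        rows_per_dev = (out_h + num_devices - 1) // num_devices
        tiles = []
        for d in range(num_devices):
            start = d * rows_per_dev
            stop = min(start + rows_per_dev, out_h)
            if start >= stop:
                continue
            # Input region for this output tile includes a (kh - 1)-row halo.
            x_slice = x[start:stop + kh - 1, :]
            tiles.append(conv2d(x_slice, w))   # would run on device d
        return np.vstack(tiles)

    x = np.random.rand(16, 16)
    w = np.random.rand(3, 3)
    assert np.allclose(conv2d(x, w), partitioned_conv2d(x, w, num_devices=4))

The assertion checks that the partitioned result matches the single-device convolution; in a real deployment each tile's computation and the halo exchange would be mapped onto separate edge devices, and the number of tiles would be chosen at runtime from the available devices and network conditions.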
Radu Teodorescu (Advisor)
P. (Saday) Sadayappan (Committee Member)
Gagan Agrawal (Committee Member)
Feng Qin (Committee Member)
140 p.

Recommended Citations


  • Zhou, L. (2019). Parallel Processing Systems for Data and Computation Efficiency with Applications to Graph Computing and Machine Learning [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu156349344248694

    APA Style (7th edition)

  • Zhou, Li. Parallel Processing Systems for Data and Computation Efficiency with Applications to Graph Computing and Machine Learning. 2019. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu156349344248694.

    MLA Style (8th edition)

  • Zhou, Li. "Parallel Processing Systems for Data and Computation Efficiency with Applications to Graph Computing and Machine Learning." Doctoral dissertation, Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu156349344248694

    Chicago Manual of Style (17th edition)