Secure and Privacy-Aware Machine Learning


2019, Doctor of Philosophy, Case Western Reserve University, EECS - Computer Engineering.
With the onset of the big data era, there is a pressing need for efficient and secure machine learning frameworks that can analyze large-scale data. This dissertation considers two machine learning paradigms: the centralized learning scenario, where we study the secure outsourcing problem in cloud computing, and the distributed learning scenario, where we explore blockchain techniques to remove the untrusted central server and thereby address its security problems.

In the centralized machine learning paradigm, inference with deep neural networks (DNNs) may be outsourced to the cloud because of its high computational cost, which, however, raises security concerns. In particular, the data involved in DNN inference can be highly sensitive, as in medical, financial, and commercial applications, and hence should be kept private. Besides, DNN models owned by research institutions or commercial companies are valuable intellectual property and can contain proprietary information, so they should be protected as well. Moreover, an untrusted cloud service provider may return inaccurate or even erroneous computing results. To address these issues, we propose a secure outsourcing framework for deep neural network inference called SecureNets, which preserves both a user's data privacy and his/her neural network model privacy, and also verifies the computation results returned by the cloud. Specifically, we employ a secure matrix transformation scheme in SecureNets to avoid privacy leakage of the data and the model. Meanwhile, we propose a verification method that can efficiently verify the correctness of cloud computing results. Our simulation results on four- and five-layer deep neural networks demonstrate that SecureNets can reduce the processing runtime by up to 64%. Compared with CryptoNets, one of the previous schemes, SecureNets increases the throughput by 104.45% while reducing the data transmission size by 69.78% per instance. We further improve the privacy level of SecureNets and implement it in a practical scenario.

The Internet of Things (IoT) emerges as a ubiquitous information collection and processing paradigm that can potentially exploit the collected massive data for various applications, such as smart health, smart transportation, and cyber-physical systems, by taking advantage of machine learning technologies. However, these data are usually unlabeled, while the labeling process is usually both time- and effort-consuming. Active learning is one approach to reducing the data labeling cost by sending only the most informative samples to experts for labeling. In this process, the two most computation-intensive operations, i.e., sample selection and learning model training, hinder the use of active learning on resource-limited IoT devices. To address this issue, we develop a secure outsourcing framework for deep active learning (SEDAL) by considering a general active learning framework with a deep neural network (DNN) learning model. The improved SecureNets is adopted for the model inference in the sample selection and DNN learning phases. Compared with traditional homomorphic-encryption-based secure outsourcing schemes, our scheme reduces the computational complexity at the user from O(n^3) to O(n^2). To evaluate the performance of the proposed system, we implement it on an Android phone and the Amazon AWS cloud for an arrhythmia diagnosis application.
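The abstract does not specify SEDAL's acquisition function; as a minimal, hypothetical sketch of the "most informative samples" selection step it describes, an entropy-based uncertainty criterion over DNN outputs could look like the following (NumPy; function names are illustrative and not taken from the dissertation):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax for a batch of logit vectors."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def select_most_informative(logits, k):
    """Return indices of the k unlabeled samples whose predictions
    have the highest entropy, i.e., the least confident ones."""
    probs = softmax(logits)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(entropy)[::-1][:k]

# Toy usage: 6 unlabeled samples, 4 classes; query the 2 most uncertain.
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 4))
print(select_most_informative(logits, k=2))
```

In SEDAL, the forward passes that produce such predictions are the expensive step that gets outsourced via the improved SecureNets rather than run on the IoT device itself.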
Experimental results show that the proposed scheme can obtain a well-trained classifier using fewer queried samples, and that the computation time and communication overhead are acceptable and practical.

Besides the centralized learning paradigm, in practice data can also be generated by multiple parties and stored in a geographically distributed manner, which spurs the study of distributed machine learning. Traditional master-worker distributed machine learning algorithms assume a trusted central server and focus on the privacy issue in linear learning models, while privacy in nonlinear learning models and security issues are not well studied. To address these issues, we explore blockchain techniques to build a decentralized, privacy-preserving, and secure machine learning system, called LearningChain, which considers a general (linear or nonlinear) learning model and does not require a trusted central server. Specifically, we design a decentralized Stochastic Gradient Descent (SGD) algorithm to learn a general predictive model over the blockchain. In decentralized SGD, we develop differential-privacy-based schemes to protect each party's data privacy, and propose an l-nearest aggregation algorithm to protect the system from potential Byzantine attacks. We also conduct theoretical analysis of the privacy and security of the proposed LearningChain. Finally, we implement LearningChain and demonstrate its efficiency and effectiveness through extensive experiments.
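As a minimal sketch (not LearningChain's actual protocol) of one round of decentralized SGD with differential-privacy noise on shared gradients and an l-nearest aggregation rule, assuming "l-nearest" means keeping the l received gradients closest to a party's own gradient:

```python
import numpy as np

def dp_noise(grad, scale, rng):
    """Add Laplace noise to a local gradient before sharing it
    (one common way to realize differential privacy; the mechanism
    and noise calibration in LearningChain may differ)."""
    return grad + rng.laplace(loc=0.0, scale=scale, size=grad.shape)

def l_nearest_aggregate(local_grad, received_grads, l):
    """Keep the l received gradients closest (Euclidean distance)
    to the party's own gradient and average them with it, so that
    far-off Byzantine gradients are discarded. This is only one
    plausible reading of the l-nearest aggregation rule."""
    dists = [np.linalg.norm(g - local_grad) for g in received_grads]
    nearest = [received_grads[i] for i in np.argsort(dists)[:l]]
    return np.mean([local_grad] + nearest, axis=0)

def sgd_step(w, agg_grad, lr=0.1):
    """Plain gradient-descent update with the aggregated gradient."""
    return w - lr * agg_grad

# Toy round: one local party, 4 peers (one Byzantine), l = 2.
rng = np.random.default_rng(1)
w = np.zeros(3)
local = dp_noise(np.array([0.9, -0.2, 0.4]), scale=0.05, rng=rng)
peers = [dp_noise(np.array([1.0, -0.3, 0.5]), scale=0.05, rng=rng),
         dp_noise(np.array([0.8, -0.1, 0.3]), scale=0.05, rng=rng),
         dp_noise(np.array([1.1, -0.2, 0.6]), scale=0.05, rng=rng),
         np.array([50.0, 50.0, 50.0])]  # Byzantine outlier
w = sgd_step(w, l_nearest_aggregate(local, peers, l=2))
print(w)
```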
Pan Li (Advisor)
Kenneth Loparo (Committee Member)
An Wang (Committee Member)
Ayday Erman (Committee Member)
112 p.

Recommended Citations


  • Chen, X. (2019). Secure and Privacy-Aware Machine Learning [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1563196765900275

    APA Style (7th edition)

  • Chen, Xuhui. Secure and Privacy-Aware Machine Learning. 2019. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1563196765900275.

    MLA Style (8th edition)

  • Chen, Xuhui. "Secure and Privacy-Aware Machine Learning." Doctoral dissertation, Case Western Reserve University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=case1563196765900275

    Chicago Manual of Style (17th edition)