Files
JUN_2015_MCS_Thesis Final Version 4-21 final format approved LW EH CN 4-24-15.pdf (1.33 MB)
Using Hadoop to Cluster Data in Energy System
Author Info
Hou, Jun
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=dayton1430092547
Year and Degree
2015, Master of Computer Science (M.C.S.), University of Dayton, Computer Science.
Abstract
Devices of all kinds now generate data sets so large that conventional machine learning algorithms running on a single computer can no longer process or analyze them, which poses a major challenge for data scientists. This thesis takes a distributed computing approach built on Apache Hadoop, a distributed data analysis framework that runs across multiple computers. The main components of this work include implementing the k-means machine learning algorithm on the Hadoop MapReduce framework, processing raw data from real energy systems, clustering the data using k-means in Hadoop, and improving seed selection for k-means. Finally, this thesis demonstrates the efficiency and effectiveness of our approach using different data sets.
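As an illustration of how k-means fits the map/reduce pattern described in the abstract, here is a minimal sketch in plain Python. It is not the thesis implementation and does not use Hadoop. The function names (`map_point`, `reduce_cluster`, `kmeans_iteration`, `kmeans`) are hypothetical, the shuffle is simulated with an in-memory dictionary, and the starting centroids are simply passed in by the caller, since the abstract does not describe the seed selection method.

```python
from collections import defaultdict
from math import dist


def map_point(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_cluster(points):
    """Reduce step: average all points that share one centroid index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One map/shuffle/reduce round, run locally (assumption: no cluster)."""
    groups = defaultdict(list)  # stands in for the shuffle-by-key phase
    for p in points:
        key, value = map_point(p, centroids)
        groups[key].append(value)
    # A centroid that received no points keeps its previous position.
    return [
        reduce_cluster(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iters=20):
    """Repeat rounds until the centroids stop changing or max_iters is hit."""
    for _ in range(max_iters):
        updated = kmeans_iteration(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

In a real MapReduce job, each round would typically be a separate job whose reducer output (the new centroids) is fed to the mappers of the next round. The loop in `kmeans` plays that driver role here.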
Committee
Zhongmei Yao (Committee Chair)
Mehdi Zargham (Committee Member)
Saverio Perugini (Committee Member)
Pages
51 p.
Subject Headings
Computer Science
Keywords
Hadoop; K-means; energy data; clustering analysis
Citations
Hou, J. (2015). Using Hadoop to Cluster Data in Energy System [Master's thesis, University of Dayton]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1430092547
APA Style (7th edition)
Hou, Jun. Using Hadoop to Cluster Data in Energy System. 2015. University of Dayton, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=dayton1430092547.
MLA Style (8th edition)
Hou, Jun. "Using Hadoop to Cluster Data in Energy System." Master's thesis, University of Dayton, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1430092547
Chicago Manual of Style (17th edition)
Document number: dayton1430092547
Download Count: 2,101
Copyright Info
© 2015, all rights reserved.
This open access ETD is published by University of Dayton and OhioLINK.