Files
27936.pdf (1.18 MB)
Classification of Patterns in Streaming Data Using Clustering Signatures
Author Info
Awodokun, Olugbenga
ORCID® Identifier
http://orcid.org/0000-0002-2714-5221
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504880155623189
Abstract Details
Year and Degree
2017, MS, University of Cincinnati, Engineering and Applied Science: Electrical Engineering.
Abstract
Streaming datasets pose many challenges for machine learning algorithms, including insufficient storage and changes in the underlying distribution of the data across time intervals. This thesis proposes a method based on hierarchical clustering (unsupervised learning) that determines a signature for the data in each time window and builds a classifier from the match between the observed clusters and known clustering patterns. A dendrogram is created for each time window, and its clusters are compared with a global list of clusters. The global list is updated only when none of the existing global clusters can model the data points in a later time window; newly observed clusters are then added to the list, which is used to generate the signature for the data in a time window. In the testing phase, the global clusters are used to classify novel data chunks according to their Tanimoto similarities. Although the training samples were taken from only 20% of the KDD Cup 99 dataset, we validated the approach using test data drawn from different regions of the dataset at multiple intervals, and the classifier's performance was comparable to that of other methods that had used the entire dataset for training.
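The signature-and-match idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the hierarchical clustering step is replaced here by a simple distance-to-centroid match, and the names `window_signature`, `classify`, the `radius` threshold, and the toy 2-D points are all assumptions made for illustration. Only the Tanimoto similarity (intersection size over union size for sets) is standard.

```python
from math import dist


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of cluster ids."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty signatures count as identical
    return len(a & b) / len(union)


def window_signature(points, global_clusters, radius, update=True):
    """Return the ids of the global clusters that cover points in one window.

    Each global cluster is reduced to a single centroid (a simplification;
    the thesis derives clusters from dendrograms). When update is True, a
    point that no existing cluster covers starts a new global cluster, so
    the list grows only when existing clusters cannot model the data.
    """
    signature = set()
    for p in points:
        matched = [i for i, c in enumerate(global_clusters)
                   if dist(p, c) <= radius]
        if not matched and update:
            global_clusters.append(p)
            matched = [len(global_clusters) - 1]
        signature.update(matched)
    return signature


def classify(signature, labeled_signatures):
    """Pick the label whose known signature is most Tanimoto-similar."""
    return max(labeled_signatures,
               key=lambda label: tanimoto(signature, labeled_signatures[label]))


# Training: build the global cluster list and one signature per labeled window.
global_clusters = []
known = {
    "normal": window_signature([(0, 0), (0.1, 0.1), (5, 5)], global_clusters, 1.0),
    "attack": window_signature([(10, 10), (0, 0.2)], global_clusters, 1.0),
}

# Testing: the global list is frozen (update=False) and a new chunk is classified.
chunk = window_signature([(10.2, 9.9), (10, 10.1)], global_clusters, 1.0,
                         update=False)
predicted = classify(chunk, known)
```

In this toy run the test chunk only touches the cluster first seen in the "attack" window, so its signature overlaps the "attack" signature and not the "normal" one. Freezing the global list during testing is a design choice of the sketch, chosen so that test data cannot change the signatures learned in training.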
Committee
Raj Bhatnagar, Ph.D. (Committee Chair)
Gowtham Atluri (Committee Member)
Nan Niu, Ph.D. (Committee Member)
Pages
70 p.
Subject Headings
Computer Science
Keywords
data mining; hierarchical clustering; unsupervised learning; data analytics; machine learning; intrusion detection
Recommended Citations
Awodokun, O. (2017). Classification of Patterns in Streaming Data Using Clustering Signatures [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504880155623189
APA Style (7th edition)
Awodokun, Olugbenga. Classification of Patterns in Streaming Data Using Clustering Signatures. 2017. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504880155623189.
MLA Style (8th edition)
Awodokun, Olugbenga. "Classification of Patterns in Streaming Data Using Clustering Signatures." Master's thesis, University of Cincinnati, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504880155623189
Chicago Manual of Style (17th edition)
Document number:
ucin1504880155623189
Download Count:
423
Copyright Info
© 2017, some rights reserved.
Classification of Patterns in Streaming Data Using Clustering Signatures by Olugbenga Awodokun is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at etd.ohiolink.edu.
This open access ETD is published by University of Cincinnati and OhioLINK.