Files
Thesis - Ehsan Aghaei - Final version.pdf (4.54 MB)
Machine Learning for Host-based Misuse and Anomaly Detection in UNIX Environment
Author Info
Aghaei, Ehsan
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=toledo1493255965690437
Year and Degree
2017, Master of Science, University of Toledo, Engineering (Computer Science).
Abstract
This thesis presents three independent studies of intrusion detection systems that apply different pre-processing techniques and classifiers to the ADFA-LD dataset. ADFA-LD contains thousands of system call traces collected under seven conditions in a UNIX environment: normal operation and six types of attack.
The first study presents the development and application of a frequency-based misuse intrusion detection system built on ensemble classification. The raw ADFA-LD system call traces are pre-processed with N-gram feature extraction to generate fixed-size patterns for training and testing, whose attributes are N-grams for N ranging from 1 to 10. To generate the signature of each class and to reduce dimensionality, we filtered the features in two steps: first selecting the most frequent unique attributes, then picking the most frequent features regardless of uniqueness. The five-random-neighbor SMOTE algorithm is used to balance the classes in terms of pattern counts. The classifier is a majority-voting ensemble whose base classifiers are naive Bayes, support vector machine, PART, decision tree, and random forest, as implemented in the Weka machine-learning framework. The proposed misuse detection system demonstrated very high performance in detecting attacks.
In the second study, the misuse detection system extracts features from the ADFA-LD system call traces using principal component analysis (PCA). Pre-processing the traces with PCA generates fixed-size patterns, named Eigentraces, for both training and testing. Eigentraces serve as templates for known normal and attack class traces. The resulting feature vectors are classified with the k-nearest-neighbor algorithm. A simulation study was conducted to evaluate the performance of the proposed system. The system demonstrated very high performance in detecting attacks and in predicting the attack type among the six attack classes, and as such appears very promising.
In the third study, we modeled a host-based anomaly detection system within the framework of one-class classification using the ADFA-LD dataset. Pre-processing and feature extraction apply windowing to the system call trace data, followed by the PCA-based Eigentraces technique. The probability function of the target (normal) class is modeled by two separate machine learners: a Radial Basis Function neural network and a Random Forest. The normal class density function is estimated using Bayes' theorem. A simulation study showed that the proposed intrusion detection system detects both anomalies and normal activities with high accuracy.
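As a rough illustration of the N-gram pre-processing and majority voting described for the first study, the following Python sketch counts N-grams in a system call trace, builds a fixed-size vocabulary from the most frequent N-grams per class, and combines base-classifier labels by majority vote. It is a minimal sketch under stated assumptions, not the thesis implementation: the function names (`extract_ngrams`, `select_vocabulary`, `to_feature_vector`, `majority_vote`), the `top_k` parameter, and the integer encoding of system calls are all hypothetical, the filter shown is a simplified single "most frequent" step rather than the two-step filter in the abstract, and SMOTE balancing and the Weka base classifiers are not reproduced.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple

NGram = Tuple[int, ...]


def extract_ngrams(trace: Sequence[int], max_n: int = 10) -> Counter:
    """Count every contiguous N-gram of the trace for N = 1..max_n."""
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(trace) - n + 1):
            counts[tuple(trace[i:i + n])] += 1
    return counts


def select_vocabulary(
    traces_by_class: Dict[str, List[Sequence[int]]],
    top_k: int,
    max_n: int = 10,
) -> List[NGram]:
    """Keep the top_k most frequent N-grams of each class (simplified filter).

    Assumption: this only approximates the abstract's two-step filtering.
    Ties are broken by the N-gram itself so the result is deterministic.
    """
    vocab: List[NGram] = []
    seen = set()
    for label in sorted(traces_by_class):
        pooled: Counter = Counter()
        for trace in traces_by_class[label]:
            pooled.update(extract_ngrams(trace, max_n))
        ranked = sorted(pooled.items(), key=lambda kv: (-kv[1], kv[0]))
        for gram, _ in ranked[:top_k]:
            if gram not in seen:
                seen.add(gram)
                vocab.append(gram)
    return vocab


def to_feature_vector(
    trace: Sequence[int], vocab: List[NGram], max_n: int = 10
) -> List[int]:
    """Map a variable-length trace to a fixed-size vector of N-gram counts."""
    counts = extract_ngrams(trace, max_n)
    return [counts[gram] for gram in vocab]


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]
```

A trace of any length maps to a vector whose size equals the vocabulary size, which is what lets variable-length traces be fed to standard classifiers. In this sketch, `majority_vote` would receive one predicted label per base classifier for a single test pattern.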
Committee
Gursel Serpen (Committee Chair)
Henry Ledgard (Committee Member)
Ahmad Y. Javaid (Committee Member)
Pages
116 p.
Subject Headings
Computer Engineering; Computer Science
Recommended Citations
Aghaei, E. (2017). Machine Learning for Host-based Misuse and Anomaly Detection in UNIX Environment [Master's thesis, University of Toledo]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1493255965690437
APA Style (7th edition)
Aghaei, Ehsan. Machine Learning for Host-based Misuse and Anomaly Detection in UNIX Environment. 2017. University of Toledo, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=toledo1493255965690437.
MLA Style (8th edition)
Aghaei, Ehsan. "Machine Learning for Host-based Misuse and Anomaly Detection in UNIX Environment." Master's thesis, University of Toledo, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1493255965690437
Chicago Manual of Style (17th edition)
Document number: toledo1493255965690437
Download Count: 4,586
Copyright Info
© 2017, some rights reserved.
Machine Learning for Host-based Misuse and Anomaly Detection in UNIX Environment by Ehsan Aghaei is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Based on a work at etd.ohiolink.edu.
This open access ETD is published by University of Toledo and OhioLINK.