This thesis investigates the performance of four classifiers on a common real-world data set. Current classification models are reviewed along with their advantages and limitations. Four approaches to classifier design were implemented: a fuzzy set-based approach, a neural network, a support vector machine, and a minimum distance classifier.
The data set used to compare these classifiers consists of 4,232 samples. Its characteristics, such as high variability among samples within the same category and overlap between categories, pose serious challenges for classifier design.
Several criteria are considered as a basis for evaluating classifier performance, including generalization power, the learning curve, and ROC points. For classifier comparison, measures of diversity such as Yule's Q statistic and the correlation coefficient are used. The evaluation results are presented and analyzed in light of the characteristics of the data set.
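
As an illustration of the two diversity measures named above, the following minimal sketch computes the pairwise Yule's Q statistic and the correlation coefficient from the per-sample correctness of two classifiers. It uses the standard pairwise definitions based on the counts of samples on which both classifiers are correct (n11), only the first is correct (n10), only the second is correct (n01), and both are wrong (n00). The function names and the toy correctness vectors are assumptions made for illustration; they are not taken from the thesis, and the thesis's own implementation may differ.

```python
import math


def pairwise_counts(correct_a, correct_b):
    """Count agreement patterns between two classifiers.

    correct_a, correct_b: equal-length sequences of truthy/falsy values,
    where element i says whether that classifier labeled sample i correctly.
    Returns (n11, n10, n01, n00).
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("correctness vectors must have the same length")
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1  # both correct
        elif a:
            n10 += 1  # only the first classifier correct
        elif b:
            n01 += 1  # only the second classifier correct
        else:
            n00 += 1  # both wrong
    return n11, n10, n01, n00


def yule_q(correct_a, correct_b):
    """Yule's Q: (n11*n00 - n01*n10) / (n11*n00 + n01*n10), in [-1, 1]."""
    n11, n10, n01, n00 = pairwise_counts(correct_a, correct_b)
    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        raise ValueError("Yule's Q is undefined for these counts")
    return (n11 * n00 - n01 * n10) / denominator


def correlation(correct_a, correct_b):
    """Correlation coefficient between the two binary correctness vectors."""
    n11, n10, n01, n00 = pairwise_counts(correct_a, correct_b)
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for these counts")
    return (n11 * n00 - n01 * n10) / denominator


# Hypothetical correctness vectors for two classifiers on six samples.
example_a = [1, 1, 1, 0, 0, 1]
example_b = [1, 0, 1, 0, 1, 1]
q_value = yule_q(example_a, example_b)
rho_value = correlation(example_a, example_b)
```

Both measures are positive when the two classifiers tend to be right or wrong on the same samples and negative when their errors fall on different samples; values near zero indicate roughly independent errors.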