A Contrast Pattern based Clustering Algorithm for Categorical Data

Fore, Neil Koberlein

2010, Master of Science (MS), Wright State University, Computer Science.
The data clustering problem has long received attention in the data mining, machine learning, and pattern recognition communities. Many previous approaches to this problem require a distance function. However, since clustering is highly exploratory and is usually performed on data that are rather new, it is debatable whether users can provide good distance functions for the data. This thesis proposes a Contrast Pattern based Clustering (CPC) algorithm that constructs clusters without a distance function, by focusing on the quality and diversity/richness of the contrast patterns that distinguish the clusters in a clustering. Specifically, CPC attempts to maximize the Contrast Pattern based Clustering Quality (CPCQ) index, which can recognize that expert-determined classes are the best clusters for many datasets in the UCI Repository. Experiments using UCI datasets show that CPCQ scores are higher for clusterings produced by CPC than for those produced by other well-known clustering algorithms. Furthermore, CPC recovers the expert clusterings of these datasets with higher accuracy than those algorithms.
Guozhu Dong, PhD (Advisor)
Keke Chen, PhD (Committee Member)
Krishnaprasad Thirunarayan, PhD (Committee Member)
46 p.

Recommended Citations

  • Fore, N. K. (2010). A Contrast Pattern based Clustering Algorithm for Categorical Data [Master's thesis, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1285345623

    APA Style (7th edition)

  • Fore, Neil. A Contrast Pattern based Clustering Algorithm for Categorical Data. 2010. Wright State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1285345623.

    MLA Style (8th edition)

  • Fore, Neil. "A Contrast Pattern based Clustering Algorithm for Categorical Data." Master's thesis, Wright State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=wright1285345623

    Chicago Manual of Style (17th edition)