MINING STRUCTURED SETS OF SUBSPACES FROM HIGH DIMENSIONAL DATA

RAJSHIVA, ANSHUMAAN

ucin1085667702.pdf (543.18 KB)

Author Info

RAJSHIVA, ANSHUMAAN

Permalink:

http://rave.ohiolink.edu/etdc/view?acc_num=ucin1085667702

Year and Degree

2004, MS, University of Cincinnati, Engineering : Computer Science.

Abstract

Data mining is the process of extracting previously unknown and potentially useful information from databases. Data mining algorithms are used in many applications in business, engineering, the sciences, and social databases. Among the many data mining methodologies, clustering is one of the most frequently used. Clustering groups data points based on their similarity. Finding clusters in a high dimensional dataspace is challenging because such a dataspace has hundreds of attributes and hundreds of data tuples, so the average density of data points is very low. The distance functions used by many conventional algorithms fail in this scenario. Agrawal et al. [2] showed that even when clusters do not exist in the original high dimensional dataspace, clusters may exist in some of its subspaces. A subspace is formed by a subset of all attributes and a subset of data tuples taken together. The subsets are chosen so that clusters of the data points exist in the subspace. Subspace clustering identifies such subspace clusters. In this thesis, we discuss a novel approach that identifies subspace clusters by first identifying complete subspaces. A Complete Subspace is defined as a subspace that contains exactly one cluster formed by all the data tuples included in that subspace. We develop an algorithm to discover and identify such complete subspaces. In our algorithm, complete subspaces are identified based on a similarity function. A similarity function is a symmetric mathematical function that measures the similarity between two data values of an attribute. We discuss different similarity functions and apply them to datasets from each of the identified application domains: bioinformatics, graphs, and citation datasets.
Through experiments, we analyze and interpret the nature of the subspace clusters in relation to the applied similarity function and the application domain. Our algorithm is exhaustive and discovers all the complete subspaces in a high dimensional dataspace.
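
As an illustration of the definitions in the abstract, the following minimal Python sketch checks whether a chosen subset of tuples and attributes could form a single group under a similarity function. It is not the thesis algorithm: the tolerance-based similarity function `within_tolerance`, the pairwise criterion in `is_candidate_subspace`, and the toy `data` matrix are all assumptions introduced here for illustration only.

```python
from itertools import combinations

# Hypothetical similarity function (an assumption, not from the thesis):
# two values of one attribute are similar when they differ by at most `tol`.
# abs(a - b) == abs(b - a), so the function is symmetric.
def within_tolerance(a, b, tol=1.0):
    return abs(a - b) <= tol

def is_candidate_subspace(data, rows, cols, similar=within_tolerance):
    """Assumed criterion: every pair of selected tuples (rows) is similar
    on every selected attribute (cols), so all tuples form one group."""
    return all(
        similar(data[r1][c], data[r2][c])
        for r1, r2 in combinations(rows, 2)
        for c in cols
    )

# Toy dataspace: 4 tuples (rows) over 3 attributes (columns).
data = [
    [1.0, 9.0, 5.0],
    [1.5, 2.0, 5.2],
    [1.2, 7.0, 4.9],
    [8.0, 7.5, 0.3],
]

# Tuples 0-2 agree on attributes 0 and 2, but not once attribute 1 is added.
print(is_candidate_subspace(data, rows=[0, 1, 2], cols=[0, 2]))
print(is_candidate_subspace(data, rows=[0, 1, 2], cols=[0, 1, 2]))
```

The sketch only tests one given subspace. An exhaustive search of the kind the abstract describes would have to enumerate subsets of attributes and tuples, which this example does not attempt.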

Committee

Dr. Raj Bhatnagar (Advisor)

Pages

122 p.

Keywords

Datamining; Subspace Clustering; Complete Subspace

Recommended Citations

APA Style (7th edition)
RAJSHIVA, A. (2004). MINING STRUCTURED SETS OF SUBSPACES FROM HIGH DIMENSIONAL DATA [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1085667702

MLA Style (8th edition)
RAJSHIVA, ANSHUMAAN. MINING STRUCTURED SETS OF SUBSPACES FROM HIGH DIMENSIONAL DATA. 2004. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1085667702.

Chicago Manual of Style (17th edition)
RAJSHIVA, ANSHUMAAN. "MINING STRUCTURED SETS OF SUBSPACES FROM HIGH DIMENSIONAL DATA." Master's thesis, University of Cincinnati, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1085667702.

Document number:

ucin1085667702

Download Count:

1,087
