Memory- and knowledge-conscious data mining

Ghoting, Amol

osu1186979749.pdf (2.24 MB)

Author Info

Ghoting, Amol

Permalink:

http://rave.ohiolink.edu/etdc/view?acc_num=osu1186979749

Year and Degree

2007, Doctor of Philosophy, Ohio State University, Computer and Information Science.

Abstract

Advances in data collection and storage technologies have allowed organizations to collect increasing amounts of data. Spurred by these advances, the field of knowledge discovery in databases has emerged. The main challenge in the knowledge discovery process is to extract knowledge and insight from massive datasets quickly and efficiently. This process is iterative and involves a human in the loop. Therefore, to facilitate effective data understanding, it is imperative to minimize the response time to a user's query. To address this challenge, research efforts have largely focused on reducing the computation required to process a single data mining query. However, pursuing this direction alone is insufficient.

In this dissertation, we explore two new directions to improve the performance of data mining algorithms. The first direction attempts to improve performance by understanding and improving the memory system performance of data mining algorithms. The second direction attempts to improve performance by redesigning a data mining algorithm such that it reuses computation.

In the context of memory-conscious data mining, first, we present the results of a study of the memory system performance of data mining algorithms designed to operate over static datasets. Second, using the knowledge gleaned from this investigation, we look at improving the cache performance of frequent pattern mining algorithms. We expect that the presented methodology will be useful in improving the performance of other data mining algorithms as well. Third, we present a scheduling scheme that is cognizant of the trade-off between response time and memory usage when processing and mining data streams. This scheme allows us to better use the memory system when mining distributed data streams.

In the context of knowledge-conscious data mining, first, we show how one can redesign exploratory kMeans clustering such that it can expose and reuse repeated computation across iterations of a single kMeans query and across multiple kMeans queries. Second, we present the design of a knowledge caching service for data mining algorithms. This service is easy to use, scalable, and allows for the reuse of computation across multiple users of a data mining system.

Committee

Srinivasan Parthasarathy (Advisor)

Subject Headings

Computer Science

Recommended Citations

APA Style (7th edition)
Ghoting, A. (2007). Memory- and knowledge-conscious data mining [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1186979749

MLA Style (8th edition)
Ghoting, Amol. Memory- and knowledge-conscious data mining. 2007. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1186979749.

Chicago Manual of Style (17th edition)
Ghoting, Amol. "Memory- and knowledge-conscious data mining." Doctoral dissertation, Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=osu1186979749.

Document number:

osu1186979749

Download Count:

1,039
