Knowledge Accelerated Algorithms and the Knowledge Cache

Goyder, Matthew

2012, Master of Science, Ohio State University, Computer Science and Engineering.

Knowledge discovery through data mining is the process of automatically extracting actionable information from data, that is, the knowledge within the data that provides insight beyond what can be found by observing the cardinal state of the data itself. This process is human driven; there is always a human at the core.

Knowledge discovery is inherently iterative: a human discovers information by posing questions to a data mining system, which in turn provides answers. These answers prompt new questions, which are then asked in turn. Clearly, answers must be provided as quickly as possible for the human at the core to form ideas and solidify hypotheses. Unfortunately, many questions take too long to answer to be useful to the human. Can we speed up the response to a question if its answer is based in part upon answers previously provided?

When a query (question) is submitted to a data mining system, we can store the result (answer), along with information about the result, in a cache, and then reuse this information to respond to the next query more quickly. If part of a query's result was already found in the past, we can combine that cached information with new information to produce the result much faster than re-running the query with no prior information.
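
The reuse idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the `ResultCache` class, the `frequent_pairs` function, and the toy transactions are all hypothetical names and data introduced here. The first query counts the support of every item pair from scratch; the second query, with a lower support threshold, is answered entirely from the cached counts.

```python
from itertools import combinations


class ResultCache:
    """Hypothetical in-memory cache mapping an itemset to its support count."""

    def __init__(self):
        self._support = {}
        self.hits = 0
        self.misses = 0

    def support(self, itemset, transactions):
        key = frozenset(itemset)
        if key in self._support:
            # Answer found in an earlier query: no scan of the data needed.
            self.hits += 1
            return self._support[key]
        # Not cached yet: scan the transactions once and remember the count.
        self.misses += 1
        count = sum(1 for t in transactions if key <= t)
        self._support[key] = count
        return count


def frequent_pairs(transactions, min_support, cache):
    """Return every item pair whose support is at least min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        count = cache.support(pair, transactions)
        if count >= min_support:
            result[frozenset(pair)] = count
    return result


# Toy data, invented for this example.
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
)]

cache = ResultCache()
first = frequent_pairs(transactions, 3, cache)   # counts all pairs from scratch
second = frequent_pairs(transactions, 2, cache)  # reuses the cached counts
print(first, second, cache.hits, cache.misses)
```

In this sketch the cache key is the itemset alone, which is only valid while the underlying transactions do not change; a real cache would also need to identify the dataset a result came from.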

This thesis explores this idea by introducing a high-performance information cache with remote access capabilities, called a Knowledge Cache, as well as a programming model and API that let clients store, query, share, and retrieve knowledge objects within it. These knowledge objects can then be used in conjunction with a modified data mining algorithm to reduce query processing time for new queries where prior information is useful. We explain the usage model of the Knowledge Cache and API, and demonstrate performance gains from using the Knowledge Cache with two classic data mining algorithms: k-means clustering and frequent itemset mining.
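
As a rough illustration of how a cached knowledge object might accelerate k-means, the sketch below warm-starts Lloyd's algorithm from centroids saved by an earlier run. It makes several assumptions: the abstract does not show the Knowledge Cache API, so a plain dictionary named `knowledge_cache` stands in for it, and the `kmeans` function, the cache key, and the data points are all hypothetical. The modified algorithm in the thesis may work differently.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd's k-means; returns (centroids, iterations until stable)."""
    centroids = [tuple(c) for c in initial_centroids]
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else old
            for members, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            return centroids, iteration
        centroids = new_centroids
    return centroids, max_iter


# Stand-in for the Knowledge Cache: a dict keyed by (dataset name, k).
knowledge_cache = {}

# Earlier query: cluster the original data and store the centroids.
old_points = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
centroids, _ = kmeans(old_points, old_points[:2])
knowledge_cache[("readings", 2)] = centroids  # hypothetical key

# New query: the data has grown by two points.
new_points = old_points + [(4.0,), (13.0,)]
cold, cold_iters = kmeans(new_points, new_points[:2])                # no prior knowledge
warm, warm_iters = kmeans(new_points, knowledge_cache[("readings", 2)])  # cached start
print(cold, cold_iters, warm, warm_iters)
```

On this toy data both runs reach the same centroids, but the warm-started run needs fewer iterations because the cached centroids are already close to the final answer. That saving is not guaranteed in general; it depends on how similar the new query is to the cached one.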

Srinivasan Parthasarathy, PhD (Advisor)
Gagan Agrawal, PhD (Committee Member)

Recommended Citations

  • Goyder, M. (2012). Knowledge Accelerated Algorithms and the Knowledge Cache [Master's thesis, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339763385

    APA Style (7th edition)

  • Goyder, Matthew. Knowledge Accelerated Algorithms and the Knowledge Cache. 2012. Ohio State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1339763385.

    MLA Style (8th edition)

  • Goyder, Matthew. "Knowledge Accelerated Algorithms and the Knowledge Cache." Master's thesis, Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339763385

    Chicago Manual of Style (17th edition)