The Impact of Data Imputation Methodologies on Knowledge Discovery

Brown, Marvin Lane

2008, Doctor of Business Administration, Cleveland State University, Nance College of Business Administration.

The purpose of this research is to investigate the impact of Data Imputation Methodologies employed when a specific Data Mining algorithm is used within a KDD (Knowledge Discovery in Databases) process. This study will employ certain Knowledge Discovery processes that are widely accepted in both the academic and commercial worlds. Several Knowledge Discovery models will be developed using secondary data containing known correct values. Tests will be conducted on the secondary data both before and after storing data instances with known results and then identifying imprecise data values. One of the integral stages of successful Knowledge Discovery is the Data Mining phase. The Data Mining process deals largely with prediction, estimation, classification, pattern recognition and the development of association rules. Neural Networks are the most commonly selected tools for Data Mining classification and prediction. Neural Networks employ various types of Transfer Functions when outputting data. The most commonly employed Transfer Function is the s-Sigmoid Function. Knowledge Discovery Models from various research and business disciplines were tested using this framework.
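As an illustration of the transfer function mentioned above, the following is a minimal sketch that assumes "s-Sigmoid" refers to the standard logistic sigmoid, 1 / (1 + e^(-x)). The function names `sigmoid` and `neuron_output` are hypothetical and are not taken from the dissertation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps any real input into the range 0 to 1.

    Assumption: this is the function the abstract calls "s-Sigmoid".
    The two branches avoid overflow in math.exp for large |x|.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def neuron_output(inputs, weights, bias):
    """Output of a single unit: sigmoid of the weighted input sum plus bias."""
    net = sum(w * v for w, v in zip(weights, inputs)) + bias
    return sigmoid(net)
```

A single unit with inputs `[1.0, 2.0]`, weights `[0.5, -0.25]` and zero bias has a net input of 0, so its output sits at the midpoint of the sigmoid's range.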

However, missing and inconsistent data have been pervasive problems throughout the history of data analysis, since the origin of data collection. Due to advancements in the capacities of data storage and the proliferation of computer software, more historical data is being collected and analyzed today than ever before. The issue of missing data must be addressed, since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions. The objective of this research is to address the impact of Missing Data and Data Imputation methodologies on the Data Mining phase of Knowledge Discovery when Neural Networks employing an s-Sigmoid Transfer Function are used.
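The abstract does not name the imputation methods that were compared. As a hedged illustration of the general idea only, the sketch below contrasts two textbook ways of handling missing values, listwise deletion and column-mean imputation, on a small made-up table. The function names and the data are assumptions for demonstration and are not drawn from the dissertation.

```python
from statistics import mean


def listwise_delete(rows):
    """Drop every row that contains at least one missing value (None)."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Replace each missing value with the mean of its column's observed values.

    Assumes every column has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if v is None else v for j, v in enumerate(r)]
        for r in rows
    ]


# Hypothetical data: None marks a missing value.
rows = [[1.0, 10.0], [None, 20.0], [3.0, None]]
complete_cases = listwise_delete(rows)
imputed = mean_impute(rows)
```

Listwise deletion keeps only the one fully observed row, while mean imputation keeps all three rows but fills the gaps with values that were never observed. Either choice changes the data a model is trained on, which is the kind of effect the study sets out to measure.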

Chien-Hua (Mike) Lin, PhD (Committee Chair)
Adam Fadlalla, PhD (Committee Member)
Walter Rom, PhD (Committee Member)
John Kros, PhD (Committee Member)
Marc Lynn, PhD (Advisor)
141 p.

Recommended Citations

  • Brown, M. L. (2008). The Impact of Data Imputation Methodologies on Knowledge Discovery [Doctoral dissertation, Cleveland State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=csu1227054769

    APA Style (7th edition)

  • Brown, Marvin. The Impact of Data Imputation Methodologies on Knowledge Discovery. 2008. Cleveland State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=csu1227054769.

    MLA Style (8th edition)

  • Brown, Marvin. "The Impact of Data Imputation Methodologies on Knowledge Discovery." Doctoral dissertation, Cleveland State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=csu1227054769

    Chicago Manual of Style (17th edition)