High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization

Yaddanapudi, Suryanarayana

2016, MS, University of Cincinnati, Engineering and Applied Science: Mechanical Engineering.
The amount of data collected every day is enormous and well documented. Banking, retail, commerce, and government have transformed the way they operate in response, but the healthcare domain is still learning to use this rapidly growing supply of information to serve the public. Researchers in the medical field still rely on traditional data mining algorithms to improve the quality of the service delivered. One such application is identifying high-risk patients who suffer from severe long-lasting or recurrent illness. Epidemiologists have traditionally treated the identification of high-risk populations as a classification problem, and most have used logistic regression, decision trees, or association rules to classify patients from higher risk to lower risk. These classification techniques do not consider groups of patients with similar characteristics; instead, they identify each patient as high-risk in isolation. To address this problem, this thesis first introduces a new method to estimate the disease occurrence rate (that is, the occurrence probability, or the probability of a patient being at high risk) based on the Bernoulli distribution. Occurrence probabilities are calculated over groups of similar patients rather than for each patient alone. Second, missing data is common in every domain, and instances with missing data might carry significant information; ignoring them while building a model has a cascading effect on the final model. To address this second problem, we use a probability-based approach to compute fractional patient counts instead of ignoring those data instances or applying imputation techniques. Finally, the proposed method is tested against traditional methods such as logistic regression, decision trees, and association rule mining, and the results show a significant improvement.
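
The abstract does not spell out the estimator, so the following is only an illustrative sketch of the two ideas it names, not the thesis's actual method. It assumes a single categorical attribute defines "similar" patients, estimates each group's occurrence rate as the Bernoulli maximum-likelihood estimate (high-risk patients divided by group size), and spreads a patient whose attribute is missing across the groups in proportion to the observed group frequencies. The function name `group_occurrence_rates`, the field names `smoker` and `high_risk`, and the toy records are all hypothetical.

```python
from collections import defaultdict


def group_occurrence_rates(patients, attribute):
    """Estimate a Bernoulli occurrence rate for each group of similar patients.

    patients:  list of dicts holding `attribute` (None when missing) and a
               binary "high_risk" outcome (0 or 1).
    attribute: key whose value defines the similarity group (an assumption
               made for this sketch).
    """
    # Share of each attribute value among patients whose value is known.
    observed = defaultdict(float)
    for p in patients:
        if p[attribute] is not None:
            observed[p[attribute]] += 1.0
    total_known = sum(observed.values())
    if total_known == 0:
        raise ValueError("attribute is missing for every patient")
    share = {value: n / total_known for value, n in observed.items()}

    counts = defaultdict(float)  # fractional number of patients per group
    events = defaultdict(float)  # fractional number of high-risk patients per group
    for p in patients:
        # A complete record counts once in its own group; a record with a
        # missing value is split across all groups as fractional counts.
        weights = {p[attribute]: 1.0} if p[attribute] is not None else share
        for value, w in weights.items():
            counts[value] += w
            events[value] += w * p["high_risk"]

    # Bernoulli maximum-likelihood estimate: events / trials within each group.
    return {value: events[value] / counts[value] for value in counts}


# Hypothetical toy records, for illustration only.
patients = [
    {"smoker": "yes", "high_risk": 1},
    {"smoker": "yes", "high_risk": 1},
    {"smoker": "yes", "high_risk": 0},
    {"smoker": "no", "high_risk": 0},
    {"smoker": None, "high_risk": 1},  # missing attribute, kept as fractions
]
rates = group_occurrence_rates(patients, "smoker")
```

In this toy data the patient with the missing value contributes 0.75 of a patient to the "yes" group and 0.25 to the "no" group, so the record still informs both estimates rather than being dropped or imputed to a single value.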
Hongdao Huang, Ph.D. (Committee Chair)
Anil Jegga, D.V.M. M.Res. (Committee Member)
David Thompson, Ph.D. (Committee Member)
71 p.

Recommended Citations

  • Yaddanapudi, S. (2016). High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1460731557

    APA Style (7th edition)

  • Yaddanapudi, Suryanarayana. High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization. 2016. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1460731557.

    MLA Style (8th edition)

  • Yaddanapudi, Suryanarayana. "High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization." Master's thesis, University of Cincinnati, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1460731557

    Chicago Manual of Style (17th edition)