High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization

Yaddanapudi, Suryanarayana

20937.pdf (4.7 MB)

Permalink:

http://rave.ohiolink.edu/etdc/view?acc_num=ucin1460731557

Year and Degree

2016, MS, University of Cincinnati, Engineering and Applied Science: Mechanical Engineering.

Abstract

The amount of data collected every day is enormous and well documented. Banking, retail, commerce, and government have transformed their practices over time, but healthcare is still working toward using this rapidly growing supply of information to benefit the public. Researchers in the medical field still rely largely on traditional data mining algorithms to improve the quality of the service delivered. One such application is identifying high-risk patients who suffer from severe, long-lasting, or recurrent illness. Traditional epidemiologists have treated the identification of high-risk populations as a classification problem, and most have used logistic regression, decision trees, or association rules to classify patients from higher risk to lower risk. These classification techniques do not consider groups of patients with similar characteristics; instead, they identify each patient as high-risk in isolation. To address this problem, this paper first introduces a new method to estimate the disease occurrence rate (equivalently, the occurrence probability, or the probability that a patient is at high risk) based on the Bernoulli distribution. Occurrence probabilities are calculated over a group of similar patients rather than for each patient alone. Missing data is common in every domain, and instances with missing data may carry significant information; ignoring them while building a model has a cascading effect on the final model. Second, to address this problem, we use a probability-based approach to compute fractional patient counts instead of discarding those instances or applying imputation techniques. Finally, the proposed method is tested against traditional methods such as logistic regression, decision trees, and association rule mining, and the results obtained show significant improvement.
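The two ideas in the abstract, a Bernoulli occurrence probability estimated over a group of similar patients and fractional patient counts for records with missing values, can be illustrated with a minimal sketch. This is not the thesis's actual algorithm, which the abstract does not spell out. The sketch assumes that "similar patients" means patients sharing one categorical attribute value, and that a patient with a missing value is split across groups in proportion to the values observed in complete records. The function name, the `high_risk` key, and the example cohort are all hypothetical.

```python
from collections import defaultdict


def occurrence_probabilities(patients, attribute):
    """Estimate P(high risk) for each group of patients sharing an attribute value.

    patients:  list of dicts; each has `attribute` (a value, or None if missing)
               and "high_risk" (True or False).
    Returns a dict mapping attribute value -> estimated occurrence probability.
    """
    # Frequency of each attribute value among records where it is observed.
    observed = defaultdict(float)
    for p in patients:
        if p[attribute] is not None:
            observed[p[attribute]] += 1.0
    n_observed = sum(observed.values())
    if n_observed == 0:
        raise ValueError("attribute is missing for every patient")

    # Assumption: a patient with a missing value belongs to each group with
    # probability equal to that group's share of the observed records.
    weights = {value: count / n_observed for value, count in observed.items()}

    totals = defaultdict(float)  # fractional patient count per group
    cases = defaultdict(float)   # fractional high-risk count per group
    for p in patients:
        if p[attribute] is not None:
            shares = {p[attribute]: 1.0}  # complete record: one whole patient
        else:
            shares = weights              # incomplete record: split fractionally
        for value, share in shares.items():
            totals[value] += share
            if p["high_risk"]:
                cases[value] += share

    # Maximum-likelihood estimate of the Bernoulli parameter within each group.
    return {value: cases[value] / totals[value] for value in totals}


# Hypothetical cohort: four complete records and one with the attribute missing.
cohort = [
    {"smoker": "yes", "high_risk": True},
    {"smoker": "yes", "high_risk": True},
    {"smoker": "yes", "high_risk": False},
    {"smoker": "no", "high_risk": False},
    {"smoker": None, "high_risk": True},
]
print(occurrence_probabilities(cohort, "smoker"))
```

In this sketch the incomplete record still contributes to both groups instead of being dropped or imputed to a single value, which is the property the abstract attributes to fractional counts.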

Committee

Hongdao Huang, Ph.D. (Committee Chair)
Anil Jegga, D.V.M. M.Res. (Committee Member)
David Thompson, Ph.D. (Committee Member)

Pages

71 p.

Subject Headings

Mechanics

Keywords

High Risk Patients; Occurrence Probability; Logistic Regression; Decision Trees

Recommended Citations

APA Style (7th edition)
Yaddanapudi, S. (2016). High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1460731557

MLA Style (8th edition)
Yaddanapudi, Suryanarayana. High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization. 2016. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1460731557.

Chicago Manual of Style (17th edition)
Yaddanapudi, Suryanarayana. "High-risk Patient Identification: Patient Similarity, Missing Data Analysis, and Pattern Visualization." Master's thesis, University of Cincinnati, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1460731557

Document number:

ucin1460731557

Download Count:

316
