Linear Approximations for Second Order High Dimensional Model Representation of the Log Likelihood Ratio

2019, Master of Science, Ohio State University, Mathematics.
Many computational biology and bioinformatics applications aim to develop mathematical models that describe the biological features and mechanisms affected in complex diseases such as cancer, and that can be further used for diagnosis, prognosis, population screening, personalized treatment, etc. When the task is formulated as a binary classification problem, the goal is to develop glass-box models that can predict the unknown label of a new observation and explain which key patterns of that observation motivate a specific prediction. Not only are the biological mechanisms affected in complex diseases highly complex and heterogeneous, with many interacting components, but mathematical modeling is also typically hampered by small sample sizes. Therefore, it is desirable to develop low-dimensional glass-box models that capture the key factors affecting the underlying biology. Such a first-step analysis can be used to identify factors and patterns that affect the underlying biology, which in turn can be used to form meaningful hypotheses and study the biological mechanisms affected in the disease. Additionally, the developed model can be used to design tests that predict the underlying label of a new observation, for instance healthy versus cancerous. High dimensional model representation (HDMR) is a recently proposed framework that describes a function of a random vector based on partial observations of the underlying vector. Here we study how this framework can be used to develop low-dimensional models that approximate the log likelihood ratio in binary classification problems and are suitable for small-sample, high-dimensional settings. Because of the challenges of obtaining the "best" underlying low-dimensional representation, we develop linear approximations for the second order HDMR expansion, which considers the effect of each feature and of pairwise feature interactions on class labels.
We develop approximations for continuous and categorical observations, suitable for analyzing gene expression and single nucleotide polymorphism (SNP) data.
Grzegorz Rempala (Advisor)
Chuan Xue (Committee Member)
116 p.

Recommended Citations

  • Foroughi pour, A. (2019). Linear Approximations for Second Order High Dimensional Model Representation of the Log Likelihood Ratio [Master's thesis, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1555419601408423

    APA Style (7th edition)

  • Foroughi pour, Ali. Linear Approximations for Second Order High Dimensional Model Representation of the Log Likelihood Ratio. 2019. Ohio State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1555419601408423.

    MLA Style (8th edition)

  • Foroughi pour, Ali. "Linear Approximations for Second Order High Dimensional Model Representation of the Log Likelihood Ratio." Master's thesis, Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1555419601408423

    Chicago Manual of Style (17th edition)