Partial EM Procedure for Big-Data Linear Mixed Effects Model, and Generalized PPE for High-Dimensional Data in Julia

2018, Doctor of Philosophy, Case Western Reserve University, Epidemiology and Biostatistics.
Methodologically, this dissertation contributes to two areas of statistics: linear mixed effects models for big data and tests of equal covariance for high-dimensional data. Scientifically, it supports a comprehensive evaluation of the effect of Specialty Care Access Network-Extension for Community Healthcare Outcomes (SCAN-ECHO) training on primary care providers at outpatient clinics who treat diabetes in the VA patient population. In the first part of this dissertation, we describe three challenges in examining the effect of SCAN-ECHO training on VA diabetic patients and offer a solution to each. The first challenge was data curation for longitudinal variables. As a solution, we developed an R function called "fusion", customized to our data structure, for effective data curation. The second challenge was measurement variability and heterogeneity of the population. We used different types of summary measures to reduce the variability of the outcome, and conducted longitudinal cluster analysis to identify similar subgroups within the heterogeneous population. The third challenge was fitting a linear mixed effects model (LMM) to big data that could not be imported into R because the data exceeded memory capacity. As a solution, we proposed a new approach, the Big-data Linear Mixed Effects Model (bLMM), which uses a Partial EM (PEM) algorithm and data partitioning. Our PEM procedure was developed to analyze the effect of SCAN-ECHO training on diabetes treatment, but the approach is of independent interest (statistical contribution 1) because PEM is a general procedure for fitting LMMs to big data. We evaluated the performance of bLMM PEM by comparing it with three other methods for fitting LMMs: the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm using the entire data, full EM using the entire data, and meta-analysis using data partitions.
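The abstract does not spell out the PEM algorithm, so the following is not the dissertation's procedure. It is only a minimal sketch of the general idea that EM for an LMM can be run over data partitions, because the E-step quantities are sums that add across partitions. It assumes the simplest case, a random-intercept model y_ij = mu + b_i + e_ij with b_i ~ N(0, tau2) and e_ij ~ N(0, sigma2), and assumes that all observations of a group sit in the same partition. The function names `e_step_partition` and `fit_partitioned_em` are hypothetical.

```python
def e_step_partition(groups, mu, tau2, sigma2):
    """E-step on one partition (a list of groups, each a list of y values).

    Returns additive sums, so results from separate partitions can simply
    be added together before the M-step.
    """
    s_y = s_yy = s_b2 = s_v = 0.0
    n_obs = 0
    for ys in groups:
        n = len(ys)
        v = 1.0 / (1.0 / tau2 + n / sigma2)        # posterior variance of b_i
        m = v * sum(y - mu for y in ys) / sigma2   # posterior mean of b_i
        s_y += sum(y - m for y in ys)              # for the mu update
        s_yy += sum((y - m) ** 2 for y in ys)      # for the sigma2 update
        s_b2 += m * m + v                          # E[b_i^2 | y]
        s_v += n * v
        n_obs += n
    return (s_y, s_yy, s_b2, s_v, len(groups), n_obs)


def fit_partitioned_em(partitions, n_iter=100):
    """EM for the random-intercept model, touching one partition at a time."""
    # Start mu at the grand mean, itself computed from per-partition sums.
    n_total = sum(len(ys) for part in partitions for ys in part)
    mu = sum(sum(ys) for part in partitions for ys in part) / n_total
    tau2, sigma2 = 1.0, 1.0
    for _ in range(n_iter):
        totals = [0.0] * 6
        for part in partitions:  # in a big-data setting, loaded from disk in turn
            stats = e_step_partition(part, mu, tau2, sigma2)
            totals = [t + s for t, s in zip(totals, stats)]
        s_y, s_yy, s_b2, s_v, n_groups, n_obs = totals
        # M-step from the pooled sums only.
        mu = s_y / n_obs
        tau2 = s_b2 / n_groups
        sigma2 = (s_yy + s_v) / n_obs - mu * mu
    return mu, tau2, sigma2
```

Calling `fit_partitioned_em([all_groups])` and `fit_partitioned_em([chunk_1, chunk_2, chunk_3])` on the same groups should agree up to floating-point rounding, which is the property that lets the data stay out of memory.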
Finally, for implementation, we applied our PEM procedure to evaluate the effect of SCAN-ECHO training on diabetes treatment. In the second part of this dissertation, we introduce an improvement to the optimization algorithm for Projection Pursuit Ellipse (PPE), which is used to test for equal covariance in high-dimensional data (statistical contribution 2). Many standard multivariate techniques were developed under the assumption that the covariance matrices of different groups are equal. A well-known test for the equality of covariance matrices is Bartlett’s test. However, Bartlett’s test is a function only of the volumes of the covariance matrices and does not account for their shapes and orientations. In this work, we developed a Projection Pursuit Ellipses procedure for high-dimensional data (hPPE) and compared its performance with Bartlett’s test and a modern benchmark for data of high dimension p.
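The remark about volumes can be made concrete with a small sketch. It assumes that "Bartlett’s test" refers to the multivariate likelihood-ratio form often called Box's M, whose uncorrected statistic is M = (N - k) ln|S_pooled| - sum_i (n_i - 1) ln|S_i|. Each group's covariance matrix S_i enters this statistic directly only through its determinant (its generalized variance, or volume). The helper names below are hypothetical, and the small-sample correction factor and p-value are omitted.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def pooled_covariance(covs, sizes):
    """Pooled covariance, weighting each group by n_i - 1."""
    dof = sum(sizes) - len(covs)
    p = len(covs[0])
    return [[sum((n - 1) * s[i][j] for s, n in zip(covs, sizes)) / dof
             for j in range(p)] for i in range(p)]


def box_m(covs, sizes):
    """Uncorrected Box's M statistic for k sample covariance matrices."""
    dof = sum(sizes) - len(covs)
    m = dof * math.log(det(pooled_covariance(covs, sizes)))
    m -= sum((n - 1) * math.log(det(s)) for s, n in zip(covs, sizes))
    return m


# Three illustrative matrices with the same determinant (volume) but
# different shape or orientation.
round_cov = [[2.0, 0.0], [0.0, 2.0]]
elongated = [[4.0, 0.0], [0.0, 1.0]]
rotated = [[1.0, 0.0], [0.0, 4.0]]
```

Here `det(round_cov)`, `det(elongated)`, and `det(rotated)` are all equal, so the three matrices contribute identical per-group terms to `box_m`. Any sensitivity to shape or orientation comes only indirectly, through the determinant of the pooled matrix.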
Jiayang Sun (Advisor)
Jeffrey Albert (Committee Chair)
Mark Schluchter (Committee Member)
David Aron (Committee Member)
Yifan Xu (Committee Member)
117 p.

Recommended Citations

  • Cho, J. I. (2018). Partial EM Procedure for Big-Data Linear Mixed Effects Model, and Generalized PPE for High-Dimensional Data in Julia [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case152845439167999

    APA Style (7th edition)

  • Cho, Jang Ik. Partial EM Procedure for Big-Data Linear Mixed Effects Model, and Generalized PPE for High-Dimensional Data in Julia. 2018. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case152845439167999.

    MLA Style (8th edition)

  • Cho, Jang Ik. "Partial EM Procedure for Big-Data Linear Mixed Effects Model, and Generalized PPE for High-Dimensional Data in Julia." Doctoral dissertation, Case Western Reserve University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=case152845439167999

    Chicago Manual of Style (17th edition)