A Hot Deck Imputation Procedure for Multiply Imputing Nonignorable Missing Data: The Proxy Pattern-Mixture Hot Deck

Sullivan, Danielle M

2014, Doctor of Philosophy, Ohio State University, Biostatistics.
Hot deck imputation is a common method for handling item nonresponse in surveys, but most implementations assume data are missing at random (MAR). We propose a new hot deck method for imputation of a partially missing outcome variable that does not assume data are MAR. We use a parametric model to create predicted means for both donors and donees under varying assumptions on the missing data mechanism, ranging from MAR to missing not at random (MNAR). When imputing a continuous outcome variable, for a given assumption on the missingness mechanism, the predicted means are used to define distances between donors and donees and probabilities of selection proportional to those distances. Multiple imputation using the hot deck is performed to create a set of completed data sets, using an approximate Bayesian bootstrap to ensure "proper" imputations. This new hot deck method creates an intuitive sensitivity analysis where imputations may be performed under MAR and under varying MNAR mechanisms, and the resulting impact on inference can be evaluated. In addition, we propose two donor quality metrics to identify situations where close matches of donor to donee are not available, which can occur under strong MNAR assumptions. We investigate bias and coverage of estimates from our proposed method through simulation and apply the method to estimation of income in the Ohio Medicaid Assessment Survey.

We extend the proposed hot deck method for multiple imputation of a binary outcome variable by assuming there exists a continuous latent variable that determines the value of the binary outcome. This allows us to use the framework developed under a continuous outcome to create predicted means assuming different missingness mechanisms. However, because the latent variable is by definition unobserved, additional steps are required to obtain the parameter estimates used in creating the predicted means and we compare two approaches of estimation. Furthermore, we modify donor selection by implementing an adjustment cell procedure. We investigate bias and coverage of estimates from our proposed method through simulation and study the sensitivity to normality. We apply the method to estimation of mean ER+ status in the Surveillance, Epidemiology, and End Results Program. In addition, we illustrate how the method can be applied to estimate regression coefficients.
Rebecca Andridge (Advisor)
Bo Lu (Committee Member)
Eloise Kaizar (Committee Member)
Elizabeth Stasny (Committee Member)
133 p.
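
The donor selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. It assumes the predicted means for donors and donees have already been computed under one chosen missingness assumption and are passed in as plain lists. The function name `abb_hot_deck`, its parameters, and the inverse-distance weighting (closer donors are more likely to be selected) are assumptions made for illustration only.

```python
import random


def abb_hot_deck(donor_pred, donor_obs, donee_pred, n_imputations=5, eps=1e-8, seed=0):
    """Hypothetical sketch: return n_imputations lists of imputed values, one per donee.

    donor_pred: predicted means for respondents (donors)
    donor_obs:  observed outcomes for the same donors
    donee_pred: predicted means for nonrespondents (donees)
    """
    rng = random.Random(seed)
    n_donors = len(donor_obs)
    imputations = []
    for _ in range(n_imputations):
        # Approximate Bayesian bootstrap: resample the donor pool with replacement
        pool = [rng.randrange(n_donors) for _ in range(n_donors)]
        imputed = []
        for mu in donee_pred:
            # Distance between the donee and each pooled donor on the predicted-mean scale
            dists = [abs(mu - donor_pred[i]) for i in pool]
            # Assumption of this sketch: weight decreases with distance (inverse distance)
            weights = [1.0 / (d + eps) for d in dists]
            chosen = rng.choices(pool, weights=weights, k=1)[0]
            # Hot deck: the donee receives the donor's observed value
            imputed.append(donor_obs[chosen])
        imputations.append(imputed)
    return imputations


# Toy usage with made-up numbers
completed = abb_hot_deck(
    donor_pred=[1.0, 2.0, 3.0, 4.0],
    donor_obs=[10.0, 20.0, 30.0, 40.0],
    donee_pred=[1.5, 3.5, 2.0],
    n_imputations=3,
)
```

Running the function once per missingness assumption, each time with predicted means generated under that assumption, would give one set of completed data sets per assumption, which is the form of sensitivity analysis the abstract describes.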

Recommended Citations

  • Sullivan, D. M. (2014). A Hot Deck Imputation Procedure for Multiply Imputing Nonignorable Missing Data: The Proxy Pattern-Mixture Hot Deck [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1387301284

    APA Style (7th edition)

  • Sullivan, Danielle. A Hot Deck Imputation Procedure for Multiply Imputing Nonignorable Missing Data: The Proxy Pattern-Mixture Hot Deck. 2014. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1387301284.

    MLA Style (8th edition)

  • Sullivan, Danielle. "A Hot Deck Imputation Procedure for Multiply Imputing Nonignorable Missing Data: The Proxy Pattern-Mixture Hot Deck." Doctoral dissertation, Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1387301284

    Chicago Manual of Style (17th edition)