Ranked set sampling for binary and ordered categorical variables with applications in health survey data

Chen, Haiying

2004, Doctor of Philosophy, Ohio State University, Biostatistics.
Ranked set sampling (RSS) is a sampling procedure that can be considerably more efficient than simple random sampling. It involves preliminary ranking of the variable of interest to aid in sample selection. Although ranking processes for continuous variables have been studied extensively in the literature, the use of RSS in the case of a binary variable has not been investigated thoroughly. We investigate the application of RSS to estimation of a population proportion theoretically and empirically using a National Health and Nutrition Examination Survey III (NHANES III) data set. We propose the use of logistic regression to aid in the ranking of the binary variable of interest. Our results indicate that this use of logistic regression leads to substantial gains in precision for estimation of a population proportion. Further, we illustrate how data from one source can be used to construct the necessary logistic regression equation, which can, in turn, be used to estimate the relevant proportions in a second group of subjects for which the same predictor variables are available. The results indicate the extent to which the sample size required to achieve a desired precision is reduced.

Balanced RSS, however, is not in general optimal in terms of variance reduction. We investigate the application of unbalanced RSS to estimation of a population proportion. In particular, Neyman allocation is shown to be optimal for this setting. Further, we provide methods to obtain estimators for the probabilities of success for the various judgment order statistics under either perfect or imperfect rankings so that Neyman allocation can be implemented.

Finally, we extend the application of RSS, both balanced and unbalanced, to ordered categorical variables with the goal of estimating the probabilities of all categories. We use ordinal logistic regression to aid in the ranking of the ordinal variable of interest. We also propose an optimal allocation scheme and methods for implementing it under either perfect or imperfect rankings. The results indicate that the use of ordinal logistic regression in ranking leads to substantial gains in precision for estimation of population proportions.
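
As an illustration only, the balanced RSS procedure for a proportion described in the abstract can be sketched in a small simulation. This is not the dissertation's code or data: the population is synthetic rather than NHANES III, the logistic coefficients (-1.0 and 2.0) are hypothetical stand-ins for a fitted regression equation, and all function names are invented for this sketch. Units within each set are ranked by their model-predicted probability, and only the unit holding the designated rank has its binary outcome measured.

```python
import math
import random
from statistics import mean, pvariance


def make_population(size, rng):
    """Build a synthetic population of (predicted_probability, outcome) pairs.

    The coefficients below are hypothetical; they stand in for a logistic
    regression equation fitted on some other data source.
    """
    units = []
    for _ in range(size):
        x = rng.gauss(0.0, 1.0)                       # one covariate
        score = 1.0 / (1.0 + math.exp(-(-1.0 + 2.0 * x)))
        y = 1 if rng.random() < score else 0          # binary outcome
        units.append((score, y))
    return units


def srs_estimate(pop, n, rng):
    """Sample proportion from a simple random sample of n units."""
    return mean(y for _, y in rng.sample(pop, n))


def balanced_rss_estimate(pop, set_size, cycles, rng):
    """Sample proportion from a balanced ranked set sample.

    In every cycle, each rank 0..set_size-1 is measured exactly once:
    draw a fresh set of set_size units, order them by predicted
    probability, and record the outcome of the unit at that rank.
    """
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = sorted(rng.sample(pop, set_size), key=lambda u: u[0])
            measured.append(candidates[rank][1])
    return mean(measured)


def compare(set_size=5, cycles=6, reps=3000, seed=7):
    """Monte Carlo variances of both estimators at equal measured sample size."""
    rng = random.Random(seed)
    pop = make_population(20000, rng)
    true_p = mean(y for _, y in pop)
    n = set_size * cycles
    srs = [srs_estimate(pop, n, rng) for _ in range(reps)]
    rss = [balanced_rss_estimate(pop, set_size, cycles, rng) for _ in range(reps)]
    return true_p, pvariance(srs), pvariance(rss)


if __name__ == "__main__":
    true_p, var_srs, var_rss = compare()
    print(f"population proportion: {true_p:.3f}")
    print(f"variance under SRS:    {var_srs:.5f}")
    print(f"variance under RSS:    {var_rss:.5f}")
```

Both estimators measure the same number of outcomes (`set_size * cycles`), so the ratio of the two variances gives a rough sense of the precision gain from ranking. The sketch covers only the balanced design; an unbalanced design with Neyman allocation would measure some ranks more often than others and average the per-rank proportions instead.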
Elizabeth Stasny (Advisor)
109 p.

Recommended Citations

  • Chen, H. (2004). Ranked set sampling for binary and ordered categorical variables with applications in health survey data [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1092770729

    APA Style (7th edition)

  • Chen, Haiying. Ranked set sampling for binary and ordered categorical variables with applications in health survey data. 2004. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1092770729.

    MLA Style (8th edition)

  • Chen, Haiying. "Ranked set sampling for binary and ordered categorical variables with applications in health survey data." Doctoral dissertation, Ohio State University, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=osu1092770729

    Chicago Manual of Style (17th edition)