Contributions to Numerical Formal Concept Analysis, Bayesian Predictive Inference and Sample Size Determination

2011, Doctor of Philosophy, Case Western Reserve University, Statistics.

This dissertation contributes to three areas in Statistics: Numerical Formal Concept Analysis (nFCA), Bayesian predictive inference and sample size determination, and has applications beyond statistics.

Formal concept analysis (FCA) is a powerful data analysis tool, popular in Computer Science (CS), for visualizing binary data and its inherent structure. In the first part of this dissertation, Numerical Formal Concept Analysis (nFCA) is developed. It overcomes FCA's restriction to binary data and provides a new methodology for analyzing more general numerical data. It combines graphical visualization techniques from Statistics and Computer Science to produce a pair of nFCA graphs, the H-graph and the I-graph, which reveal the hierarchical clustering and inherent structure among the data. Compared with conventional statistical hierarchical clustering methods, nFCA provides a more intuitive and complete relational network among the data. nFCA also performs better than the conventional hierarchical clustering methods in terms of the cophenetic correlation coefficient, which measures how consistent a dendrogram is with the original distance matrix. We have also applied nFCA to cardiovascular (CV) traits data. nFCA produces results consistent with earlier findings and provides a complete relational network among the CV traits.
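The cophenetic correlation coefficient mentioned above can be illustrated with a minimal sketch. It uses SciPy's conventional hierarchical clustering only; it does not implement nFCA, and the toy data, the function name `cophenetic_correlation`, and the choice of linkage methods are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist


def cophenetic_correlation(data, method="average"):
    """Cophenetic correlation of a conventional hierarchical clustering.

    `data` is an (observations x features) array. This is a hypothetical
    helper for illustration, not part of nFCA.
    """
    distances = pdist(data)                    # condensed Euclidean distances
    tree = linkage(distances, method=method)   # agglomerative clustering
    coefficient, _ = cophenet(tree, distances) # correlation with dendrogram distances
    return float(coefficient)


# Toy data (assumed): two well-separated groups of ten points in 3 dimensions.
rng = np.random.default_rng(0)
toy = np.vstack([
    rng.normal(0.0, 0.5, size=(10, 3)),
    rng.normal(5.0, 0.5, size=(10, 3)),
])

for method in ("single", "complete", "average"):
    print(method, round(cophenetic_correlation(toy, method), 3))
```

A value close to 1 indicates that the distances implied by the dendrogram track the original pairwise distances closely, which is the sense in which the abstract compares clustering methods.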

In the second part of this dissertation, Bayesian predictive inference is investigated for finite population quantities under informative sampling, i.e., unequal selection probabilities. Only limited information about the sample design is available, i.e., only the first-order selection probabilities corresponding to the sampled units are known. We have developed a full Bayesian approach to make inference about the parameters of the finite population and also predictive inference for the non-sampled units. Thus we can make inference about any characteristic of the finite population. In addition, our methodology, which uses Markov chain Monte Carlo, avoids the need for asymptotic approximations.

Sample size determination is one of the most important practical tasks for statisticians. There has been extensive research to develop appropriate methodology for sample size determination for, say, continuous or ordered categorical outcome data. However, sample size determination for comparative studies with unordered categorical data remains largely untouched. In statistical terms, one is interested in finding the sample size needed to detect a specified difference between the parameters of two multinomial distributions. For this purpose, in the third part of this dissertation, we have developed a frequentist approach based on a chi-squared test to calculate the required sample size. Three improvements to the original frequentist approach (using bootstrap, minimum difference and asymptotic correction) have been proposed and investigated. In addition, using an extension of a posterior predictive p-value, we further develop a simulation-based Bayesian approach to determine the required sample size. The performance of these methods is evaluated via both a simulation study and a real application to Leukoplakia lesion data. Some asymptotic results are also provided.
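As a rough illustration of the chi-squared idea, the sketch below uses the standard large-sample noncentral chi-squared approximation to the power of Pearson's test of homogeneity for two multinomial samples of equal size. It is not the dissertation's procedure and includes none of its three improvements or its Bayesian approach; the function names, the equal-allocation assumption, and the example probabilities are all hypothetical.

```python
import numpy as np
from scipy.stats import chi2, ncx2


def chi2_power(n_per_group, p1, p2, alpha=0.05):
    """Approximate power of Pearson's chi-squared test of homogeneity.

    Assumes two groups of equal size `n_per_group` and strictly positive
    category probabilities `p1` and `p2` (hypothetical inputs).
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    df = len(p1) - 1
    # Noncentrality per unit of group size for a 2 x k table, equal allocation.
    effect = np.sum((p1 - p2) ** 2 / (p1 + p2))
    critical = chi2.ppf(1.0 - alpha, df)
    return float(ncx2.sf(critical, df, n_per_group * effect))


def required_n_per_group(p1, p2, alpha=0.05, power=0.80, n_max=100_000):
    """Smallest per-group sample size whose approximate power reaches `power`."""
    for n in range(2, n_max + 1):
        if chi2_power(n, p1, p2, alpha) >= power:
            return n
    raise ValueError("target power not reached within n_max")


# Assumed example: three unordered categories in each of two groups.
p_control = [0.5, 0.3, 0.2]
p_treated = [0.3, 0.4, 0.3]
print(required_n_per_group(p_control, p_treated))
```

Because the approximation is asymptotic, it can be inaccurate for small samples or sparse categories, which is the kind of shortcoming that corrections such as those studied in the dissertation aim to address.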

Jiayang Sun, PhD (Committee Chair)
Joe Sedransk, PhD (Advisor)
Mark Schluchter, PhD (Committee Member)
Guo-Qiang Zhang, PhD (Committee Member)
Tomas Radivoyevitch, PhD (Committee Member)
154 p.

Recommended Citations

  • Ma, J. (2011). Contributions to Numerical Formal Concept Analysis, Bayesian Predictive Inference and Sample Size Determination [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1285341426

    APA Style (7th edition)

  • Ma, Junheng. Contributions to Numerical Formal Concept Analysis, Bayesian Predictive Inference and Sample Size Determination. 2011. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1285341426.

    MLA Style (8th edition)

  • Ma, Junheng. "Contributions to Numerical Formal Concept Analysis, Bayesian Predictive Inference and Sample Size Determination." Doctoral dissertation, Case Western Reserve University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=case1285341426

    Chicago Manual of Style (17th edition)