Improved Individual Ancestry Estimates for Proper Adjustment of Ancestral Confounding in Association Analysis

2008, Doctor of Philosophy, Case Western Reserve University, Epidemiology and Biostatistics.

Case-control studies are susceptible to false-positive findings caused by population stratification. Stratification arises either when allele frequencies differ between cases and controls because of differences in ancestry distribution, or when there is unrecognized stratification within cases or controls because of subgroups with different ancestries. Accurate estimates of admixture proportions at the individual level are therefore important. Statistical methods exist to infer individual ancestry from genetic data and to assign admixed individuals probabilistically to two or more populations jointly. We seek to improve the accuracy of individual ancestry estimates (IAEs) obtained with these statistical approaches, thereby reducing the number of false-positive findings due to ancestry in case-control studies.
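The confounding mechanism described above can be illustrated with a small calculation. The sketch below is not taken from the dissertation: the function name `allele_freq_by_status` and all numeric values are assumptions chosen for illustration. It computes the expected allele frequency among cases and among controls when a marker has no effect on disease within any subpopulation, but the subpopulations differ in both allele frequency and disease prevalence.

```python
def allele_freq_by_status(strata):
    """Expected allele frequency among cases and among controls.

    strata: list of (weight, allele_freq, prevalence) tuples, one per
    subpopulation (hypothetical inputs). Within each subpopulation the
    marker is assumed to be independent of disease status.
    """
    # Total probability mass of cases and of controls across subpopulations.
    case_mass = sum(w * k for w, p, k in strata)
    control_mass = sum(w * (1 - k) for w, p, k in strata)
    # Allele frequency in each group is a mixture of subpopulation
    # frequencies, weighted by each subpopulation's share of that group.
    case_freq = sum(w * k * p for w, p, k in strata) / case_mass
    control_freq = sum(w * (1 - k) * p for w, p, k in strata) / control_mass
    return case_freq, control_freq


# Two equally sized subpopulations (illustrative values only):
# A has allele frequency 0.8 and prevalence 0.20,
# B has allele frequency 0.2 and prevalence 0.05.
cases, controls = allele_freq_by_status([(0.5, 0.8, 0.20), (0.5, 0.2, 0.05)])
print(f"cases: {cases:.3f}, controls: {controls:.3f}")
```

Under these assumed values the pooled case and control allele frequencies differ even though the marker is unrelated to disease inside each subpopulation, which is the kind of spurious signal that adjusting for individual ancestry is meant to remove.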

We evaluate several approaches to improving the accuracy of the IAEs, using the methods implemented in Structure (Pritchard, Stephens, and Donnelly 2000) and the principal components approach (Zhang et al. 2002). First, we consider whether using prior information to preselect the prior admixture distribution parameter (α) improves the IAEs. We show that the IAEs are insensitive to the preselected α parameter. Second, we assess the importance of including pseudo-ancestral subjects (PAs) during the inference process and conclude that including PAs does not improve the accuracy of the IAEs when moderately or highly ancestry-informative markers are used. Third, we determine the number of markers required to obtain accurate IAEs, given the absolute allele frequency difference (δ) between parental populations at the preselected SNPs, the level of divergence between the parental populations, and the genetic contribution of each parental population to the admixed sample. We show that the number of SNPs needed to infer accurate IAEs depends not only on the distribution of δ values, but also on the range of ancestry contribution from the population that contributes less to the mixture. Finally, we determine whether combining sociocultural information (e.g., great-grandparental origin) with genetic information to infer a genetic background variable reduces the number of false-positive results. We find no statistically significant difference in the number of false-positive results when great-grandparental origin is incorporated with the SNP data to derive the genetic background variable for each subject. Our findings will improve study design for controlling population stratification in association studies of admixed populations.
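To make the roles of δ and the number of markers concrete, here is a minimal sketch of estimating one individual's admixture proportion by maximum likelihood over a grid. It is an illustration under stated assumptions, not the Structure algorithm or the principal components approach evaluated in the dissertation: it assumes two parental populations with known allele frequencies, independent SNPs, and genotypes drawn as two independent allele copies. The function names (`log_likelihood`, `estimate_admixture`, `simulate_genotypes`) and the numeric values are hypothetical.

```python
import math
import random


def log_likelihood(q, genotypes, p1, p2):
    """Log-likelihood of admixture proportion q from parental population 1.

    genotypes: allele counts (0, 1, or 2) per SNP.
    p1, p2: assumed-known allele frequencies in the two parental populations,
    each strictly between 0 and 1.
    """
    total = 0.0
    for g, a, b in zip(genotypes, p1, p2):
        # Expected allele frequency for an individual with ancestry q.
        f = q * a + (1.0 - q) * b
        total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total


def estimate_admixture(genotypes, p1, p2, grid_size=100):
    """Grid-search maximum-likelihood estimate of q on [0, 1]."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(grid, key=lambda q: log_likelihood(q, genotypes, p1, p2))


def simulate_genotypes(q, p1, p2, rng):
    """Draw one genotype per SNP as two independent allele copies."""
    genotypes = []
    for a, b in zip(p1, p2):
        f = q * a + (1.0 - q) * b
        genotypes.append(sum(rng.random() < f for _ in range(2)))
    return genotypes


# Illustrative run: 500 SNPs, each with delta = |0.9 - 0.1| = 0.8,
# and a true ancestry proportion of 0.3 from population 1.
rng = random.Random(0)
p1, p2 = [0.9] * 500, [0.1] * 500
genotypes = simulate_genotypes(0.3, p1, p2, rng)
print(estimate_admixture(genotypes, p1, p2))
```

Under this simplified model, each SNP contributes information about q in proportion to δ², so shrinking δ or the number of SNPs in the run above widens the spread of the estimate. That is consistent with the dependence on the distribution of δ values described in the abstract, although the sketch does not reproduce the dissertation's analysis.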

Katrina Goddard, PhD (Committee Chair)
Ronald Blanton, PhD, MD (Committee Member)
Robert Elston, PhD (Committee Member)
Luo Yuqun, PhD (Committee Member)
Zhu Xiaofeng, PhD (Committee Member)
131 p.

Recommended Citations

  • Parrado, T. (2008). Improved Individual Ancestry Estimates for Proper Adjustment of Ancestral Confounding in Association Analysis [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1216340419

    APA Style (7th edition)

  • Parrado, Tony. Improved Individual Ancestry Estimates for Proper Adjustment of Ancestral Confounding in Association Analysis. 2008. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1216340419.

    MLA Style (8th edition)

  • Parrado, Tony. "Improved Individual Ancestry Estimates for Proper Adjustment of Ancestral Confounding in Association Analysis." Doctoral dissertation, Case Western Reserve University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=case1216340419

    Chicago Manual of Style (17th edition)