Computational Prediction of Protein-Protein Interactions on the Proteomic Scale Using Bayesian Ensemble of Multiple Feature Databases

2011, Doctor of Philosophy, University of Akron, Biomedical Engineering.

In the post-genomic world, one of the most important and challenging problems is to understand protein-protein interactions (PPIs) on a large scale. PPIs are integral to the mechanisms underlying most fundamental cellular processes. Experimental methods such as protein affinity chromatography, affinity blotting, and immunoprecipitation have traditionally been used to detect PPIs on a small scale. More recently, high-throughput methods have made an increasing amount of PPI data available. However, these data contain a significant number of false positives and false negatives, and the PPIs pooled from different methods show little overlap, which severely limits their reliability. Because of these limitations, computational prediction is emerging as a way to narrow down the set of putative PPIs.

In this dissertation, a novel computational predictor was devised to predict PPIs with high accuracy. The predictor integrates a number of proteomic features derived from biological databases. The features chosen for this research were gene expression, gene ontology, MIPS functions, sequence patterns such as motifs and domains, and protein essentiality. Although these features have little or no correlation with each other, each bears some relationship to the ability of proteins to interact. Novel feature-specific approaches were therefore devised to characterize that relationship. Approaches based on text mining and network topology were also studied. Gold Standard data comprising high-confidence PPIs and non-PPIs were used as evidence of interaction or the lack thereof.

The predictive power of the individual features was integrated using Bayesian methods. The average accuracy, based on 10-fold cross-validation, was found to be 0.9396. Because all the features are computed on the proteomic scale, the Bayesian integration yields likelihood values for all possible combinations of proteins in the proteome. This has the added benefit that putative PPIs can be listed in decreasing order of confidence, as measured by their likelihood values.
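The integration step described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a naive Bayes combination in which each feature is discretized into bins, each bin receives a likelihood ratio estimated from Gold Standard PPIs and non-PPIs, and the ratios are multiplied under an assumption of conditional independence between features. The function names, feature names (`coexpr`, `shared_go`), bin labels, pseudocount, and toy data are all hypothetical, and candidates are treated as protein pairs for simplicity.

```python
from collections import Counter
from math import prod


def likelihood_ratios(pos_bins, neg_bins, pseudocount=1.0):
    """Per-bin ratio P(bin | PPI) / P(bin | non-PPI) for a single feature.

    pos_bins / neg_bins hold the bin label of every gold-standard
    positive / negative example. The pseudocount (an assumption of this
    sketch) avoids division by zero for bins seen in only one class.
    """
    bins = set(pos_bins) | set(neg_bins)
    pos_counts, neg_counts = Counter(pos_bins), Counter(neg_bins)
    pos_total = len(pos_bins) + pseudocount * len(bins)
    neg_total = len(neg_bins) + pseudocount * len(bins)
    return {
        b: ((pos_counts[b] + pseudocount) / pos_total)
        / ((neg_counts[b] + pseudocount) / neg_total)
        for b in bins
    }


def combined_score(pair_bins, tables):
    """Multiply per-feature ratios; bins never seen in training count as 1.0."""
    return prod(tables[f].get(b, 1.0) for f, b in pair_bins.items())


# Toy gold-standard data, invented purely for illustration.
gold_pos = {"coexpr": ["high", "high", "high", "low"],
            "shared_go": ["yes", "yes", "yes", "no"]}
gold_neg = {"coexpr": ["low", "low", "low", "high"],
            "shared_go": ["no", "no", "no", "yes"]}

# One likelihood-ratio table per feature.
tables = {f: likelihood_ratios(gold_pos[f], gold_neg[f]) for f in gold_pos}

# Hypothetical candidate pairs with their binned feature values.
candidates = {
    ("P1", "P2"): {"coexpr": "high", "shared_go": "yes"},
    ("P1", "P3"): {"coexpr": "high", "shared_go": "no"},
    ("P2", "P3"): {"coexpr": "low", "shared_go": "no"},
}

scores = {pair: combined_score(b, tables) for pair, b in candidates.items()}

# Putative PPIs in decreasing order of combined likelihood ratio.
ranked = sorted(scores, key=scores.get, reverse=True)
for pair in ranked:
    print(pair, round(scores[pair], 3))
```

Multiplying ratios is only justified when the features are close to conditionally independent, which is consistent with the abstract's remark that the chosen features have little or no correlation with each other. A ratio above 1 favors interaction and a ratio below 1 favors non-interaction.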

Integration of the novel PPIs with other relevant biological information, using a Semantic Web representation, was examined as a way to better understand the underlying mechanisms of diseases and to identify novel targets for drug discovery.

Dr. Dale H. Mugler (Advisor)
Dr. Daniel B. Sheffer (Committee Member)
Dr. George C. Giakos (Committee Member)
Dr. Amy Milsted (Committee Member)
Dr. Daniel L. Ely (Committee Member)

Recommended Citations

  • Kumar, V. (2011). Computational Prediction of Protein-Protein Interactions on the Proteomic Scale Using Bayesian Ensemble of Multiple Feature Databases [Doctoral dissertation, University of Akron]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=akron1322489637

    APA Style (7th edition)

  • Kumar, Vivek. Computational Prediction of Protein-Protein Interactions on the Proteomic Scale Using Bayesian Ensemble of Multiple Feature Databases. 2011. University of Akron, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=akron1322489637.

    MLA Style (8th edition)

  • Kumar, Vivek. "Computational Prediction of Protein-Protein Interactions on the Proteomic Scale Using Bayesian Ensemble of Multiple Feature Databases." Doctoral dissertation, University of Akron, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=akron1322489637

    Chicago Manual of Style (17th edition)