Protein Function Prediction Using Decision Tree Technique

Yedida, Venkata Rama Kumar Swamy

2008, Master of Science, University of Akron, Computer Science.
The Human Genome Project and numerous other genome projects have produced a large and ever-increasing amount of sequence data. One of the main research challenges in the post-genomic era is to understand the relationship between the nucleotide sequences of genes and the functions of the proteins they encode. The objective of this thesis is to develop an automated protein function prediction system that is based on a set of homologous proteins and Gene Ontology categories. A novel measure based on a set of best local alignments is used to identify the homologues. The biological functions of the homologous proteins are characterized with Gene Ontology annotations. Protein function prediction is performed with data mining models based on decision trees. The models are trained and tested using the complete proteome of the model organism yeast. The results show that the prediction accuracy depends on the individual functional groups of proteins. There is a general trend of decreased model accuracy with the level of a group on the Gene Ontology graph, but the accuracy at a fixed level varies from group to group. The prediction accuracy varies from group to group, with no obvious accuracy changes from one level to another. These variations in accuracy illustrate certain limitations of sequence-based protein function prediction methods. However, the fundamental assumption used in this thesis, that similar amino acid sequences imply similar biological functions, is largely valid. The prediction results based on the proteome of yeast indicate that the accuracies for most of the functional groups are over 75%. We conclude that the decision tree model can be used as a preliminary tool for protein function prediction, although the prediction results need to be verified through other means.
Zhong-Hui Duan (Advisor)
89 p.

Recommended Citations

  • Yedida, V. R. K. S. (2008). Protein Function Prediction Using Decision Tree Technique [Master's thesis, University of Akron]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=akron1216313412

    APA Style (7th edition)

  • Yedida, Venkata Rama Kumar Swamy. Protein Function Prediction Using Decision Tree Technique. 2008. University of Akron, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=akron1216313412.

    MLA Style (8th edition)

  • Yedida, Venkata Rama Kumar Swamy. "Protein Function Prediction Using Decision Tree Technique." Master's thesis, University of Akron, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=akron1216313412

    Chicago Manual of Style (17th edition)