Files
akron1216313412.pdf (473.33 KB)
Protein Function Prediction Using Decision Tree Technique
Author Info
Yedida, Venkata Rama Kumar Swamy
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=akron1216313412
Abstract Details
Year and Degree
2008, Master of Science, University of Akron, Computer Science.
Abstract
The Human Genome Project and numerous other genome projects have produced a large and ever-increasing amount of sequence data. One of the main research challenges in the post-genomic era is to understand the relationship between the nucleotide sequences of genes and the functions of the proteins they encode. The objective of this thesis is to develop an automated protein function prediction system that is based on a set of homologous proteins and gene ontology categories. A novel measure based on a set of best local alignments is used to identify the homologues. The biological functions of the homologous proteins are characterized with gene ontology annotations. The protein function prediction is performed based on data mining models using decision trees. The models are trained and tested using the complete proteome of the model organism yeast. The results show that the prediction accuracy depends on individual functional groups of proteins. There is a general trend of decreased model accuracy with the level of a group on the gene ontology graph, but the accuracy at a fixed level varies from group to group. The prediction accuracy varies from group to group, with no obvious change in accuracy from one level to another. These variations in accuracy illustrate certain limitations of sequence-based protein function prediction methods. However, the fundamental assumption used in this thesis, that similar amino acid sequences imply similar biological functions, is largely valid. The prediction results based on the proteome of yeast indicate that the accuracies for most of the functional groups are over 75%. We conclude that the decision tree model can be used as a preliminary tool for protein function prediction, although the prediction results need to be verified through other means.
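The abstract does not say which decision tree algorithm or which input features the thesis uses, so the following is only a minimal sketch of the general idea, not the author's method. It assumes a plain Gini-impurity binary tree and two hypothetical features per protein: the fraction of its homologues annotated with a given gene ontology category, and its best local alignment score. The toy values are invented for illustration and are not data from the thesis.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best split, or None."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build a tree; a leaf is simply the majority label."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no feature has two distinct values
        return majority
    f, t, _ = split
    left = [(row, y) for row, y in zip(rows, labels) if row[f] <= t]
    right = [(row, y) for row, y in zip(rows, labels) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Invented toy data (hypothetical features, not from the thesis):
# [fraction of homologues annotated with the category, best local alignment score]
rows = [[0.9, 310.0], [0.8, 120.0], [0.7, 95.0],
        [0.2, 300.0], [0.1, 80.0], [0.0, 150.0]]
# True means the protein itself belongs to the functional group.
labels = [True, True, True, False, False, False]

tree = build_tree(rows, labels)
print(predict(tree, [0.85, 100.0]))
print(predict(tree, [0.05, 400.0]))
```

In this sketch one tree is trained per functional group, which fits the abstract's observation that accuracy is reported group by group; whether the thesis trains its models that way is an assumption.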
Committee
Zhong-Hui Duan (Advisor)
Pages
89 p.
Subject Headings
Bioinformatics
Keywords
protein function prediction; decision tree; data mining; gene ontology; sequence similarity
Recommended Citations
Yedida, V. R. K. S. (2008). Protein Function Prediction Using Decision Tree Technique [Master's thesis, University of Akron]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=akron1216313412
APA Style (7th edition)
Yedida, Venkata Rama Kumar Swamy. Protein Function Prediction Using Decision Tree Technique. 2008. University of Akron, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=akron1216313412.
MLA Style (8th edition)
Yedida, Venkata Rama Kumar Swamy. "Protein Function Prediction Using Decision Tree Technique." Master's thesis, University of Akron, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=akron1216313412
Chicago Manual of Style (17th edition)
Document number:
akron1216313412
Download Count:
936
Copyright Info
© 2008, all rights reserved.
This open access ETD is published by University of Akron and OhioLINK.