Files
Powell with signature.pdf (1.42 MB)
Amino Acid Properties Provide Insight to a Protein’s Subcellular Location
Author Info
Powell, Brian T
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=ysu1484694077480789
Abstract Details
Year and Degree
2016, Master of Computing and Information Systems, Youngstown State University, Department of Computer Science and Information Systems.
Abstract
Current approaches to predicting the subcellular locations of proteins have made some advances but are far from perfect. Accurately predicting a protein’s location results in better annotation of that protein and provides a clearer picture of its functions. We approach this problem by using a chaos game representation of the sequence based on physical and chemical properties of amino acids. We then split the resulting graph into two related discrete series, which are then subjected to wavelet transformation. The wavelet transformation data are then used as input for our classification algorithms. We observe how accurately each property predicts the correct subcellular location. We aim to exceed a threshold of 45 percent accuracy, which is the average of existing general subcellular predictors.
For our study, protein sequences were obtained from UniProt’s freely accessible repositories. We parsed data from five different classes, consisting of plant, fungal, mammal, human, and rodent proteins. We accommodate 10 subcellular locations: Nucleus, Membrane, Cytoplasm, Endoplasmic Reticulum, Secreted, Mitochondria, Cell Membrane, Vacuole, Golgi Apparatus, and Chloroplast. The 20 amino acids that make up protein sequences are sorted into four groups based on the selected amino acid property. These groups allow the sequence to be plotted using a two-dimensional chaos game representation. The resulting graph retains the sequence order in numerical form. Inspecting the graph by eye, we cannot deduce any information. To address this, we split the graph into two related discrete series based on the x-axis and y-axis. We then apply a 3-level Haar wavelet transformation to each series. Each level provides us with a detail coefficient vector the length of our sequence. For each detail coefficient vector we calculate the mean, min, max, and standard deviation. This provides us with 24 features (two series, three levels, four statistics each) to be used as input for classification.
We run a variety of classifiers to assess the importance of amino acid properties.
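The feature extraction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis code: the four-way grouping by side-chain polarity and charge, the corner assignment, the function names (chaos_game, haar_details, features), and the choice of an undecimated Haar transform with periodic wrap-around are all assumptions. The undecimated variant is chosen only because the abstract says each detail vector has the length of the sequence.

```python
import math
import statistics

# Illustrative four-way grouping by side-chain polarity/charge (an assumption,
# not necessarily a grouping used in the thesis). Each group gets one corner
# of the unit square.
GROUPS = {
    "AVLIPFWMG": (0.0, 0.0),  # nonpolar
    "STCYNQ": (0.0, 1.0),     # polar, uncharged
    "DE": (1.0, 0.0),         # acidic
    "KRH": (1.0, 1.0),        # basic
}
CORNER = {aa: corner for residues, corner in GROUPS.items() for aa in residues}


def chaos_game(sequence):
    """Return the x and y series of the chaos game walk.

    Starting at the centre of the square, each residue moves the point
    halfway toward the corner assigned to that residue's group.
    """
    x, y = 0.5, 0.5
    xs, ys = [], []
    for aa in sequence:
        cx, cy = CORNER[aa]  # raises KeyError for non-standard residues
        x, y = (x + cx) / 2, (y + cy) / 2
        xs.append(x)
        ys.append(y)
    return xs, ys


def haar_details(series, levels=3):
    """Undecimated Haar transform with periodic wrap-around.

    Returns one detail vector per level, each as long as the input series.
    """
    approx = list(series)
    n = len(approx)
    details = []
    for level in range(levels):
        step = 2 ** level  # spacing between paired samples doubles each level
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([(a - b) / math.sqrt(2) for a, b in zip(approx, shifted)])
        approx = [(a + b) / math.sqrt(2) for a, b in zip(approx, shifted)]
    return details


def features(sequence):
    """Mean, min, max and standard deviation per detail vector: 2 x 3 x 4 = 24."""
    xs, ys = chaos_game(sequence)
    feats = []
    for series in (xs, ys):
        for d in haar_details(series):
            feats += [statistics.fmean(d), min(d), max(d), statistics.pstdev(d)]
    return feats
```

Calling features("MKTAYIAKQRDE") on this made-up sequence returns a list of 24 floats that could be passed to a classifier. One caveat of this particular sketch: with periodic wrap-around the mean of each detail vector is zero up to rounding, so a different transform or boundary rule would be needed for the mean to carry information.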
Committee
Alina Lazar, PhD (Advisor)
Xiangjia Min, PhD (Committee Member)
Feng Yu, PhD (Committee Member)
Pages
29 p.
Subject Headings
Bioinformatics; Computer Science
Keywords
Subcellular Prediction; Machine Learning
Citations
Powell, B. T. (2016). Amino Acid Properties Provide Insight to a Protein’s Subcellular Location [Master's thesis, Youngstown State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1484694077480789
APA Style (7th edition)
Powell, Brian. Amino Acid Properties Provide Insight to a Protein’s Subcellular Location. 2016. Youngstown State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ysu1484694077480789.
MLA Style (8th edition)
Powell, Brian. "Amino Acid Properties Provide Insight to a Protein’s Subcellular Location." Master's thesis, Youngstown State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1484694077480789
Chicago Manual of Style (17th edition)
Document number: ysu1484694077480789
Download Count: 448
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by Youngstown State University and OhioLINK.