Files
wright1279304144.pdf (1.51 MB)
PATTERNS OF DIPEPTIDE USAGE FOR GENE PREDICTION
Author Info
Gangadharaiah, Dayananda Sagar
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=wright1279304144
Abstract Details
Year and Degree
2010, Master of Science in Computer Engineering (MSCE), Wright State University, Computer Engineering.
Abstract
As the number of completely sequenced genomes continues to grow rapidly, the identification of gene regions in DNA sequence data remains one of the most important open problems in bioinformatics. Improving the accuracy of gene-finding tools by even a small percentage would change the predictions for many genes in an organism (Zhu et al., 2010). This thesis presents a novel approach for identifying the coding regions of a genome based on dipeptide usage. Patterns in dipeptide usage are used to discriminate between coding and non-coding DNA regions. Two-sample t-tests are used as tests of significance to determine which dipeptides differ significantly in their occurrence between coding and non-coding regions. The methods are tested primarily on the Escherichia coli 536 genome, where they reached an accuracy of 96.5% in identifying coding regions and 100% in identifying non-coding regions. The classifier trained on the Escherichia coli 536 genome is then used to predict the coding and non-coding regions of the Salmonella enterica subsp. enterica serovar Typhi genome. These experiments showed an accuracy of 79.5% in predicting coding regions and 100% in predicting non-coding regions of the Salmonella enterica subsp. enterica serovar Typhi genome.
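The general idea in the abstract, comparing dipeptide usage between two groups of sequences with a two-sample t-test, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the function names (`translate`, `dipeptide_frequencies`, `welch_t`), the use of overlapping dipeptides, the standard genetic code, and the choice of Welch's unequal-variance form of the t statistic.

```python
from collections import Counter
from itertools import product
from math import sqrt
from statistics import mean, variance

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in reading frame 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def dipeptide_frequencies(protein):
    """Relative frequency of each overlapping amino-acid pair (an assumption)."""
    pairs = [protein[i:i + 2] for i in range(len(protein) - 1)]
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {dp: n / len(pairs) for dp, n in counts.items()}


def welch_t(a, b):
    """Welch two-sample t statistic for two samples of per-sequence frequencies.

    Each sample needs at least two values, and the two samples must not both
    have zero variance (otherwise the denominator is zero).
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


if __name__ == "__main__":
    # Toy sequences, purely hypothetical; real input would be annotated regions.
    protein = translate("ATGGCCAAAATGGCC")
    print(protein)
    print(dipeptide_frequencies(protein))
```

In use, one would compute the frequency of a given dipeptide in each coding and each non-coding sequence, pass the two lists to `welch_t`, and keep the dipeptides whose statistic exceeds a chosen significance threshold. How the thesis selects thresholds and builds its classifier is not stated in the abstract.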
Committee
Travis Doom, PhD (Advisor)
Michael Raymer, PhD (Committee Member)
Sridhar Ramachandran, PhD (Committee Member)
Pages
119 p.
Subject Headings
Bioinformatics
Keywords
DIPEPTIDE; coding regions; coding; endl; coding and non-coding regions; coding and non-coding; char
Citations
APA Style (7th edition)
Gangadharaiah, D. S. (2010). PATTERNS OF DIPEPTIDE USAGE FOR GENE PREDICTION [Master's thesis, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1279304144
MLA Style (8th edition)
Gangadharaiah, Dayananda Sagar. PATTERNS OF DIPEPTIDE USAGE FOR GENE PREDICTION. 2010. Wright State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1279304144.
Chicago Manual of Style (17th edition)
Gangadharaiah, Dayananda Sagar. "PATTERNS OF DIPEPTIDE USAGE FOR GENE PREDICTION." Master's thesis, Wright State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=wright1279304144
Document number:
wright1279304144
Download Count:
768
Copyright Info
© 2010, all rights reserved.
This open access ETD is published by Wright State University and OhioLINK.