Skip to Main Content
Frequently Asked Questions
Submit an ETD
Global Search Box
Need Help?
Keyword Search
Participating Institutions
Advanced Search
School Logo
Files
File List
case1130976001.pdf (1.1 MB)
ETD Abstract Container
Abstract Header
Distance-Based Indexing: Observations, Applications, and Improvements
Author Info
Tasan, Murat
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=case1130976001
Abstract Details
Year and Degree
2006, Doctor of Philosophy, Case Western Reserve University, Computing and Information Science.
Abstract
Multidimensional indexing has long been an active research problem in computer science. Most solutions involve the mapping of complex data types to high-dimensional vectors of fixed length and applying either Spatial Access Methods (SAMs) or Point Access Methods (PAMs) to to the vectorized data. In more recent times, however, this approach has found its limitations. Much of the current data is either difficult to map to a fixed-length vector (such as arbitrary length strings), or maps only successfully to a very high number of dimensions. In both cases, Distance-Based Indexing serves as an attractive alternative, relying only on the pairwise distance information of data items to build indices that offer efficient similarity search retrieval. In this work, distance-based indexing is approached first in a general fashion, where a framework is laid out that encompasses both distance-based indexing methods as well as SAMs and PAMs. Shared properties of various seemingly unrelated data structures can be exploited, as is shown by the presentation of a single (and optimal) search algorithm that works on a variety of trees for a variety of different search types. The motivation for distance-based indexing is then shown via an application of indexing strings (biological sequences, to be exact). By simply showing that a distance function satisfies the properties of a metric, it is illustrated that many forms of data, with various distribution characteristics can successfully be indexed with distance-based indexing. Finally, a probabilistic approach towards indexing leads to an improved tree construction algorithm, as well as an information based search algorithm that searches the information stored in any data structure, regardless of the form (i.e. whether the structure is a tree or a matrix, the algorithm performs equally well).
Committee
Z. Meral Ozsoyoglu (Advisor)
Pages
198 p.
Subject Headings
Computer Science
Keywords
distance-based indexing
;
probabilistic indexing
;
data structures
Recommended Citations
Refworks
EndNote
RIS
Mendeley
Citations
Tasan, M. (2006).
Distance-Based Indexing: Observations, Applications, and Improvements
[Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1130976001
APA Style (7th edition)
Tasan, Murat.
Distance-Based Indexing: Observations, Applications, and Improvements.
2006. Case Western Reserve University, Doctoral dissertation.
OhioLINK Electronic Theses and Dissertations Center
, http://rave.ohiolink.edu/etdc/view?acc_num=case1130976001.
MLA Style (8th edition)
Tasan, Murat. "Distance-Based Indexing: Observations, Applications, and Improvements." Doctoral dissertation, Case Western Reserve University, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=case1130976001
Chicago Manual of Style (17th edition)
Abstract Footer
Document number:
case1130976001
Download Count:
2,034
Copyright Info
© 2005, all rights reserved.
This open access ETD is published by Case Western Reserve University School of Graduate Studies and OhioLINK.