Skip to Main Content
 

Global Search Box

 
 
 
 

Files

ETD Abstract Container

Abstract Header

Distance-Based Indexing: Observations, Applications, and Improvements

Tasan, Murat

Abstract Details

2006, Doctor of Philosophy, Case Western Reserve University, Computing and Information Science.
Multidimensional indexing has long been an active research problem in computer science. Most solutions involve the mapping of complex data types to high-dimensional vectors of fixed length and applying either Spatial Access Methods (SAMs) or Point Access Methods (PAMs) to to the vectorized data. In more recent times, however, this approach has found its limitations. Much of the current data is either difficult to map to a fixed-length vector (such as arbitrary length strings), or maps only successfully to a very high number of dimensions. In both cases, Distance-Based Indexing serves as an attractive alternative, relying only on the pairwise distance information of data items to build indices that offer efficient similarity search retrieval. In this work, distance-based indexing is approached first in a general fashion, where a framework is laid out that encompasses both distance-based indexing methods as well as SAMs and PAMs. Shared properties of various seemingly unrelated data structures can be exploited, as is shown by the presentation of a single (and optimal) search algorithm that works on a variety of trees for a variety of different search types. The motivation for distance-based indexing is then shown via an application of indexing strings (biological sequences, to be exact). By simply showing that a distance function satisfies the properties of a metric, it is illustrated that many forms of data, with various distribution characteristics can successfully be indexed with distance-based indexing. Finally, a probabilistic approach towards indexing leads to an improved tree construction algorithm, as well as an information based search algorithm that searches the information stored in any data structure, regardless of the form (i.e. whether the structure is a tree or a matrix, the algorithm performs equally well).
Z. Meral Ozsoyoglu (Advisor)
198 p.

Recommended Citations

Citations

  • Tasan, M. (2006). Distance-Based Indexing: Observations, Applications, and Improvements [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1130976001

    APA Style (7th edition)

  • Tasan, Murat. Distance-Based Indexing: Observations, Applications, and Improvements. 2006. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1130976001.

    MLA Style (8th edition)

  • Tasan, Murat. "Distance-Based Indexing: Observations, Applications, and Improvements." Doctoral dissertation, Case Western Reserve University, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=case1130976001

    Chicago Manual of Style (17th edition)