APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT

CALENDER, CHRISTOPHER R

Keyword Search

School Logo

ucin1092929952.pdf (5.26 MB)

APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT

Author Info

CALENDER, CHRISTOPHER R

Permalink:

http://rave.ohiolink.edu/etdc/view?acc_num=ucin1092929952

Year and Degree

2004, MS, University of Cincinnati, Engineering : Computer Science.

Abstract

There are often times where an application needs to find the N nearest neighbors. Imagine the scene of an accident and someone needs to know all of the nearest fire and police departments. It may not be enough to only know one station; they may need to know the five nearest stations, or even more. That is where finding the N nearest neighbors is the perfect algorithm. Finding the N nearest neighbors is not a new concept, but doing this on distributed databases while reducing the traditional communication costs is new. There are existing ways to accomplish this by using a centralized method in which each individual database site transmits all of its data to one central machine, a learner machine, that then performs the necessary calculations. However, communication costs are very expensive, much more expensive than computational costs, and that is why there is a need for an algorithm that can perform finding the N nearest neighbors using distributed databases while minimizing communication costs. In this approach, more power is given to the individual database sites, and less to the central learning machine. What has been attempted is to allow the individual database sites to perform calculations on their data in hopes of gaining some kind of summary information that will allow a minimal amount of information to be transmitted to the learner machine rather than every point. And in fact, that is exactly what this newly proposed algorithm does. It has the individual database sites perform many calculations and then transmit only the relevant information to the learner machine. The learner machine also does computations and gathers information from all of the individual database sites. Once it has gathered some global information, it then sends out the important, relevant global information to each of the database sites. Then, in turn, each individual database site can return only the points that it feels will be relevant. This drastically reduces the amount of communication and, as one will see, works very well using real-world examples.

Committee

Dr. Raj Bhatnagar (Advisor)

Pages

118 p.

Subject Headings

Computer Science

Keywords

nearest neighbor; cluster; K-means; clustering; approximate nearest neighbor; approximate k-means

CALENDER, C. R. (2004). APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1092929952
APA Style (7th edition)
CALENDER, CHRISTOPHER. APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT. 2004. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1092929952.
MLA Style (8th edition)
CALENDER, CHRISTOPHER. "APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT." Master's thesis, University of Cincinnati, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1092929952
Chicago Manual of Style (17th edition)

Document number:

ucin1092929952

Download Count:

583

Copyright Info

Global Search Box

Files

File List

ETD Abstract Container

Abstract Header

APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT

Abstract Details

Recommended Citations

Citations

Abstract Footer

Global Footer

Ohio Department of Higher Education

State Government Links

Education Links

Global Search Box

Files

File List

ETD Abstract Container

Abstract Header

APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT

Abstract Details

Recommended CitationsRefworksEndNoteRISMendeley

Citations

Abstract Footer

Global Footer

Ohio Department of Higher Education

State Government Links

Education Links

Recommended Citations