Skip to Main Content
Frequently Asked Questions
Submit an ETD
Global Search Box
Need Help?
Keyword Search
Participating Institutions
Advanced Search
School Logo
Files
File List
PhD_Thesis_Kritika_Singhal.pdf (2.26 MB)
ETD Abstract Container
Abstract Header
Geometric Methods for Simplification and Comparison of Data Sets
Author Info
Singhal, Kritika
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1587253879303425
Abstract Details
Year and Degree
2020, Doctor of Philosophy, Ohio State University, Mathematics.
Abstract
Finite metric spaces are increasingly being used to model data sets that come equipped with a dissimilarity measure. Some examples of such data sets are DNA sequences, news articles, social networks and feature vectors in machine learning applications. Due to the ever increasing volume of data, the simplification of finite metric spaces is a fundamental problem in data analysis. In this thesis, new methods of simplification of compact metric spaces are introduced. There are three primary concepts used for simplification - fractal dimension, clustering and representative subspaces. The first concept is that of fractal dimension of a finite metric space. The effect of fractal dimension on the computational complexity of geometric problems such as the Traveling Salesman Problem, and the Independent Set of Unit Balls problem is studied. The running time lower bounds for these problems on metric spaces with non-integer dimension are computed and it is shown that these bounds nearly match previously known upper bounds. The second notion that is used for simplification of a compact metric space is clustering. Precisely, a new method for approximating a compact metric space with a k-point metric space is introduced. It is shown that approximating with k points is equivalent to certain notions of clustering of the input metric space into k blocks. Similar results are obtained for metric measure spaces. Using the equivalence results for metric spaces and metric measure spaces, polynomial time algorithms for computing the k-point approximations are developed. Another method of simplification of compact metric spaces is developed through identifying finite representative subspaces of the input metric space. The subspaces are representative because they store the entire metric information of the original metric space, up to a loss of epsilon. As a result, this method identifies a finite compression of the input metric space. A polynomial time exact algorithm is developed for computing a finite compression of smallest size, with zero loss, and a polynomial time approximation algorithm is developed for computing a finite compression of smallest size, with loss of epsilon greater than zero. Furthermore, a method for reconstructing the original metric space from the compressed metric space is described. Two transforms of the distance function of a metric measure space are also studied. These transforms are referred to as the Mass Transform and the Hyperbolicity Transform, as their definition uses sets of large mass and small hyperbolicity respectively. The theoretical and computational properties of these transforms are studied. Finally, new lower bounds on the Gromov-Hausdorff distance between compact metric spaces are computed using some of the simplification methods. New lower bounds on this distance are also computed using persistent homology and curvature sets of the input metric spaces.
Committee
Facundo Memoli (Advisor)
Anastasios Sidiropoulos (Advisor)
Tamal Dey (Committee Member)
David Sivakoff (Committee Member)
Pages
265 p.
Subject Headings
Applied Mathematics
;
Mathematics
Keywords
Finite Approximation, Finite Compression, Gromov-Hausdorff Distance, Clustering, Fractal Dimension
Recommended Citations
Refworks
EndNote
RIS
Mendeley
Citations
Singhal, K. (2020).
Geometric Methods for Simplification and Comparison of Data Sets
[Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587253879303425
APA Style (7th edition)
Singhal, Kritika.
Geometric Methods for Simplification and Comparison of Data Sets.
2020. Ohio State University, Doctoral dissertation.
OhioLINK Electronic Theses and Dissertations Center
, http://rave.ohiolink.edu/etdc/view?acc_num=osu1587253879303425.
MLA Style (8th edition)
Singhal, Kritika. "Geometric Methods for Simplification and Comparison of Data Sets." Doctoral dissertation, Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587253879303425
Chicago Manual of Style (17th edition)
Abstract Footer
Document number:
osu1587253879303425
Download Count:
473
Copyright Info
© 2020, some rights reserved.
Geometric Methods for Simplification and Comparison of Data Sets by Kritika Singhal is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Based on a work at etd.ohiolink.edu.
This open access ETD is published by The Ohio State University and OhioLINK.