Skip to Main Content
Frequently Asked Questions
Submit an ETD
Global Search Box
Need Help?
Keyword Search
Participating Institutions
Advanced Search
School Logo
Files
File List
zhengchao_s_dissertation-33.pdf (19.48 MB)
ETD Abstract Container
Abstract Header
Distances within and between Metric Spaces: Metric Geometry, Optimal Transport and Applications to Data Analysis
Author Info
Wan, Zhengchao
ORCID® Identifier
http://orcid.org/0000-0003-4388-6991
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1635979369662852
Abstract Details
Year and Degree
2021, Doctor of Philosophy, Ohio State University, Mathematics.
Abstract
Distance functions are important in data analysis for modelling dissimilarity between data points. Utilizing tools from metric geometry and optimal transport, we study foundations of data analysis by examining various distances between and within metric (measure) spaces. The dissertation can be roughly divided into three parts. In the first part, we study problems related to the computational aspect of the Gromov-Hausdorff distance $d_\mathcal{GH}$. $d_\mathcal{GH}$ is a natural distance for comparing metric spaces which satisfies many pleasing mathematical properties. However, the distance is known to be NP-hard to compute in general. Despite this hardness, we devise a fixed-parameter tractable algorithm for computing $d_\mathcal{GH}$ between ultrametric spaces. In the course of discovering this algorithm, we identify a one parameter family of Gromov-Hausdorff type distances $\{d_{\mathcal{GH},p}\}_{p=1}^\infty$ such that $d_{\mathcal{GH},1}=d_\mathcal{GH}$ and $d_{\mathcal{GH},\infty}$, also denoted by $u_\mathcal{GH}$, is an ultrametric itself on the collection $\mathcal{U}$ of compact ultrametric spaces. Whereas for any $p\in[1,\infty)$, $d_{\mathcal{GH},p}$ is NP-hard to compute, we establish that $u_\mathcal{GH}$ can be computed in polynomial time. We further study geometric properties of the space $(\mathcal{U},u_\mathcal{GH})$ and establish universality of the space. There exist two natural analogues of $d_\mathcal{GH}$ in the case of metric measure spaces, namely, the Gromov-Wasserstein distance and Sturm's Gromov-Wasserstein distance. We consider their ultrametric versions using definitions in analogy to $u_\mathcal{GH}$ and study both of their theoretical and experimental behaviors. In the second part, we establish characterizations for Gromov-Hausdorff type geodesics. More precisely, we prove that every Gromov-Hausdorff geodesic is, in fact, a Hausdorff geodesic. This in turn motivated us to study Hausdorff geodesics. We then provide a complete characterization of Hausdorff geodesics and prove that every Hausdorff geodesic is equivalent to a Hausdorff displacement interpolation, which is a notion we adapted from the famous displacement interpolation in optimal transport. Finally, we extend our study to geodesics between metric measure spaces and provide a characterization for a large family of Sturm's Gromov-Wasserstein geodesics. In the final part, we study an unsupervised learning framework called Wasserstein transform (WT), which permits metric spaces to be iteratively updated via local information in order to denoise data sets or enhance features. We propose several instances under the framework and show that the famous mode seeking mean shift algorithm fits into our framework. Theoretically, we establish stability results for the proposed instances of WT and reveal a connection between WT and the classic Ricci flow. Experimentally, our methods turn out to be useful for resolving the notorious chaining effect from hierarchical clustering and for other important tasks such as image segmentation and natural language processing. This dissertation is written as a comprehensive version of the author’s works [130, 139, 97, 132, 126, 193, 131] during his PhD study.
Committee
Facundo Mémoli (Advisor)
David Sivakoff (Committee Member)
Matthew Kahle (Committee Member)
Pages
444 p.
Subject Headings
Mathematics
Keywords
metric space
;
ultrametric space
;
Gromov-Hausdorff distance
;
Gromov-Wasserstein distance
;
optimal transport
Recommended Citations
Refworks
EndNote
RIS
Mendeley
Citations
Wan, Z. (2021).
Distances within and between Metric Spaces: Metric Geometry, Optimal Transport and Applications to Data Analysis
[Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1635979369662852
APA Style (7th edition)
Wan, Zhengchao.
Distances within and between Metric Spaces: Metric Geometry, Optimal Transport and Applications to Data Analysis.
2021. Ohio State University, Doctoral dissertation.
OhioLINK Electronic Theses and Dissertations Center
, http://rave.ohiolink.edu/etdc/view?acc_num=osu1635979369662852.
MLA Style (8th edition)
Wan, Zhengchao. "Distances within and between Metric Spaces: Metric Geometry, Optimal Transport and Applications to Data Analysis." Doctoral dissertation, Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1635979369662852
Chicago Manual of Style (17th edition)
Abstract Footer
Document number:
osu1635979369662852
Download Count:
184
Copyright Info
© 2021, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.