Files
Dissertation.pdf (1.84 MB)
Topics in the Mathematics of Data Science
Author
Clum, Charles
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519
Year and Degree
2022, Doctor of Philosophy, Ohio State University, Mathematics.
Abstract
We consider several problems involving the mathematics of data science and compressed sensing. First, we extend the techniques of Hügel, Rauhut, and Strohmer [42] to give a construction of low-entropy random matrices that have non-uniform guarantees for compressed sensing by $\ell_{1}$ minimization. In particular, we show that for every $\delta\in(0,1]$, there exists an explicit random $m\times N$ partial Fourier matrix $A$ with $m\leq C_1(\delta)s\log^{4/\delta}(N/\epsilon)$ and entropy at most $C_2(\delta)s^\delta\log^5(N/\epsilon)$ such that for every $s$-sparse signal $x\in\mathbb{C}^N$, there exists an event of probability at least $1-\epsilon$ over which $x$ is the unique minimizer of $\|z\|_1$ subject to $Az=Ax$. The bulk of our analysis uses tools from decoupling to estimate the extreme singular values of the submatrix of $A$ whose columns correspond to the support of $x$. We continue by giving a Monte Carlo algorithm based on the Peng-Wei [74] relaxation of the $k$-means clustering problem to produce a high-confidence lower bound on the $k$-means objective. We provide numerical experiments on several datasets, and we prove a theoretical performance guarantee when data is drawn from a mixture of Gaussians. Next, we propose a Procrustes-type method for transfer learning, motivated by a classification problem for synthetic aperture radar (SAR) images. We give theoretical results that describe the sample complexity of the method and numerical results for the technique on a variety of datasets. We also apply the method to the SAR classification problem outlined in [58]. We conclude by analyzing the injectivity of single-layer and multi-layer ReLU networks with random weights. We consider the expansivity needed for ReLU layers with Gaussian weights to be injective with high probability, and we slightly improve on a bound given in [75]. We point out a connection to integral geometry as a future direction for research.
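The "Procrustes-type method for transfer learning" mentioned above builds on the classical orthogonal Procrustes problem: finding an orthogonal map that best aligns one set of features with another. The following is a minimal sketch of that classical subproblem only, not the thesis's full method; the function name and test data are illustrative. It uses the standard SVD-based solution, which recovers an exact rotation when one dataset is an orthogonal transform of the other.

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    # Find the orthogonal matrix Q minimizing ||X @ Q - Y||_F.
    # Classical solution: take the SVD of X^T Y = U S V^T and set Q = U V^T.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))

# Transform X by a random orthogonal matrix and check that it is recovered.
Q_true, _ = np.linalg.qr(rng.standard_normal((5, 5)))
Y = X @ Q_true
Q_hat = orthogonal_procrustes(X, Y)
print(np.allclose(Q_hat, Q_true))  # True
```

Here the alignment is exact because `Y` is an exact orthogonal image of `X`; with noisy or partially paired data, as in a transfer-learning setting, the same SVD step yields the best orthogonal fit in the Frobenius norm rather than an exact recovery.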
Committee
Dustin Mixon (Advisor)
Pages
149 p.
Subject Headings
Mathematics
Recommended Citations
APA Style (7th edition)
Clum, C. (2022). Topics in the Mathematics of Data Science [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519

MLA Style (8th edition)
Clum, Charles. Topics in the Mathematics of Data Science. 2022. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519.

Chicago Manual of Style (17th edition)
Clum, Charles. "Topics in the Mathematics of Data Science." Doctoral dissertation, Ohio State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519
Document number:
osu1638791911831519
Download Count:
205
Copyright Info
© 2022, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.