Files
Dissertation.pdf (1.84 MB)
Topics in the Mathematics of Data Science
Author
Clum, Charles
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519
Year and Degree
2022, Doctor of Philosophy, Ohio State University, Mathematics.
Abstract
We consider several problems involving the mathematics of data science and compressed sensing. First, we extend the techniques of Hügel, Rauhut, and Strohmer [42] to give a construction of low-entropy random matrices that have non-uniform guarantees for compressed sensing by $\ell_{1}$ minimization. In particular, we show that for every $\delta\in(0,1]$, there exists an explicit random $m\times N$ partial Fourier matrix $A$ with $m\leq C_1(\delta)s\log^{4/\delta}(N/\epsilon)$ and entropy at most $C_2(\delta)s^\delta\log^5(N/\epsilon)$ such that for every $s$-sparse signal $x\in\mathbb{C}^N$, there exists an event of probability at least $1-\epsilon$ over which $x$ is the unique minimizer of $\|z\|_1$ subject to $Az=Ax$. The bulk of our analysis uses tools from decoupling to estimate the extreme singular values of the submatrix of $A$ whose columns correspond to the support of $x$. We continue by giving a Monte Carlo algorithm based on the Peng-Wei [74] relaxation of the $k$-means clustering problem to produce a high-confidence lower bound on the $k$-means objective. We provide numerical experiments on several datasets, and we prove a theoretical performance guarantee when data is drawn from a mixture of Gaussians. Next, we propose a Procrustes-type method for transfer learning, motivated by a classification problem for synthetic aperture radar (SAR) images. We give theoretical results that describe the sample complexity of the method and numerical results for the technique on a variety of datasets. We also apply the method to the SAR classification problem outlined in [58]. We conclude by analyzing the injectivity of single-layer and multi-layer ReLU networks with random weights. We consider the expansivity needed for ReLU layers with Gaussian weights to be injective with high probability, and we slightly improve on a bound given in [75]. We point out a connection to integral geometry as a future direction for research.
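The "Procrustes-type method for transfer learning" mentioned above builds on the classical orthogonal Procrustes problem: finding an orthogonal map that best aligns one set of features with another. The following is a minimal sketch of that classical subproblem only, not the thesis's full method; the function name and test data are illustrative. It uses the standard SVD-based solution, which recovers an exact rotation when one dataset is an orthogonal transform of the other.

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    # Find the orthogonal matrix Q minimizing ||X @ Q - Y||_F.
    # Classical solution: take the SVD of X^T Y = U S V^T and set Q = U V^T.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))

# Transform X by a random orthogonal matrix and check that it is recovered.
Q_true, _ = np.linalg.qr(rng.standard_normal((5, 5)))
Y = X @ Q_true
Q_hat = orthogonal_procrustes(X, Y)
print(np.allclose(Q_hat, Q_true))  # True
```

Here the alignment is exact because `Y` is an exact orthogonal image of `X`; with noisy or partially paired data, as in a transfer-learning setting, the same SVD step yields the best orthogonal fit in the Frobenius norm rather than an exact recovery.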
Committee
Dustin Mixon (Advisor)
Pages
149 p.
Subject Headings
Mathematics
Recommended Citations
APA Style (7th edition)
Clum, C. (2022). Topics in the Mathematics of Data Science [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519

MLA Style (8th edition)
Clum, Charles. Topics in the Mathematics of Data Science. 2022. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519.

Chicago Manual of Style (17th edition)
Clum, Charles. "Topics in the Mathematics of Data Science." Doctoral dissertation, Ohio State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519
Document number:
osu1638791911831519
Download Count:
205
Copyright Info
© 2022, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.