
Topics in the Mathematics of Data Science

2022, Doctor of Philosophy, Ohio State University, Mathematics.
We consider several problems involving the mathematics of data science and compressed sensing. First, we extend the techniques of Hügel, Rauhut and Strohmer [42] to give a construction of low-entropy random matrices that have non-uniform guarantees for compressed sensing by $\ell_{1}$ minimization. In particular, we show that for every $\delta\in(0,1]$, there exists an explicit random $m\times N$ partial Fourier matrix $A$ with $m\leq C_1(\delta)s\log^{4/\delta}(N/\epsilon)$ and entropy at most $C_2(\delta)s^\delta\log^5(N/\epsilon)$ such that for every $s$-sparse signal $x\in\mathbb{C}^N$, there exists an event of probability at least $1-\epsilon$ over which $x$ is the unique minimizer of $\|z\|_1$ subject to $Az=Ax$. The bulk of our analysis uses tools from decoupling to estimate the extreme singular values of the submatrix of $A$ whose columns correspond to the support of $x$. We continue by giving a Monte Carlo algorithm based on the Peng-Wei [74] relaxation of the $k$-means clustering problem to produce a high-confidence lower bound on the $k$-means objective. We provide numerical experiments on several datasets, and we prove a theoretical performance guarantee when data is drawn from a mixture of Gaussians. Next, we propose a Procrustes-type method for transfer learning, motivated by a classification problem for synthetic aperture radar (SAR) images. We give theoretical results that describe the sample complexity of the method and numerical results for the technique on a variety of datasets. We also apply the method to the SAR classification problem outlined in [58]. We conclude by analyzing the injectivity of single-layer and multi-layer ReLU networks with random weights. We consider the expansivity needed for ReLU layers with Gaussian weights to be injective with high probability, and we slightly improve on a bound given in [75]. We point out a connection to integral geometry as a future direction for research.
Dustin Mixon (Advisor)
149 p.

Recommended Citations

  • Clum, C. (2022). Topics in the Mathematics of Data Science [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519

    APA Style (7th edition)

  • Clum, Charles. Topics in the Mathematics of Data Science. 2022. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519.

    MLA Style (8th edition)

  • Clum, Charles. "Topics in the Mathematics of Data Science." Doctoral dissertation, Ohio State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=osu1638791911831519

    Chicago Manual of Style (17th edition)