On the low dimensional projections of high-dimensional point clouds

Andrea Montanari, Stanford University
Fine Hall 224

Given a cloud of $n$ data points in $\mathbb{R}^d$, consider all projections onto $m$-dimensional subspaces of $\mathbb{R}^d$ and, for each such projection, the empirical distribution of the projected points. What does this collection of probability distributions look like when $n,d$ grow large? We consider this question under the model in which the points are i.i.d. standard Gaussian vectors, focusing on the asymptotic regime in which $n,d$ diverge, with $n/d$ converging to a finite non-zero value, while $m$ is fixed. Denoting by $F_m$ the set of probability distributions in $\mathbb{R}^m$ that arise as low-dimensional projections in this limit, I will present new inner and outer bounds on $F_m$. In particular, these bounds determine the Wasserstein radius of $F_m$ up to logarithmic factors, and determine it exactly for $m=1$. I will also present bounds in terms of Kullback-Leibler divergence and Rényi information dimension.
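To make the setup concrete, here is a minimal numerical sketch, not taken from the talk. It draws an i.i.d. standard Gaussian cloud with $n/d = 2$ and compares two one-dimensional projections ($m=1$): one onto a direction chosen independently of the data, and one onto a data-dependent direction. The sizes, the helper names `random_projection` and `second_moment`, and the choice of the top singular direction are all assumptions made for illustration.

```python
import numpy as np

def random_projection(d, m, rng):
    """Return a d x m matrix with orthonormal columns (illustrative helper)."""
    G = rng.standard_normal((d, m))
    Q, _ = np.linalg.qr(G)  # reduced QR: Q has shape (d, m)
    return Q

def second_moment(Y):
    """Empirical second moment of the projected points."""
    return float(np.mean(Y ** 2))

rng = np.random.default_rng(0)
n, d, m = 800, 400, 1  # assumed sizes, n/d = 2
X = rng.standard_normal((n, d))  # rows are i.i.d. standard Gaussian points

# Direction drawn independently of the data: the projected cloud
# should look approximately standard Gaussian.
W_rand = random_projection(d, m, rng)
Y_rand = X @ W_rand

# Data-dependent direction: the top right singular vector of X,
# which maximizes the empirical second moment of the projection.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W_top = Vt[:m].T
Y_top = X @ W_top

m2_rand = second_moment(Y_rand)
m2_top = second_moment(Y_top)
print(f"second moment, random direction: {m2_rand:.3f}")
print(f"second moment, top singular direction: {m2_top:.3f}")
```

The data-independent direction should give a second moment close to 1, while the data-dependent direction gives a value noticeably above 1. This is one simple way to see that, when $n$ and $d$ are proportional, the set of achievable projected distributions contains more than the standard Gaussian.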

The previous question has applications to unsupervised learning methods, such as projection pursuit and independent component analysis. We introduce a version of the same problem that is relevant for supervised learning. As an application, we establish an upper bound on the interpolation threshold of two-layer neural networks with $m$ hidden neurons.

[Based on joint work with Kangjie Zhou]