Estimating the Covariance Spectrum

Tuesday, April 18, 2017 -
2:30pm to 3:30pm
Please note special day (Tuesday).   Suppose one wishes to accurately recover the set of eigenvalues of the covariance matrix of a distribution, given access to samples from the distribution. To what extent can this set be accurately approximated given an amount of data that is sublinear in the dimension of the space? The naive empirical estimate has poor performance when the number of samples is linear or sub-linear in the dimension of the data. In the "large sample, large dimension" setting, we proposed an efficient and information theoretically near-optimal algorithm to learn the moments of the covariance spectral distribution. Further we show that given the first k moments of a distribution, we can pin down the distribution in Earthmover distance up to an error of O(1/k). These two results combined allow us to efficiently and accurately learn the spectrum of the underlying covariance matrix, yielding a consistent spectrum estimator even with sub-linear number of samples.
Weihao Kong
Stanford University
Event Location: