A polynomial time algorithm for lossy population recovery

-
Ankur Moitra , IAS

We give a polynomial time algorithm for the lossy population recovery problem. In this problem, the goal is to approximately learn an unknown distribution on binary strings of length n from lossy samples: for some parameter μ each coordinate of the sample is preserved with probability μ and otherwise is replaced by a ‘?’. The running time and number of samples needed for our algorithm is polynomial in n and 1/ε for each fixed μ > 0. This improves on algorithm of Wigderson and Yehudayoff that runs in quasi-polynomial time for any μ > 0 and the polynomial time algorithm of Dvir et al which was shown to work for μ > 0.30 by Batman et al. The algorithm we analyze is implicit in previous work; our main contribution is to analyze the algorithm by showing (via linear programming duality and connections to complex analysis) that a certain matrix associated with the problem has a robust local inverse even though its condition number is exponentially small. A corollary of our result is the first polynomial time algorithm for learning DNFs in the restriction access model of Dvir et al and hence this model joins the random walk model of Bshouty et al as the only examples of passive learning in which DNFs can be learned in polynomial time. Joint work with Mike Saks