Learning Information Obfuscation

Guillermo Sapiro, Duke University
Fine Hall 224

Data collection and sharing are pervasive and inescapable aspects of modern society. This process can be either voluntary, as in the case of a person taking a facial image to unlock their phone (identity verification), or incidental, such as traffic cameras collecting video of pedestrians. An undesirable side effect of these processes is that shared data can carry information about attributes that users might consider sensitive. It is therefore desirable for both data collectors and users to design procedures that minimize sensitive information leakage. Balancing the competing objectives of providing meaningful service levels (individualized inference) while protecting sensitive information is still an open problem. In this work we address this as a distribution matching problem, where the goal is to learn a conditional distribution on the observed data, and formulate it as a constrained optimization problem. We show how to adapt this constrained optimization problem into an unconstrained adversarial game played by two (or more) adversarial networks in a principled, data-driven manner. This type of approach allows us to tackle hard-to-model tasks, such as preventing gender recognition from a face image while preserving subject verification. It also enables us to learn space-preserving conditional distributions that maintain the performance of existing algorithms, thus allowing those algorithms to be reused on the new, filtered data. We show bounds on the best possible performance of these sanitization algorithms that can be computed without modeling the data observation process, a characteristic that makes these bounds easy to compute a priori.

We illustrate the use of the framework on two use cases: subject-within-subject, where we tackle the problem of having a face identity detector collect data only on a consenting subset of users, an important application, for example, for mobile devices activated by face recognition; and emotion-and-gender, where we hide independent variables, as in the case of hiding gender while preserving emotion detection. Joint work with M. Bertran, M. Martinez, G. Reeves, and M. Rodrigues.

Guillermo Sapiro received his B.Sc. (summa cum laude), M.Sc., and Ph.D. from the Department of Electrical Engineering at the Technion, Israel Institute of Technology. After post-doctoral research at MIT, Dr. Sapiro became a Member of Technical Staff at the research facilities of HP Labs in Palo Alto, California. He was with the Department of Electrical and Computer Engineering at the University of Minnesota, where he held the position of Distinguished McKnight University Professor and Vincentine Hermes-Luh Chair in Electrical and Computer Engineering. Currently he is a James B. Duke Professor and the inaugural Microsoft Data Science Investigator with Duke University.

G. Sapiro was awarded the Gutwirth Scholarship for Special Excellence in Graduate Studies in 1991, the Ollendorff Fellowship for Excellence in Vision and Image Understanding Work in 1992, the Rothschild Fellowship for Post-Doctoral Studies in 1993, the Office of Naval Research Young Investigator Award in 1998, the Presidential Early Career Award for Scientists and Engineers (PECASE) in 1998, the National Science Foundation Career Award in 1999, and the National Security Science and Engineering Faculty Fellowship in 2010. He received the Test of Time Award at ICCV. He is a Fellow of IEEE, SIAM, and the American Academy of Arts and Sciences (AAAS). He was the founding Editor-in-Chief of the SIAM Journal on Imaging Sciences.