Mathematical Theory of Cryo-Electron Microscopy

Amit Singer, Princeton University
Fine Hall 314

The importance of determining the three-dimensional structures of large biological macromolecules was recognized by the Nobel Prize in Chemistry awarded this year to V. Ramakrishnan, T. Steitz and A. Yonath for studies of the structure and function of the ribosome. The standard procedure for structure determination of large molecules is X-ray crystallography, where the challenge often lies more in the crystallization itself than in the interpretation of the X-ray results, since many large proteins have so far withstood all attempts to crystallize them. In cryo-electron microscopy (cryo-EM), an alternative to X-ray crystallography, the sample of macromolecules is rapidly frozen in an ice layer so thin that the tomographic projections of the individual molecules taken by the electron microscope are typically disjoint. The cryo-EM imaging process produces a large collection of tomographic projections of the same molecule, corresponding to different and unknown projection orientations. The goal is to reconstruct the 3D structure of the molecule from these unlabeled 2D projection images; data sets typically contain 10^4 to 10^5 projection images, each roughly 100 x 100 pixels in size. I will present a new algorithm for finding the unknown imaging directions of all projections. Compared with existing algorithms, the advantages of the algorithm are fivefold: first, it has a small estimation error even for images of very low signal-to-noise ratio (SNR); second, it is extremely fast, as it involves only the computation of a few top eigenvectors of a specially designed symmetric matrix; third, it is non-sequential and uses the information in all images at once; fourth, it is amenable to rigorous mathematical analysis using the representation theory of the rotation group SO(3) and random matrix theory; finally, it is optimal in the sense that it reaches the information-theoretic Shannon bound up to a constant.
Time permitting, I will discuss generalizations of the algorithm and its mathematical analysis to other applications in computer vision, structural biology and dimensionality reduction.