How could a Superhuman AI mathematician come about?
Can AI systems exceed the capabilities of the human experts who provided their training data? The talk will examine the hypothesis of AI self‑improvement, involving mechanisms such as synthetic data generation, reinforcement learning, and tool‑augmented reasoning with formal verification loops.
I will also present recent work at Princeton, including the Gödel Prover V2 for Lean‑based theorem proving and a new inference pipeline that achieved state‑of‑the‑art performance (at the time of evaluation) on IMO‑ProofBench (Advanced) at moderate inference cost ($20–$30 per problem). These results illustrate how AI systems can sometimes escape "cognitive wells," that is, local optima in a model's reasoning capabilities. While providing evidence for the feasibility of self‑improvement, they also highlight important hurdles and open questions.