# The month AI started doing real mathematics

> In May 2026, AI systems from OpenAI and Google produced genuinely new, checkable mathematics.

*Two labs, days apart, used AI to produce genuinely new maths. How they did it matters more than that they did it.*

By WireRead Editorial · WireRead
Canonical: https://wireread.com/news/ai-real-mathematics-erdos-may-2026

Inside a single week in May, two of the largest AI labs claimed something that would have read as science fiction a year ago: that their systems had contributed *original* mathematics — not summarised it, not retrieved it, but produced results human mathematicians had not. The temptation is to score it as a race. The more useful question is how each result was *checked*, because that is what separates a genuine advance from a confident-sounding paragraph.

## What OpenAI did

On **20 May**, OpenAI said a general-purpose reasoning model found a construction disproving the **planar unit-distance conjecture**, a problem Paul Erdős posed in **1946**. The surprise was not only the answer but the route. Instead of refining the long-assumed grid arrangement, the model reached into algebraic number theory, establishing a **super-linear lower bound** whose exponent stays strictly above 1 (on the order of *n^1.014*) for the number of unit distances among *n* points in the plane. That is the kind of move that signals reasoning rather than recall: there was no prior answer to memorise, because none existed.
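
To make the size of that claim concrete, the bounds can be set side by side. This is a sketch in standard notation, not a statement taken from OpenAI's write-up: *u(n)* (the maximum number of unit distances among *n* points in the plane) and the constants *c* and *C* are notation assumed here, and the exponent 1.014 is simply the figure quoted above.

```latex
% Requires amsmath. Assumed notation:
% u(n) = max number of unit distances determined by n points in the plane.
\begin{align*}
  u(n) &\ge n^{1 + c/\log\log n}
    && \text{(Erd\H{o}s's 1946 grid construction)} \\
  u(n) &\le n^{1 + o(1)}
    && \text{(the conjecture: the grid is essentially optimal)} \\
  u(n) &\ge C\, n^{1.014}
    && \text{(order of the bound reported in May 2026)}
\end{align*}
```

Because *c*/log log *n* shrinks to zero as *n* grows, the grid exponent drifts down toward 1, whereas a fixed exponent of about 1.014 does not. A bound of the third kind is therefore incompatible with the second line, which is what disproving the conjecture amounts to.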

Crucially, this was a *general* reasoning model, not a maths-specific engine. External mathematicians, among them Fields medallist **Timothy Gowers**, checked the write-up before it reached arXiv. That human review is real evidence, but it is a softer guarantee than a machine proof: expert eyes can miss a subtle gap, which is precisely why the still-pending formal peer review matters.

> OpenAI described the result as the first time a prominent open problem, central to a subfield of mathematics, has been solved autonomously by AI.
> — [OpenAI](https://openai.com/index/model-disproves-discrete-geometry-conjecture/), 2026-05-20

## What DeepMind did

Days later, Google DeepMind's **AlphaProof Nexus**, which pairs **Gemini 3.1 Pro** with the **Lean** proof assistant, reported solving *nine* open Erdős problems (two unsolved for **56 years**) plus **44 conjectures** from the On-Line Encyclopedia of Integer Sequences (OEIS), at a few hundred dollars of compute each. The system builds on DeepMind's earlier AlphaProof, which reached silver-medal level at the 2024 International Mathematical Olympiad; the jump here is from competition problems, solvable by talented humans in hours, to *research* problems with no such guarantee.

The two efforts took different paths to the same testbed, and the press framed it as a scoreboard — nine to one. That framing misses the point. The results are not the same *kind* of object, and the difference is the whole story.

> **Key:** **The throughline is verification.** DeepMind's proofs are checked line-by-line in **Lean**: a proof counts only if the formal checker accepts every step, the exact discipline a fluent-but-wrong chatbot lacks. The model writes in Lean's formal language; the checker rejects any flawed step and feeds the error back to the model. OpenAI's construction, by contrast, was *human*-verified and still awaits formal peer review. Same headline word, 'solved', but two very different guarantees.

## OpenAI vs DeepMind, side by side

The two milestones differ on every axis that matters for how much you should trust them. Set against each other:

| | OpenAI | Google DeepMind |
| --- | --- | --- |
| **What was claimed** | Disproved Erdős's 1946 unit-distance conjecture | Solved 9 open Erdős problems + 44 OEIS conjectures |
| **Approach** | General reasoning model; algebraic number theory | Gemini 3.1 Pro paired with the Lean proof assistant |
| **Verification** | Human-checked (incl. Timothy Gowers) | Machine-checked, every step, in Lean |
| **What is proven** | One construction; **peer review pending** | Each accepted proof is formally certified |
| **Main caveat** | Formal review still to come | Narrow domain; Hassabis: 'still not AGI' |

Read the bottom rows, not the count. A Lean certificate is a stronger object than a human read-through — which is why DeepMind's nine, individually less glamorous than disproving a famous conjecture, are in one sense the more solid result.

> Hassabis moved quickly to temper expectations, saying the system is 'still not AGI' even as it points toward a more practical role for AI in verified mathematical research.
> — [WinBuzzer](https://winbuzzer.com/2026/05/26/google-deepmind-says-alphaproof-nexus-is-still-not-agi-xcxwbn/), 2026-05-26

## Why now, and what to watch

Why this month? Because Erdős's hundreds of open problems, catalogued at erdosproblems.com, have become the field's favourite proving ground: easy to state, impossible to fake, with no memorised answer to crib. The honest read is that this is a genuine step, with AI generating ideas a checker can certify, not a machine replacing mathematicians. The constraint that makes it trustworthy is the same one that keeps it grounded.

What to watch next is whether the *generate-then-verify* loop travels. A reliable pipeline of 'AI proposes, formal system certifies' neutralises AI's worst failure mode — confident wrongness — anywhere a claim can be mechanically checked. If that template spreads from maths to other formalisable corners of science, the lasting result of May 2026 will be the machinery, not any single proof.
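
The 'AI proposes, formal system certifies' loop can be sketched in a few lines of Python. This is a toy illustration, not either lab's pipeline: `generate_then_verify`, `Verdict`, `toy_propose` and `toy_check` are hypothetical names, the proposer is a hard-coded stand-in for a model, and the checker tests a factorisation of 91 in place of a formal proof.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    accepted: bool
    feedback: str  # error message handed back to the proposer on rejection


def generate_then_verify(
    propose: Callable[[list], Optional[tuple]],
    check: Callable[[tuple], Verdict],
    max_rounds: int = 10,
) -> Optional[tuple]:
    """Return the first candidate the checker accepts, or None if none is found."""
    feedback_log: list = []
    for _ in range(max_rounds):
        candidate = propose(feedback_log)
        if candidate is None:
            return None
        verdict = check(candidate)
        if verdict.accepted:
            return candidate  # only checker-approved output leaves the loop
        feedback_log.append(verdict.feedback)
    return None


TARGET = 91  # toy claim to certify: a nontrivial factorisation of 91


def toy_propose(feedback_log: list) -> tuple:
    # Stand-in for a model: guess divisors 2, 3, 4, ... one per rejected round.
    a = 2 + len(feedback_log)
    return (a, TARGET // a)


def toy_check(candidate: tuple) -> Verdict:
    # Mechanical check: accepts only a genuine nontrivial factorisation.
    a, b = candidate
    if a > 1 and b > 1 and a * b == TARGET:
        return Verdict(True, "ok")
    return Verdict(False, f"{a} * {b} = {a * b}, not {TARGET}")


result = generate_then_verify(toy_propose, toy_check, max_rounds=20)
print(result)
```

The design point is that the loop returns only what `toy_check` accepts, so a wrong guess can cost rounds but cannot get through.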

## Key takeaways

- OpenAI (20 May) said a general reasoning model — not a maths-specific system — disproved a conjecture open since 1946, using algebraic number theory rather than the long-assumed grid.
- Days later, Google DeepMind's AlphaProof Nexus solved nine open Erdős problems plus 44 OEIS conjectures by a different method, at a few hundred dollars of compute each.
- The DeepMind result is machine-verified in Lean: a proof counts only if the formal checker accepts every step — exactly the discipline fluent-but-wrong models lack.
- OpenAI's was human-checked (Timothy Gowers among the readers) and still awaits formal peer review — a real but weaker guarantee than a Lean certificate.
- Both results come with caveats: OpenAI's awaits formal peer review, and Demis Hassabis called DeepMind's system 'still not AGI'. The constraint that makes the work trustworthy is what keeps it grounded.

## FAQ

### Did AI really solve maths problems humans couldn't?
It produced results on problems that had been open for decades: OpenAI a construction disproving a 1946 Erdős conjecture, DeepMind solutions to nine more Erdős problems plus 44 OEIS conjectures. OpenAI's is human-checked with formal peer review pending; DeepMind's are machine-verified in the Lean proof assistant.

### What is the Lean proof assistant, and why does it matter?
Lean is software that checks each logical step of a proof against mathematical axioms. It matters because AI can sound convincing while being wrong — a Lean-certified proof has survived a rigorous automated check, not just human intuition.
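
As a small illustration of what checking each logical step means, here is a minimal Lean 4 sketch. It is not drawn from the DeepMind work: the theorem names are made up for this example, and `Nat.add_comm` is a lemma from Lean's core library.

```lean
-- Accepted: Lean checks that the term `Nat.add_comm a b`
-- really has the stated type `a + b = b + a`.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Rejected if uncommented: `rfl` cannot prove a false statement,
-- so Lean reports an error instead of accepting the theorem.
-- theorem bogus (a b : Nat) : a + b = a := rfl
```

However persuasive the surrounding prose, the second statement never makes it past the checker.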

### Who actually 'won', OpenAI or DeepMind?
Wrong frame. OpenAI disproved one famous conjecture; DeepMind solved nine others by a different, machine-verified method. They are not the same task, so 'nine to one' is a headline, not a result — and DeepMind's formal certificates are arguably the sturdier object.

### Is this AGI?
No. DeepMind's Demis Hassabis said the system is 'still not AGI', as reported by WinBuzzer (26 May 2026). It is a narrow instrument for verifiable problems, not general intelligence: impressive in its domain, far from human-level across the board.

### How much did it cost to run?
DeepMind reported solving each Erdős problem for a few hundred dollars of compute, according to coverage of the arXiv preprint. That is a striking figure given that two of the problems had been open for 56 years.

## Sources

- [An OpenAI model has disproved a central conjecture in discrete geometry](https://openai.com/index/model-disproves-discrete-geometry-conjecture/) — OpenAI, 2026-05-20
- [Advancing Mathematics Research with AI-Driven Formal Proof Search (AlphaProof Nexus preprint, arXiv:2605.22763)](https://arxiv.org/abs/2605.22763) — Google DeepMind / arXiv, 2026-05-21
- [OpenAI's milestone math breakthrough played to AI's strengths](https://www.understandingai.org/p/openais-milestone-math-breakthrough) — Understanding AI, 2026-05-22
- [Google DeepMind's AlphaProof Nexus Solves Erdős Problems as AI Math Race Moves Beyond Benchmarks](https://winbuzzer.com/2026/05/26/google-deepmind-says-alphaproof-nexus-is-still-not-agi-xcxwbn/) — WinBuzzer, 2026-05-26
