OpenAI Model Wins Gold at International Mathematical Olympiad – or Did It?

Image: X/@alexwei_

A Google DeepMind researcher and the head of an AI safety institute are questioning the validity of OpenAI’s gold-medal claim.

Written By

Fiona Jackson

Jul 21, 2025

OpenAI’s latest model has achieved a gold-level score at the 2025 International Mathematical Olympiad. It solved five of the six questions under exam conditions. Each question is worth seven points, so the model scored 35 out of a possible 42.

The International Mathematical Olympiad is widely regarded as the most prestigious and challenging mathematics competition for high school students in the world. Only about 10% of this year’s competitors received gold medals, and numerous Fields Medalists earned medals at it as students. Each competitor has two 4.5-hour sessions to complete the six questions without access to the internet or any tools.

AI models’ mixed success at solving math problems

Artificial intelligence models have historically struggled with complex mathematical problems, which demand sustained, multi-step logical reasoning. And yet, recently, Gemini 2.5 Pro and OpenAI’s o3 scored 86.7% and 88.9%, respectively, on the American Invitational Mathematics Examination (AIME), a key math benchmark for AI models, and Grok 4 reportedly achieved a perfect 100% on it. By comparison, in September 2024, o1 scored 83% on AIME, which serves as a qualifying exam on the path to the International Mathematical Olympiad.

“IMO problems demand a new level of sustained creative thinking compared to past benchmarks,” OpenAI researcher Alexander Wei posted on X after announcing the unreleased model’s milestone. His colleague, Noam Brown, said that just last year, AI labs were using grade school math as a benchmark, referring to the GSM8K test.

OpenAI CEO Sam Altman said the experimental model was “an LLM doing math and not a specific formal math system” like AlphaGeometry, and framed the result as part of the company’s push toward general intelligence.

Manon Bischoff, an editor at the German-language version of Scientific American, predicted in January 2024 that it would be “a few years” before AI models could conceivably compete in the International Mathematical Olympiad. However, AI models are improving quickly. At the time, Bischoff was reporting on the release of the math-specific model AlphaGeometry, which could solve 54% of all the geometry questions included in the competition between 2000 and 2024. As of February 2025, a second-generation version could solve 84% of them.

Questions arise about OpenAI’s gold medal at IMO

Not everyone is convinced that OpenAI’s mathematical capabilities have advanced as far as the company claims.

According to Google DeepMind researcher Thang Luong and Mikhail Samin, executive director of the AI Governance and Safety Institute, OpenAI’s model was not graded according to the International Mathematical Olympiad’s official guidelines, so its gold-medal claim cannot be verified. Wei said on X that “three former IMO medalists independently graded the model’s submitted proof” and reached “unanimous consensus” on their scores.

OpenAI doesn’t have the strongest reputation when it comes to benchmarking the mathematical ability of its models. In April, Epoch AI, the independent research institute behind the FrontierMath benchmark, found that the o3 model could correctly answer only about 10% of the advanced problems, a steep decline from the over 25% accuracy originally claimed by OpenAI in December 2024.

It will be difficult for anyone to conduct the same level of independent verification on the experimental model that took part in the Olympiad until it is released. Unfortunately, Wei confirmed that OpenAI does not “plan to release anything with this level of math capability for several months,” and as GPT-5 is coming “soon,” it’s unlikely that this experimental system will be part of that release.

Mathematical ability is clearly an important quality for OpenAI. Last month, OpenAI released the o3-pro model, which the company dubbed its most intelligent yet.

Fiona Jackson

Fiona Jackson is a news writer who started her journalism career at SWNS press agency, later working at MailOnline, an advertising agency, and TechnologyAdvice. Her work spans human interest and consumer tech reporting, appearing in prominent media outlets such as TechHQ, The Independent, Daily Mail, and The Sun.
