This repo is a collection and evaluation of solutions to the 2025 International Mathematical Olympiad (IMO) problems, with a focus on large language model (LLM) performance. It includes the official IMO 2025 problems, detailed solutions and reasoning generated by various LLMs, and a structured evaluation of their correctness and completeness.
Only two models, Bytedance Seed 1.6 and Google Gemini 2.5 Pro, answered Problem 5 both correctly and completely.
Token Count per Problem
Cost per Problem
You can use this repository to:
- Review the IMO 2025 problems and their solutions by different LLMs.
- Extend the dataset with new models or additional analysis.
- OpenAI models (o3-medium, o4-mini-high): Used the Responses API with default parameters.
- Deepseek R1: Used the officially recommended parameters: `temperature=0.6`, `top_p=0.95`.
- All other models: Used `temperature=0`, `top_p=1`.
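
The sampling settings listed above could be captured in a small lookup like the sketch below. This is only an illustration, not the code used in this repo: the function name `sampling_params` and the model identifier strings (such as `"deepseek-r1"`) are assumptions, and only the `temperature` and `top_p` values come from the list above.

```python
# Hypothetical helper: maps a model name to the sampling parameters described above.
# The identifier strings are assumed for illustration; adjust them to your provider's names.

OPENAI_MODELS = {"o3-medium", "o4-mini-high"}


def sampling_params(model: str) -> dict:
    """Return the sampling overrides to send with a request for `model`."""
    if model in OPENAI_MODELS:
        # OpenAI models were run with default parameters, so nothing is overridden.
        return {}
    if model == "deepseek-r1":
        # Officially recommended Deepseek R1 settings, as stated above.
        return {"temperature": 0.6, "top_p": 0.95}
    # All other models were run with temperature=0 and top_p=1.
    return {"temperature": 0.0, "top_p": 1.0}


if __name__ == "__main__":
    for name in ("o3-medium", "deepseek-r1", "some-other-model"):
        print(name, sampling_params(name))
```

Returning an empty dict for the OpenAI models keeps "use the defaults" explicit: the caller can merge the result into its request arguments without special-casing any provider.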