Anthropic's Claude Opus 4.5 has achieved a new state-of-the-art score of 93.2% on MMLU, surpassing GPT-5.1 and demonstrating significant advances in broad knowledge understanding.
Claude Opus 4.5 achieved 93.2% on MMLU, up from 90.8%
Dec 20, 2025
Performance gains concentrated in STEM and reasoning categories
Dec 20, 2025
HumanEval coding benchmark also improved to 92.1%
Dec 19, 2025