
VideoMME

Video understanding and multimodal evaluation tasks

Frontier
Category: specialized
EDI: 106.9
Slope: 0.38

Leaderboard

(8 models)
Rank  Model                    Score  Stderr
1     Gemini 1.5 Flash         75.00
2     Qwen2.5-Max              73.50
3     GPT-4o                   71.90
4     gpt-4o-mini-2024-07-18   64.80
5     Claude 3.7 Sonnet        60.00
6     GPT-4.1                  59.90
7     Kimi K2 0905 (Novita)    55.80
8     Qwen Plus                51.30

Data source: Epoch AI, “Data on AI Benchmarking”. Published at epoch.ai

Licensed under CC-BY 4.0
