Alibaba's open-weights model with a 128K-token context window and 72B parameters. Strong math (81/100) and instruction following (81/100) make it the top performer in the open-weights category.
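For context, loading an open-weights model of this size typically follows the standard Hugging Face `transformers` pattern. The sketch below is illustrative only: the checkpoint ID `Qwen/Qwen2.5-72B-Instruct` is an assumption (the card above does not name the exact release), and a 72B model requires multiple GPUs or aggressive offloading.

```python
# Minimal generation sketch with Hugging Face transformers.
# NOTE: the checkpoint ID below is an assumption -- the card does not
# name the exact model. Requires `accelerate` for device_map="auto".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-72B-Instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Solve: 12 * (7 + 5) = ?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```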
| Metric | Definition |
|---|---|
| ECI | Epoch Capabilities Index score, normalized to 0-100 |
| Context window | Maximum input token capacity |
| Training compute | Total compute used during training |
| Code | Code generation, understanding, and debugging |
| Math | Mathematical reasoning and problem solving |
| Reasoning | Multi-step logical reasoning |
| Intelligence | Overall reasoning and task completion ability |
| Dimension | Score (0-100) |
|---|---|
| Intelligence | 80.0 |
| Reasoning | 78.0 |
| Math | 81.0 |
| Code | 79.0 |
| Instruction Following | 81.0 |
| Overall | 80.0 |
Alibaba's gte-Qwen2-7B-instruct has claimed the top position on MTEB with an overall score of roughly 70%, excelling in retrieval (scored by nDCG@10) and semantic textual similarity tasks. The 7B-parameter model, which produces 3584-dimensional embeddings, outperforms NVIDIA's NV-Embed-v2 and Voyage AI's voyage-3-large.
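Since the retrieval results above are scored with nDCG@10, a small worked example makes the metric concrete. This is a plain-Python sketch of the textbook definition (graded relevance with a log2 rank discount, normalized by the ideal ordering), not MTEB's own evaluation code.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k ranked relevance labels."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """nDCG@k: DCG of the system's ranking divided by the ideal DCG."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# A relevant document buried at rank 4 scores far lower than one at rank 1.
print(ndcg_at_k([0, 0, 0, 1]))  # ~0.431
print(ndcg_at_k([1, 0, 0, 0]))  # 1.0
```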
Text embedding models have emerged as a distinct category in the AI stack, with MTEB standardizing evaluation across 8 task types. The top performers (gte-Qwen2, NV-Embed-v2, voyage-3-large) achieve overall scores around 70%, with embedding dimensions ranging from 768 to 4096, and power specialized retrieval and semantic search applications.
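To illustrate the retrieval use case, here is a minimal semantic-search sketch built on `sentence-transformers`. It assumes the model is published on Hugging Face as `Alibaba-NLP/gte-Qwen2-7B-instruct` and loads with `trust_remote_code`; the example texts are placeholders, and any instruction-style query prompts recommended by the model card are omitted for brevity.

```python
# Minimal semantic-search sketch with sentence-transformers.
# Assumes the checkpoint "Alibaba-NLP/gte-Qwen2-7B-instruct" is available
# on Hugging Face and fits in local GPU/CPU memory.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer(
    "Alibaba-NLP/gte-Qwen2-7B-instruct", trust_remote_code=True
)

docs = [
    "MTEB standardizes embedding evaluation across task types.",
    "nDCG@10 rewards ranking relevant documents near the top.",
]
query = "How are embedding models evaluated?"

# Embeddings are 3584-dimensional per the model's stated output size.
doc_emb = model.encode(docs, normalize_embeddings=True)
query_emb = model.encode(query, normalize_embeddings=True)

# Cosine similarity reduces to a dot product on normalized vectors.
scores = util.cos_sim(query_emb, doc_emb)  # shape (1, len(docs))
best = scores.argmax().item()
print(f"best match: {docs[best]} (score={scores[0, best].item():.3f})")
```

At larger corpus sizes the same loop is typically backed by an approximate nearest-neighbor index rather than a dense matrix of similarities.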