Llama 3.3 70B

Meta's open-source foundation model with broad deployment

Most deployed open-source model

Metrics

elo: 1,220

provider: Meta

hf average: 44.85

parameters: 70.554B

price input: $0.40

architecture: LlamaForCausalLM

price output: $0.40

context window: 128,000

Score Breakdown

code: 80

math: 48

reasoning: 57

intelligence: 48

instruction following: 90

Scoring Methodology

intelligence: 30% weight

Overall reasoning and task completion ability

Source: LMArena ELO, Artificial Analysis Intelligence Index, HuggingFace MMLU-PRO

math: 20% weight

Mathematical reasoning and problem solving

Source: MATH benchmark, GSM8K, HuggingFace MATH-Lvl5

code: 20% weight

Code generation, understanding, and debugging

Source: HumanEval, MBPP, SWE-bench

reasoning: 15% weight

Multi-step logical reasoning

Source: ARC-Challenge, BBH, MMLU-Pro, HuggingFace BBH

instruction following: 15% weight

Ability to follow complex instructions accurately

Source: HuggingFace IFEval
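
The page lists category weights and category scores but does not say how they are combined. As a rough sketch, assuming the overall score is a plain weighted average of the five category scores (an assumption, not something the page confirms), the arithmetic could look like this. The names SCORES, WEIGHTS, and weighted_overall are hypothetical and chosen only for illustration.

```python
# Category scores copied from the "Score Breakdown" section of this page.
SCORES = {
    "intelligence": 48,
    "math": 48,
    "code": 80,
    "reasoning": 57,
    "instruction_following": 90,
}

# Category weights in percent, copied from "Scoring Methodology".
WEIGHTS = {
    "intelligence": 30,
    "math": 20,
    "code": 20,
    "reasoning": 15,
    "instruction_following": 15,
}


def weighted_overall(scores, weights):
    """Weighted average of category scores.

    ASSUMPTION: the page does not state its combination rule; a weighted
    average is only the most direct reading of the listed weights.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same categories")
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in scores) / total_weight


overall = weighted_overall(SCORES, WEIGHTS)
print(f"{overall:.2f}")  # prints 62.05
```

Under this assumption the strong code and instruction-following scores offset the weaker math and intelligence scores, and because intelligence carries the largest weight, a change there moves the result the most.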

Related Signals

DeepSeek R1 Leads Open-Source Math Benchmarks

Models, 1d ago

DeepSeek R1 has emerged as the leading open-source model for mathematical reasoning, outperforming many closed-source alternatives on MATH and GSM8K benchmarks.

Data Sources

lmarena.ai

openrouter.ai

Last updated: December 24, 2025