AMD MI355X Achieves Competitive TCO vs NVIDIA B200

Accelerators | benchmark update

AMD MI355X delivers lower TCO per million tokens than NVIDIA B200 for GPT-OSS 120B FP4 summarization at interactivity below 225 tok/s/user. MI300X also beats H100 on GPT-OSS 120B MX4 across all interactivity levels. B200 leads on LLaMA 70B FP4 and high-interactivity workloads.
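The summary compares accelerators on cost per million tokens. The source does not state its formula. The sketch below assumes one common definition: hourly cost of a GPU divided by the tokens that GPU generates in an hour. The function name, parameter names, and input values are hypothetical and are not measurements from the benchmark.

```python
def cost_per_million_tokens(gpu_hour_cost_usd: float,
                            tokens_per_sec_per_gpu: float) -> float:
    """Return USD per one million generated tokens for a single GPU.

    Assumed definition (not confirmed by the source):
    hourly GPU cost / tokens generated per hour, scaled to 1M tokens.
    """
    if tokens_per_sec_per_gpu <= 0:
        raise ValueError("throughput must be positive")
    tokens_per_hour = tokens_per_sec_per_gpu * 3600
    return gpu_hour_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs for illustration only, not benchmark results.
example_cost = cost_per_million_tokens(gpu_hour_cost_usd=2.0,
                                       tokens_per_sec_per_gpu=1000.0)
print(f"{example_cost:.4f} USD per million tokens")
```

Under this definition a cheaper accelerator can win on cost even with lower throughput. Per-GPU throughput usually falls as per-user interactivity rises, which is one reason a comparison can flip at a threshold such as 225 tok/s/user.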

Overall Confidence: 89% (based on 5 weighted dimensions)

Confidence Breakdown

Authority: weight 30%, score 95 (Tier 1)
Data Quality: weight 25%, score 80 (Level 4)
Recency: weight 20%, score 40 (91-180 days)
Corroboration: weight 15%, score 100 (3 sources)
Specificity: weight 10%, score 100 (Level 4)
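The page does not say how the five dimensions are combined into the overall figure. As a minimal sketch, assuming a plain weighted average of the listed scores (an assumption, not the page's documented method), the arithmetic looks like this. The dictionary layout and function name are illustrative.

```python
# (weight in percent, score out of 100) for each listed dimension.
dimensions = {
    "Authority": (30, 95),
    "Data Quality": (25, 80),
    "Recency": (20, 40),
    "Corroboration": (15, 100),
    "Specificity": (10, 100),
}


def weighted_confidence(dims: dict[str, tuple[int, int]]) -> float:
    """Plain weighted average; assumed aggregation, not confirmed by the source."""
    total_weight = sum(weight for weight, _ in dims.values())
    weighted_sum = sum(weight * score for weight, score in dims.values())
    return weighted_sum / total_weight


overall = weighted_confidence(dimensions)
print(overall)  # 81.5
```

A plain weighted average gives 81.5, not the displayed 89%, so the page presumably uses a different aggregation that it does not describe.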

Data Points (3)

Source: InferenceMAX

MI355X beats B200 vLLM and B200 TRT-LLM on TCO for GPT-OSS 120B when interactivity < 225 tok/s/user (Oct 1, 2025)
MI300X outperforms H100 on GPT-OSS 120B MX4 across the entire interactivity range on a TCO basis (Oct 1, 2025)
B200 significantly outperforms MI355X on LLaMA 70B FP4; AMD FP4 kernels need improvement (Oct 1, 2025)

