NVIDIA B200 delivers ~3x the power efficiency of H100 on GPT-OSS 120B FP4 (2.8M vs 900K tok/s/MW). AMD shows a similar generational gain from CDNA3 to CDNA4 (MI355X is ~3x more efficient than MI300X). Blackwell is 20% more energy efficient than MI355X, attributed to its lower TDP (1 kW vs 1.4 kW).
Oct 1, 2025: B200 processes 2.8M tok/s/MW vs H100 at 900K tok/s/MW, a ~3.1x improvement
Oct 1, 2025: MI355X processes 2.55M tok/s/MW vs MI300X at 750K tok/s/MW, a ~3.4x CDNA generational improvement
Oct 1, 2025: Blackwell is 20% more energy efficient than CDNA4 (MI355X TDP 1.4 kW vs B200 1 kW)
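
The ratios behind these figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary and function names are made up for this example, and the inputs are the tok/s/MW values quoted above. It also converts each figure to energy per token, using 1 MW = 10^6 J/s. Which power basis the benchmark used for the MW denominator is not stated here, so the J/token values inherit that uncertainty.

```python
# Throughput per megawatt as quoted above (tokens/s per MW, GPT-OSS 120B FP4).
# Names are hypothetical; only the numeric values come from the text.
TOK_PER_S_PER_MW = {
    "B200": 2.8e6,
    "H100": 900e3,
    "MI355X": 2.55e6,
    "MI300X": 750e3,
}


def efficiency_ratio(new: str, old: str) -> float:
    """How many times more tokens per unit of power `new` delivers than `old`."""
    return TOK_PER_S_PER_MW[new] / TOK_PER_S_PER_MW[old]


def joules_per_token(gpu: str) -> float:
    """Energy per token: 1 MW is 1e6 J/s, so J/token = 1e6 / (tok/s/MW)."""
    return 1e6 / TOK_PER_S_PER_MW[gpu]


if __name__ == "__main__":
    pairs = [("B200", "H100"), ("MI355X", "MI300X"), ("B200", "MI355X")]
    for new, old in pairs:
        print(f"{new} vs {old}: {efficiency_ratio(new, old):.2f}x")
    for gpu in TOK_PER_S_PER_MW:
        print(f"{gpu}: {joules_per_token(gpu):.2f} J/token")
```

With these inputs the generational ratios come out at roughly 3.1x (B200 over H100) and 3.4x (MI355X over MI300X). The B200-over-MI355X ratio from the two quoted figures is about 1.10x, so the 20% figure does not follow from these two numbers alone and may reflect a different operating point.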