Showcase · what the data tells us
Insights auto-computed from the data corpus, refreshed on every build
Adding a new case automatically refreshes every number on this page.
Gemma 4 26B on 4× H100 SXM with FP8
h100-sxm5 ×4 · assumes $2.50/h per card + $0.10/kWh + PUE 1.3
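For context, a minimal sketch of the cost model those assumptions imply. The 700 W H100 SXM5 board power and the treatment of energy as billed on top of the hourly rental rate are my assumptions, not the corpus's:

    # Hedged sketch: 700 W board power and separately billed energy
    # are assumptions, not corpus data.
    cards = 4
    rate_per_card = 2.50      # $/h per card, from the card above
    kwh_price = 0.10          # $/kWh
    pue = 1.3                 # datacenter power usage effectiveness
    tdp_kw = 0.700            # H100 SXM5 board power

    rental = cards * rate_per_card               # $10.00/h
    energy = cards * tdp_kw * pue * kwh_price    # ~$0.36/h
    print(f"total ~ ${rental + energy:.2f}/h")   # total ~ $10.36/h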
AMD Instinct MI300X
Measured/theoretical ratio across 1 case: 150% of the theoretical roofline. A ratio above 100% usually means the quoted theoretical peak is conservative (or, with only a single case, an outlier measurement).
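The ratio itself is easy to reproduce from raw case records. A minimal sketch; the field names are illustrative rather than the corpus's actual schema, and the values are placeholders chosen to match the 150% figure:

    from statistics import mean

    # Placeholder case record; field names and values are illustrative.
    cases = [
        {"measured_tflops": 1961.1, "theoretical_tflops": 1307.4},
    ]

    ratios = [c["measured_tflops"] / c["theoretical_tflops"] for c in cases]
    print(f"mean measured/theoretical = {mean(ratios):.2f} "
          f"across {len(ratios)} case(s)")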
Ascend 910C
0% of theoretical, leaving large headroom for kernel and op-library tuning. A common pattern for Chinese silicon: stacks like CANN/MUSA/MindIE close this gap year over year.
NVIDIA H100 SXM5 80GB
The data flywheel is spinning — this card has the most independent reproductions logged.
Ascend 910C
Current efficiency 0.00 vs an overseas mean of 1.38. Each +0.05 gain in efficiency is roughly +10% effective hardware throughput.
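As a sanity check on that conversion: effective throughput is theoretical peak times efficiency, so a +0.05 step is worth 0.05/efficiency in relative terms. The +10% reading implies a 0.5 efficiency baseline, which is my inference, not a stated corpus convention:

    # A +0.05 efficiency step in relative terms: ~10% at a 0.5
    # baseline (my assumption), but only ~3.6% at the overseas
    # mean of 1.38.
    for eff in (0.50, 1.38):
        print(f"eff={eff:.2f}: +0.05 -> +{0.05 / eff:.1%} throughput")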
NVIDIA L40S → NVIDIA B200 SXM 180GB
366 → 2250 TFLOPS BF16 · 2023 → 2024
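The implied jump is easy to sanity-check; note the two parts sit in different market segments (an Ada inference card vs a Blackwell flagship), so the factor overstates like-for-like generational gain:

    # Jump implied by the two corpus entries above.
    l40s, b200 = 366.0, 2250.0    # dense BF16 TFLOPS
    print(f"{b200 / l40s:.1f}x in {2024 - 2023} year")  # 6.1x in 1 year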
DeepSeek R1
Logged on 3 different accelerators: the widest hardware coverage, and hence the most deployment-friendly frontier model, in the corpus.
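That count is a distinct-accelerator tally per model. A minimal sketch with hypothetical records; the pairs below are placeholders, not actual corpus rows:

    from collections import defaultdict

    # Hypothetical (model, accelerator) pairs, placeholders only.
    cases = [
        ("DeepSeek R1", "NVIDIA H100 SXM5 80GB"),
        ("DeepSeek R1", "AMD Instinct MI300X"),
        ("DeepSeek R1", "Ascend 910C"),
    ]

    seen = defaultdict(set)
    for model, accel in cases:
        seen[model].add(accel)
    widest = max(seen, key=lambda m: len(seen[m]))
    print(widest, "on", len(seen[widest]), "accelerators")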
Google TPU v5p
ICI · 4800 Gbps per chip (600 GB/s). Holds an entire frontier MoE in a single scale-up domain.
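A back-of-envelope check on that claim, assuming a DeepSeek-R1-class 671B-parameter model in BF16 (my assumption) against the public v5p figures of 95 GB HBM per chip and up to 8960 chips per pod:

    # Weights-only footprint vs one v5p scale-up domain
    # (KV cache and activations are extra).
    params = 671e9                        # assumed frontier MoE size
    weights_gb = params * 2 / 1e9         # BF16 -> ~1342 GB
    hbm_per_chip_gb, pod_chips = 95, 8960
    chips_needed = weights_gb / hbm_per_chip_gb
    print(f"{weights_gb:.0f} GB of weights ~ {chips_needed:.0f} of "
          f"{pod_chips} chips in one pod")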