INT8
int · lossy
Symmetric or asymmetric int8 quantization; widely supported
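As a rough illustration of the two schemes named above, the sketch below quantizes a list of floats to int8 both ways. It assumes per-tensor scaling and round-to-nearest; the function names are hypothetical and not taken from any particular runtime, and real toolchains may choose per-channel scales or different clipping ranges.

```python
def quantize_symmetric(values):
    """Symmetric int8: zero point fixed at 0, range [-127, 127]."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def quantize_asymmetric(values):
    """Asymmetric int8: scale plus zero point, range [-128, 127]."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # avoid a zero scale
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    """Recover approximate floats; the rounding error is why INT8 is lossy."""
    return [(x - zero_point) * scale for x in q]


if __name__ == "__main__":
    data = [-1.0, 0.0, 0.5, 3.0]
    q_sym, s_sym = quantize_symmetric(data)
    q_asym, s_asym, zp = quantize_asymmetric(data)
    print(q_sym, dequantize(q_sym, s_sym))
    print(q_asym, dequantize(q_asym, s_asym, zp))
```

The symmetric variant wastes part of the int8 range when the data is skewed to one side, while the asymmetric variant spends an extra zero-point parameter to cover the range more tightly.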
Weight bits: 8 bits/weight
Activation bits: 8 bits/activation
Supported hardware: 38/39 of total
Measured cases: 4
Supported hardware (38)
Chinese domestic (14)
Overseas
- AMD Instinct MI300A
- AMD Instinct MI300X
- AMD Instinct MI325X
- AMD Instinct MI355X
- Apple M4 Max Neural Engine
- AWS Inferentia 2
- AWS Trainium 2
- Etched Sohu
- Google TPU v5p
- Google TPU Trillium (v6e)
- Groq LPU (TSP v1)
- Intel Gaudi 2
- Intel Gaudi 3
- NVIDIA A100 SXM4 80GB
- NVIDIA B200 SXM 180GB
- NVIDIA B300 SXM 288GB
- NVIDIA GB200 NVL72
- NVIDIA GB300 NVL72
- NVIDIA H100 SXM5 80GB
- NVIDIA H200 SXM 141GB
- NVIDIA L40S
- NVIDIA R200 SXM (Vera Rubin)
- SambaNova SN40L
- Tenstorrent Wormhole n300
Cases using this quantization (4)
- Qwen3.6 Plus on 8× Cambricon MLU590 with LMDeploy: mlu590 ×8 · qwen3.6-plus · 380 tok/s
- GLM-5.1 on 8× Biren BR104 (export-control variant): br104 ×8 · glm-5.1 · 240 tok/s
- Gemma 4 on 4× MetaX 曦云 C500 with INT8: metax-c500 ×4 · gemma-4 · 580 tok/s
- DeepSeek R1 on 16× Iluvatar 天垓 100 (Iluvatar IxRT): iluvatar-bi ×16 · deepseek-r1 · 220 tok/s