FP4
- Weight bits: 4
- Activation bits: 8
- Supported hardware: 8 / 39
- Cited use cases: 1
- Microscaling FP4; introduced on Blackwell B200/B300 and AMD MI355X for inference
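To make the FP4 card above concrete, here is a minimal NumPy sketch of microscaling FP4 quantization: elements are rounded to the E2M1 value grid and each block of 32 elements shares one power-of-two scale. The function name, block size handling, and rounding policy are illustrative choices, not taken from any particular library.

```python
import numpy as np

# Representable magnitudes of FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bits).
FP4_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mxfp4(x, block=32):
    """Fake-quantize a 1-D array to microscaling FP4: one power-of-two
    scale per `block` elements, each element snapped to the nearest
    E2M1 magnitude. Returns the dequantized values."""
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block
    xp = np.pad(x, (0, pad)).reshape(-1, block)
    # Per-block scale: smallest power of two that brings the block's
    # absolute maximum within the E2M1 range [0, 6].
    amax = np.abs(xp).max(axis=1, keepdims=True)
    amax = np.where(amax == 0, 1.0, amax)
    scale = 2.0 ** np.ceil(np.log2(amax / FP4_VALUES[-1]))
    # Snap each scaled magnitude to the nearest representable FP4 value.
    scaled = np.abs(xp) / scale
    idx = np.abs(scaled[..., None] - FP4_VALUES).argmin(axis=-1)
    deq = np.sign(xp) * FP4_VALUES[idx] * scale
    return deq.reshape(-1)[:len(x)]
```

Values that already sit on the scaled E2M1 grid round-trip exactly; everything else lands on the nearest grid point, which is where FP4's quality loss comes from.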
9 quantization schemes · hardware support · cited use cases
- FP4: Microscaling FP4; introduced on Blackwell B200/B300 and AMD MI355X for inference
- AWQ: Activation-aware Weight Quantization; weight-only int4
- GPTQ: post-training quantization for generative pre-trained transformers; second-order (Hessian-based) weight-only int4
- W4A16: 4-bit weights / 16-bit activations; generic name for the AWQ and GPTQ family
- FP8 E4M3: 4-bit exponent, 3-bit mantissa; preferred for weights and activations, which need the extra mantissa precision
- FP8 E5M2: 5-bit exponent, 2-bit mantissa; preferred for gradients, whose values span a wider dynamic range
- INT8: symmetric or asymmetric int8 quantization; widely supported
- BF16: bfloat16; 8-bit exponent, 7-bit mantissa; default training precision since around 2020
- FP16: IEEE 754 half precision; 5-bit exponent, 10-bit mantissa
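The int8 entry in the list above can be sketched in a few lines. This is a minimal symmetric, per-tensor variant (function names and the per-tensor granularity are my own illustrative choices; production libraries typically offer per-channel scales as well): one scale maps [-amax, amax] onto [-127, 127], with the zero-point fixed at 0.

```python
import numpy as np

def quantize_int8_symmetric(x):
    """Symmetric per-tensor int8 quantization: a single scale maps
    [-amax, amax] onto [-127, 127]; the zero-point is fixed at 0."""
    x = np.asarray(x, dtype=np.float32)
    amax = float(np.abs(x).max())
    scale = amax / 127.0 if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Map int8 codes back to float32 using the stored scale."""
    return q.astype(np.float32) * scale
```

Round-trip error is bounded by half a quantization step (scale / 2), which is why int8 works well for tensors whose values are spread evenly, and poorly when a few outliers inflate amax; that outlier problem is exactly what activation-aware methods such as AWQ try to address.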