GLM-5.1

zhipu MOE text MIT 2026-04-07

Architecture

Total params: 754 B
Active params: 32 B
Layers: 64
Context: 128 k

Detailed specifications

Hidden size: 6144
FFN size: 16384
Attention heads: 48
KV heads: 8
Head dim: 128
Vocab size: 151552
Attention type: mha
MoE experts: 192
MoE top-k: 8
Expert hidden: 1536
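
A few derived quantities follow directly from the figures listed above. The sketch below is illustrative only: the 2-byte cache element, the reading of "128 k" as 131,072 tokens, and the assumption that the cache holds one key and one value vector per listed KV head per layer are assumptions, not statements from this page. The function names are hypothetical.

```python
# Values copied from the specification list above.
SPEC = {
    "total_params": 754e9,
    "active_params": 32e9,
    "layers": 64,
    "hidden_size": 6144,
    "attention_heads": 48,
    "kv_heads": 8,
    "head_dim": 128,
    "context_tokens": 128 * 1024,  # assumption: "128 k" means 131,072 tokens
}

BYTES_PER_ELEMENT = 2  # assumption: 16-bit (fp16/bf16) cache entries


def active_fraction(spec):
    """Share of the total parameters that are active for each token."""
    return spec["active_params"] / spec["total_params"]


def query_width(spec):
    """Combined width of all query heads (attention heads x head dim)."""
    return spec["attention_heads"] * spec["head_dim"]


def kv_cache_bytes_per_token(spec, bytes_per_element=BYTES_PER_ELEMENT):
    """Bytes cached per token, assuming one K and one V vector
    per listed KV head in every layer."""
    return (2 * spec["layers"] * spec["kv_heads"]
            * spec["head_dim"] * bytes_per_element)


if __name__ == "__main__":
    per_token = kv_cache_bytes_per_token(SPEC)
    print(f"active fraction: {active_fraction(SPEC):.2%}")
    print(f"query width:     {query_width(SPEC)}")
    print(f"KV cache/token:  {per_token / 1024:.0f} KiB")
    print(f"KV cache @ full context: "
          f"{per_token * SPEC['context_tokens'] / 2**30:.0f} GiB")
```

Under these assumptions the query width equals the hidden size (48 × 128 = 6144), roughly 4% of the parameters are active per token, and the cache costs 256 KiB per token.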

Operator breakdown (per token)

Operator    FLOPs / token    Bytes / token
matmul      3.87e+10         3.87e+10
attention   1.29e+10         2.04e+10
moe-gate    7.55e+7          9.66e+9
rmsnorm     3.93e+6          1.57e+6
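
One way to read the operator table is through arithmetic intensity, that is, FLOPs divided by bytes moved. The sketch below computes it for each row. The numbers are copied from the table, while the function names and the choice of this metric are illustrative assumptions rather than anything the page specifies.

```python
# (FLOPs per token, bytes per token), copied from the operator table above.
OPERATORS = {
    "matmul": (3.87e10, 3.87e10),
    "attention": (1.29e10, 2.04e10),
    "moe-gate": (7.55e7, 9.66e9),
    "rmsnorm": (3.93e6, 1.57e6),
}


def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte moved."""
    return flops / bytes_moved


def summarize(operators):
    """Return per-operator intensity plus total FLOPs and total bytes."""
    rows = {
        name: arithmetic_intensity(flops, bytes_moved)
        for name, (flops, bytes_moved) in operators.items()
    }
    total_flops = sum(flops for flops, _ in operators.values())
    total_bytes = sum(bytes_moved for _, bytes_moved in operators.values())
    return rows, total_flops, total_bytes


if __name__ == "__main__":
    rows, total_flops, total_bytes = summarize(OPERATORS)
    for name, intensity in rows.items():
        print(f"{name:<10} {intensity:8.4f} FLOPs/byte")
    print(f"{'overall':<10} {total_flops / total_bytes:8.4f} FLOPs/byte")
```

With the listed figures every operator stays below 3 FLOPs per byte, and moe-gate is the lowest at roughly 0.008.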

Compatible hardware