Qwen2.5-Coder 32B Instruct

Alibaba | dense | text | Apache-2.0 | 2024-11-12

Architecture

Total params: 32 B
Active params: 32 B
Layers: 64
Context: 128 k

Detailed specifications

Hidden size: 5120
FFN size: 27648
Attention heads: 40
KV heads: 8
Head dim: 128
Vocab size: 152064
Attention type: GQA (grouped-query attention)
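
The figures above are enough for a back-of-the-envelope check of KV-cache size and weight count. The following is a minimal sketch, not the page's own method: the function names are hypothetical, the 2-byte (fp16/bf16) cache value is an assumption, and the weight estimate assumes a gated FFN with three projections and untied input/output embeddings, neither of which is stated in the spec list.

```python
# Values copied from the specification list above.
HIDDEN_SIZE = 5120
FFN_SIZE = 27648
NUM_LAYERS = 64
NUM_HEADS = 40
NUM_KV_HEADS = 8
HEAD_DIM = 128
VOCAB_SIZE = 152064


def kv_cache_bytes_per_token(bytes_per_value: int = 2) -> int:
    """Bytes of K and V stored per token across all layers.

    bytes_per_value=2 assumes an fp16/bf16 cache (an assumption).
    """
    return 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * bytes_per_value


def approx_weight_params() -> int:
    """Rough count of matmul weights; ignores biases and norm weights.

    Assumes a gated FFN (gate, up, down projections) and untied
    input/output embeddings. Both are assumptions, not listed specs.
    """
    q_dim = NUM_HEADS * HEAD_DIM        # width of the query projection
    kv_dim = NUM_KV_HEADS * HEAD_DIM    # width of each of K and V under GQA
    attn = HIDDEN_SIZE * q_dim + 2 * HIDDEN_SIZE * kv_dim + q_dim * HIDDEN_SIZE
    ffn = 3 * HIDDEN_SIZE * FFN_SIZE
    embeddings = 2 * VOCAB_SIZE * HIDDEN_SIZE
    return NUM_LAYERS * (attn + ffn) + embeddings


if __name__ == "__main__":
    print("query heads per KV head:", NUM_HEADS // NUM_KV_HEADS)
    print("KV cache bytes per token:", kv_cache_bytes_per_token())
    print("approx. weight params:", approx_weight_params())
```

With 2-byte values this comes to 262,144 bytes (256 KiB) of KV cache per token, and the rough weight count lands near 32.8 billion, in line with the 32 B listed above. Each KV head is shared by 5 query heads.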

Operator breakdown (per token)

Operator     FLOPs / token    Bytes / token
matmul       6.05e+10         6.05e+10
attention    3.50e+9          3.50e+9
rmsnorm      6.55e+6          8.19e+5
rope         3.28e+5          4.10e+4
silu         1.22e+8          6.08e+7
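
One way to read the table is as arithmetic intensity (FLOPs per byte moved) and share of total FLOPs per operator. The sketch below only divides the numbers listed above. The names `OPS` and `summarize` are hypothetical, and it makes no claim about how the page derived the figures.

```python
# (FLOPs per token, bytes per token), copied from the table above.
OPS = {
    "matmul": (6.05e10, 6.05e10),
    "attention": (3.50e9, 3.50e9),
    "rmsnorm": (6.55e6, 8.19e5),
    "rope": (3.28e5, 4.10e4),
    "silu": (1.22e8, 6.08e7),
}


def summarize(ops: dict[str, tuple[float, float]]) -> dict[str, dict[str, float]]:
    """Return arithmetic intensity and FLOP share for each operator."""
    total_flops = sum(flops for flops, _ in ops.values())
    return {
        name: {
            "intensity": flops / bytes_moved,    # FLOPs per byte
            "flops_share": flops / total_flops,  # fraction of all FLOPs
        }
        for name, (flops, bytes_moved) in ops.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(OPS).items():
        print(f"{name:10s} intensity={stats['intensity']:.2f} "
              f"share={stats['flops_share']:.4f}")
```

On these numbers, matmul and attention sit at about 1 FLOP per byte, rmsnorm and rope at about 8, and silu at about 2, while matmul accounts for roughly 94% of the listed FLOPs.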

Compatible hardware