Llama 3.3 70B Instruct

Meta · dense · text · Llama 3.3 Community License · released 2024-12-06

Architecture

Total params    70 B
Active params   70 B
Layers          80
Context         128 K tokens

Detailed specifications

Hidden size        8192
FFN size           28672
Attention heads    64
KV heads           8
Head dim           128
Vocab size         128256
Attention type     GQA (grouped-query attention)
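
A consequence of the GQA layout above is a much smaller KV cache: only the 8 KV heads are cached, not all 64 attention heads. A minimal sketch of the per-token KV-cache estimate, using the spec values above (the bf16 byte width is an assumption, not stated in the table):

```python
# Estimate per-token KV-cache size from the spec table, and compare
# GQA (8 KV heads) against hypothetical full MHA (64 heads).
LAYERS = 80
ATTN_HEADS = 64
KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # assumption: KV cache stored in bf16

def kv_cache_bytes_per_token(n_kv_heads: int) -> int:
    # K and V each store n_kv_heads * head_dim values per layer.
    return 2 * LAYERS * n_kv_heads * HEAD_DIM * BYTES_PER_VALUE

gqa = kv_cache_bytes_per_token(KV_HEADS)    # 327,680 B ≈ 320 KiB/token
mha = kv_cache_bytes_per_token(ATTN_HEADS)  # 2,621,440 B ≈ 2.5 MiB/token
print(gqa, mha, mha // gqa)
```

At the full 128 K context this is roughly 40 GiB of cache per sequence under MHA versus about 5 GiB under GQA, an 8x reduction.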

Operator breakdown (per token)

Operator    FLOPs / token   Bytes / token
matmul      1.32e+11        1.32e+11
attention   7.50e+9         7.50e+9
rmsnorm     8.40e+6         1.31e+6
rope        5.24e+5         6.55e+4
silu        1.57e+8         7.86e+7
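
The dominant matmul row can be cross-checked from the architecture table. A rough sketch, assuming batch-1 decode, bf16 weights, 2 FLOPs per multiply-accumulate, and the token-embedding lookup excluded; the exact accounting behind the table is not stated, so the estimate lands near, not exactly on, the listed 1.32e+11:

```python
# Estimate matmul FLOPs/token and weight bytes/token from the specs.
H, FFN, LAYERS = 8192, 28672, 80
KV_HEADS, HEAD_DIM, VOCAB = 8, 128, 128256

per_layer_params = (
    2 * H * H                      # Q and O projections
    + 2 * H * KV_HEADS * HEAD_DIM  # K and V projections (GQA)
    + 3 * H * FFN                  # gate, up, down (SwiGLU FFN)
)
matmul_params = LAYERS * per_layer_params + H * VOCAB  # + LM head

flops_per_token = 2 * matmul_params  # 2 FLOPs per weight read
bytes_per_token = 2 * matmul_params  # bf16: 2 bytes per weight
print(f"{flops_per_token:.2e} FLOPs, {bytes_per_token:.2e} B")
```

Both estimates come out around 1.39e+11, within a few percent of the table; the near-equal FLOPs and bytes columns reflect that batch-1 decode reads every matmul weight once per token, making the workload memory-bound.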

Compatible hardware