Gemma 4 26B

google MoE text vision Gemma-License 2026-02-20

Architecture

Total params: 26 B
Active params: 4 B
Layers: 32
Context: 128 k tokens
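
As a rough illustration of what the total and active parameter counts imply, the sketch below computes the fraction of parameters active per token and an approximate weight footprint. The bytes-per-parameter values are assumptions for common storage formats, not figures from this page, and the result ignores activations, the KV cache, and runtime overhead.

```python
# Figures taken from the spec card above.
TOTAL_PARAMS = 26e9
ACTIVE_PARAMS = 4e9

# Assumed storage widths in bytes per parameter (not from the source).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def active_fraction(total: float, active: float) -> float:
    """Share of parameters that participate in one token's forward pass."""
    return active / total


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


print(f"active fraction: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
for fmt, width in BYTES_PER_PARAM.items():
    print(f"{fmt}: ~{weight_gb(TOTAL_PARAMS, width):.0f} GB of weights")
```

All 26 B parameters still have to be resident in memory even though only about 4 B are used per token, so the footprint is driven by the total count while the per-token compute is driven by the active count.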

Detailed specifications

Hidden size: 4096
FFN size: 11264
Attention heads: 32
KV heads: 8
Head dim: 128
Vocab size: 256000
Attention type: gqa
MoE experts: 16
MoE top-k: 2
Expert hidden: 5632
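
The attention figures above are enough for a back-of-the-envelope KV cache estimate. The sketch below is a minimal calculation under stated assumptions: 16-bit cache entries, "128 k" read as 128 × 1024 tokens, and every one of the 32 layers caching the full context. None of these three points is confirmed by this page, so treat the output as an upper-bound style estimate rather than a measured value.

```python
# Figures taken from the spec card above.
LAYERS = 32
ATTENTION_HEADS = 32
KV_HEADS = 8
HEAD_DIM = 128

# Assumptions (not from the source).
CONTEXT_TOKENS = 128 * 1024   # "128 k" interpreted as 131072 tokens
BYTES_PER_ELEMENT = 2         # 16-bit cache entries, e.g. bf16 or fp16


def kv_cache_bytes_per_token(layers: int, kv_heads: int,
                             head_dim: int, bytes_per_element: int) -> int:
    """One key and one value vector per KV head, per layer, per token."""
    return 2 * layers * kv_heads * head_dim * bytes_per_element


# With GQA, several query heads share each KV head.
queries_per_kv_head = ATTENTION_HEADS // KV_HEADS

per_token = kv_cache_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, BYTES_PER_ELEMENT)
full_context = per_token * CONTEXT_TOKENS

print(f"query heads per KV head: {queries_per_kv_head}")
print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"KV cache at full context: {full_context / 1024**3:.0f} GiB")
```

Under the same assumptions, caching keys and values for all 32 attention heads instead of 8 KV heads would take four times as much memory, which is the main saving GQA provides.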

Operator breakdown (per token)

Operator    FLOPs / token    Bytes / token
matmul      8.86e+9          8.86e+9
attention   3.22e+9          4.83e+9
moe-gate    2.10e+6          2.95e+9
rmsnorm     1.31e+6          5.24e+5
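
One way to read the table is through arithmetic intensity, that is, FLOPs divided by bytes moved for each operator. The sketch below computes that ratio from the values listed above; the dictionary layout and function name are illustrative choices, and the page does not say how its FLOPs or bytes were counted, so the ratios are only as reliable as those inputs.

```python
# (FLOPs per token, bytes per token) copied from the table above.
OPS = {
    "matmul": (8.86e9, 8.86e9),
    "attention": (3.22e9, 4.83e9),
    "moe-gate": (2.10e6, 2.95e9),
    "rmsnorm": (1.31e6, 5.24e5),
}


def intensity(flops: float, bytes_moved: float) -> float:
    """Arithmetic intensity in FLOPs per byte."""
    return flops / bytes_moved


total_flops = sum(flops for flops, _ in OPS.values())
total_bytes = sum(bytes_moved for _, bytes_moved in OPS.values())

for name, (flops, bytes_moved) in OPS.items():
    print(f"{name:<10} {intensity(flops, bytes_moved):.4f} FLOPs/byte")
print(f"{'total':<10} {intensity(total_flops, total_bytes):.4f} FLOPs/byte")
```

Every operator comes out at 2.5 FLOPs per byte or less. Under a simple roofline model, ratios this low would typically put single-token decoding in the memory-bandwidth-bound regime on current accelerators, so memory bandwidth is likely to matter more than peak compute when comparing hardware for this workload.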

Compatible hardware