Kimi K2.6

moonshot MOE text vision moonshot-license 2026-04-15

Architecture

Total params: 1000 B
Active params: 32 B
Layers: 60
Context: 256 k

Detailed specifications

Hidden size: 7168
FFN size: 18432
Attention heads: 64
KV heads: 8
Head dim: 128
Vocab size: 160000
Attention type: mla
MoE experts: 384
MoE top-k: 8
Expert hidden: 1536
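
As a rough sanity check on the MoE figures above, the sketch below estimates how many parameters sit in the routed experts. It rests on two assumptions that the spec does not state: each expert is a gated FFN with three weight matrices (gate, up, down), and every one of the 60 layers is an MoE layer with no shared expert. The constant names are illustrative only.

```python
# Values taken from the specification list above.
HIDDEN = 7168          # hidden size
EXPERT_HIDDEN = 1536   # expert hidden size
NUM_EXPERTS = 384      # MoE experts
TOP_K = 8              # experts routed per token
LAYERS = 60

# ASSUMPTION (not in the spec): gated FFN with gate, up and down projections.
MATRICES_PER_EXPERT = 3

# Parameters in one expert: three HIDDEN x EXPERT_HIDDEN matrices.
params_per_expert = MATRICES_PER_EXPERT * HIDDEN * EXPERT_HIDDEN

# ASSUMPTION (not in the spec): all layers are MoE layers, no shared expert.
routed_total = params_per_expert * NUM_EXPERTS * LAYERS
routed_active = params_per_expert * TOP_K * LAYERS

print(f"params per expert:        {params_per_expert:,}")
print(f"routed params (all):      {routed_total / 1e9:.1f} B")
print(f"routed params (per token): {routed_active / 1e9:.1f} B")
print(f"experts used per token:   {TOP_K / NUM_EXPERTS:.2%}")
```

Under these assumptions the routed experts account for roughly 761 B parameters in total and about 15.9 B per token. Attention, embeddings, and any dense or shared FFN weights are not counted here, so the numbers are only a partial view of the 1000 B total and 32 B active figures.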

Operator breakdown (per token)

Operator     FLOPs / token   Bytes / token
matmul       4.76e+10        4.76e+10
attention    1.59e+10        2.55e+10
moe-gate     1.65e+8         1.06e+10
rmsnorm      4.30e+6         1.72e+6
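
One way to read the table is through arithmetic intensity, that is, FLOPs divided by bytes moved for each operator. The sketch below computes that ratio from the table values only; the `ops` dictionary and the helper name are illustrative, and no hardware figures are assumed.

```python
# (FLOPs per token, bytes per token), copied from the operator table above.
ops = {
    "matmul":    (4.76e10, 4.76e10),
    "attention": (1.59e10, 2.55e10),
    "moe-gate":  (1.65e8,  1.06e10),
    "rmsnorm":   (4.30e6,  1.72e6),
}


def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Return FLOPs performed per byte moved."""
    return flops / bytes_moved


intensity = {name: arithmetic_intensity(f, b) for name, (f, b) in ops.items()}

# Aggregate over all listed operators.
total_flops = sum(f for f, _ in ops.values())
total_bytes = sum(b for _, b in ops.values())
overall = arithmetic_intensity(total_flops, total_bytes)

for name, ai in intensity.items():
    print(f"{name:10s} {ai:7.3f} FLOPs/byte")
print(f"{'overall':10s} {overall:7.3f} FLOPs/byte")
```

Every operator in the table comes out at 2.5 FLOPs per byte or less, and the aggregate is below 1. Comparing these ratios with a device's ratio of peak compute to memory bandwidth indicates whether per-token work is likely limited by memory traffic or by compute.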

Compatible hardware