Qwen3.6 Plus

Alibaba | MoE | text | vision | Apache-2.0 | 2026-03-25

Architecture

Total params: 480 B
Active params: 35 B
Layers: 64
Context: 1024 k tokens
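
As a rough sanity check on these figures, the sketch below computes the share of parameters active per token and the weight footprint at a few storage widths. The bytes-per-parameter values are assumptions for illustration (the card does not state a weight format), and "B" is taken to mean 10^9.

```python
# Figures from the spec card; "B" is assumed to mean 1e9 parameters.
TOTAL_PARAMS = 480e9
ACTIVE_PARAMS = 35e9

# Assumed storage widths in bytes per parameter (not stated on the card).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def active_fraction(total: float, active: float) -> float:
    """Share of the total parameters that are used for a single token."""
    return active / total


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active fraction: {frac:.2%}")
    for name, width in BYTES_PER_PARAM.items():
        print(f"{name}: {weight_gb(TOTAL_PARAMS, width):.0f} GB")
```

The footprint covers weights only; KV cache and activations come on top of it.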

Detailed specifications

Hidden size: 6144
FFN size: 16384
Attention heads: 64
KV heads: 8
Head dim: 128
Vocab size: 152064
Attention type: GQA
MoE experts: 128
MoE top-k: 8
Expert hidden: 2048
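
The attention figures above are enough for a rough KV-cache estimate. The sketch below assumes that every one of the 64 layers keeps a standard key/value cache, that cache entries are 2 bytes wide (for example fp16 or bf16), and that "1024 k" means 1024 × 1024 tokens. None of these assumptions is stated on the card.

```python
# Values copied from the spec card.
LAYERS = 64
ATTN_HEADS = 64
KV_HEADS = 8
HEAD_DIM = 128

# Assumption: "1024 k" context is read as 1024 * 1024 tokens.
CONTEXT_TOKENS = 1024 * 1024


def gqa_group_size(attn_heads: int, kv_heads: int) -> int:
    """Number of query heads that share one key/value head under GQA."""
    return attn_heads // kv_heads


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache added per token.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 is an assumed cache precision, not a documented one.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


if __name__ == "__main__":
    per_token = kv_cache_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM)
    full = per_token * CONTEXT_TOKENS
    print(f"query heads per KV head: {gqa_group_size(ATTN_HEADS, KV_HEADS)}")
    print(f"KV cache per token: {per_token / 1024:.0f} KiB")
    print(f"KV cache at full context: {full / 2**30:.0f} GiB")
```

With 64 query heads instead of 8 KV heads, the same formula would give a cache 8 times larger, which is the saving GQA provides here.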

Operator breakdown (per token)

Operator    FLOPs / token   Bytes / token
matmul      3.87e+10        3.87e+10
attention   1.29e+10        2.01e+10
moe-gate    5.03e+7         1.29e+10
rmsnorm     3.93e+6         1.57e+6
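
One way to read this table is through arithmetic intensity, that is, FLOPs divided by bytes moved. The sketch below computes it per operator and for the total, using only the numbers listed above. The function names are illustrative, and whether an operator is compute-bound or bandwidth-bound depends on the target hardware's FLOPs-per-byte ratio, which is not given here.

```python
# (FLOPs per token, bytes per token), copied from the operator table.
OPS = {
    "matmul": (3.87e10, 3.87e10),
    "attention": (1.29e10, 2.01e10),
    "moe-gate": (5.03e7, 1.29e10),
    "rmsnorm": (3.93e6, 1.57e6),
}


def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte moved; lower values lean on memory bandwidth."""
    return flops / bytes_moved


def totals(ops: dict) -> tuple:
    """Sum FLOPs and bytes over all operators."""
    total_flops = sum(f for f, _ in ops.values())
    total_bytes = sum(b for _, b in ops.values())
    return total_flops, total_bytes


if __name__ == "__main__":
    for name, (flops, nbytes) in OPS.items():
        print(f"{name:10s} {arithmetic_intensity(flops, nbytes):.4f} FLOPs/byte")
    total_flops, total_bytes = totals(OPS)
    print(f"{'total':10s} {arithmetic_intensity(total_flops, total_bytes):.4f} FLOPs/byte")
```

On these numbers, moe-gate has by far the lowest intensity, so its cost is dominated by bytes moved rather than by arithmetic.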

Compatible hardware