NVIDIA
Accelerator Cards (9)
NVIDIA R200 SXM (Vera Rubin)
BF16 7500 TF · 288 GB · 2026
NVIDIA B300 SXM 288GB
BF16 3750 TF · 288 GB · 2025
NVIDIA GB300 NVL72
BF16 3750 TF · 288 GB · 2025
NVIDIA B200 SXM 180GB
BF16 2250 TF · 180 GB · 2024
NVIDIA GB200 NVL72
BF16 2250 TF · 192 GB · 2024
NVIDIA H200 SXM 141GB
BF16 989 TF · 141 GB · 2024
NVIDIA L40S
BF16 366 TF · 48 GB · 2023
NVIDIA H100 SXM5 80GB
BF16 989 TF · 80 GB · 2022
NVIDIA A100 SXM4 80GB
BF16 312 TF · 80 GB · 2020
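The memory figures above lend themselves to a quick capacity check before picking a card. A minimal sketch, assuming the common 2-bytes-per-parameter rule for BF16 weights and an assumed 85% usable-HBM fraction (KV cache and activations are ignored); the card keys and memory sizes are taken from the catalog above:

```python
# Rough check: do a model's BF16 weights fit in a node's aggregate HBM?
# Memory sizes come from the catalog; the 85% usable fraction and the
# 2-bytes-per-parameter rule are assumptions, and KV cache is ignored.
CARD_MEM_GB = {
    "h100-sxm5": 80,
    "h200-sxm": 141,
    "b200-sxm": 180,
    "a100-sxm4": 80,
    "l40s": 48,
}

def weights_fit(params_b: float, card: str, n_cards: int,
                bytes_per_param: int = 2, usable: float = 0.85) -> bool:
    """True if raw model weights fit in the usable aggregate HBM."""
    need_gb = params_b * bytes_per_param            # 1e9 params × bytes → GB
    have_gb = CARD_MEM_GB[card] * n_cards * usable  # usable aggregate HBM
    return need_gb <= have_gb

# Llama 3.3 70B in BF16 needs ~140 GB of weights; 8× A100 80GB gives
# ~544 GB usable, so it fits:
print(weights_fit(70, "a100-sxm4", 8))  # → True
```

This is why the 70B case below runs on eight 80 GB cards but a single L40S (48 GB) cannot hold the weights at all.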
Servers & Super-Pods (5)
NVIDIA GB200 NVL72 Rack
super-pod · 72 cards · scale-up 72
NVIDIA GB300 NVL72 Rack
super-pod · 72 cards · scale-up 72
NVIDIA DGX A100 8-GPU
integrated-server · 8 cards · scale-up 8
NVIDIA HGX H100 8-GPU
integrated-server · 8 cards · scale-up 8
NVIDIA HGX H200 8-GPU
integrated-server · 8 cards · scale-up 8
Deployment Cases (7)
- DeepSeek V4 Flash on 8× H100 SXM with vLLM FP8
  h100-sxm5 ×8 · deepseek-v4-flash · 4200 tok/s
- Llama 3.3 70B on 8× A100 SXM4 80GB with vLLM
  a100-sxm4 ×8 · llama-3.3-70b · 1480 tok/s
- Llama 4 Scout on 8× H100 SXM with vLLM (public benchmark)
  h100-sxm5 ×8 · llama-4-scout · 1850 tok/s
- Qwen2.5-Coder 32B on 4× L40S with vLLM (FP8)
  l40s ×4 · qwen2.5-coder-32b · 580 tok/s
- DeepSeek V4 Flash with disaggregated prefill (H100) + decode (H200) via Mooncake
  h200-sxm ×16 · deepseek-v4-flash · 9600 tok/s
- GLM-5.1 on 8× H200 SXM with vLLM BF16
  h200-sxm ×8 · glm-5.1 · 2400 tok/s
- Gemma 4 26B on 4× H100 SXM with FP8
  h100-sxm5 ×4 · gemma-4 · 6800 tok/s
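The aggregate tok/s figures above span 4- to 16-card setups, so dividing by card count gives a rough per-GPU comparison. A small sketch using only the numbers from the list; note the normalization conflates model size, precision, and serving stack, so it is illustrative, not a hardware benchmark:

```python
# Per-GPU decode throughput for the deployment cases above.
# (case label, GPU count, aggregate tok/s) — all values from the list.
cases = [
    ("deepseek-v4-flash / h100-sxm5", 8, 4200),
    ("llama-3.3-70b / a100-sxm4",     8, 1480),
    ("llama-4-scout / h100-sxm5",     8, 1850),
    ("qwen2.5-coder-32b / l40s",      4,  580),
    ("deepseek-v4-flash / h200-sxm", 16, 9600),
    ("glm-5.1 / h200-sxm",            8, 2400),
    ("gemma-4 / h100-sxm5",           4, 6800),
]

# Sort by per-GPU throughput, highest first.
for name, n, tps in sorted(cases, key=lambda c: c[2] / c[1], reverse=True):
    print(f"{name:32s} {tps / n:7.1f} tok/s per GPU")
```

On these numbers the small Gemma 4 26B case leads per GPU (1700 tok/s), while the big MoE and 70B-class cases cluster far lower, which is the expected effect of model size dominating per-card throughput.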