<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>EvoKernel Spec: Deployment Cases</title><description>Measured data feed of AI inference deployment cases (hardware × model × engine)</description><link>https://evokernel.dev/</link><language>en</language><item><title>DeepSeek R1 on 16× Ascend 910B with MindIE</title><link>https://evokernel.dev/cases/case-dsr1-asc910bx16-mindie-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-dsr1-asc910bx16-mindie-001/</guid><description>ascend-910b ×16 · deepseek-r1 · mindie · bf16 · decode 850 tok/s · TTFT p50 280ms · bottleneck memory-bandwidth</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate></item><item><title>DeepSeek V4 Flash on 8×H100 SXM with vLLM FP8</title><link>https://evokernel.dev/cases/case-dsv4-flash-h100x8-vllm-fp8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-dsv4-flash-h100x8-vllm-fp8-001/</guid><description>h100-sxm5 ×8 · deepseek-v4-flash · vllm · fp8-e4m3 · decode 4200 tok/s · TTFT p50 220ms · bottleneck memory-bandwidth</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate></item><item><title>DeepSeek V4 Pro on Huawei CloudMatrix 384 with MindIE</title><link>https://evokernel.dev/cases/case-dsv4pro-cm384-mindie-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-dsv4pro-cm384-mindie-001/</guid><description>ascend-910c ×384 · deepseek-v4-pro · mindie · bf16 · decode 2400 tok/s · TTFT p50 380ms · bottleneck memory-bandwidth</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Llama 3.3 70B on 8× A100 SXM4 80GB with vLLM</title><link>https://evokernel.dev/cases/case-llama33-a100x8-vllm-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-llama33-a100x8-vllm-001/</guid><description>a100-sxm4 ×8 · llama-3.3-70b · vllm · bf16 · decode 1480 tok/s · TTFT p50 220ms · bottleneck memory-bandwidth</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Llama 4 Scout on 8×H100 SXM with vLLM (public 
benchmark)</title><link>https://evokernel.dev/cases/case-llama4-scout-h100x8-vllm-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-llama4-scout-h100x8-vllm-001/</guid><description>h100-sxm5 ×8 · llama-4-scout · vllm · bf16 · decode 1850 tok/s · TTFT p50 145ms · bottleneck memory-bandwidth</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Qwen2.5-Coder 32B on 4× L40S with vLLM (FP8)</title><link>https://evokernel.dev/cases/case-qwencoder-l40sx4-vllm-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-qwencoder-l40sx4-vllm-001/</guid><description>l40s ×4 · qwen2.5-coder-32b · vllm · fp8-e4m3 · decode 580 tok/s · TTFT p50 480ms · bottleneck memory-bandwidth</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate></item><item><title>DeepSeek V4 Flash with disaggregated prefill (H100) + decode (H200) via Mooncake</title><link>https://evokernel.dev/cases/case-dsv4flash-disagg-h100-h200-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-dsv4flash-disagg-h100-h200-001/</guid><description>h200-sxm ×16 · deepseek-v4-flash · sglang · fp8-e4m3 · decode 9600 tok/s · TTFT p50 320ms · bottleneck memory-bandwidth</description><pubDate>Mon, 27 Apr 2026 00:00:00 GMT</pubDate></item><item><title>GLM-5.1 on 8× H200 SXM with vLLM BF16</title><link>https://evokernel.dev/cases/case-glm51-h200x8-vllm-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-glm51-h200x8-vllm-001/</guid><description>h200-sxm ×8 · glm-5.1 · vllm · bf16 · decode 2400 tok/s · TTFT p50 280ms · bottleneck memory-bandwidth</description><pubDate>Sun, 26 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Qwen3.6 Plus on 8× MI325X with SGLang FP8</title><link>https://evokernel.dev/cases/case-qwen36-mi325x8-sglang-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-qwen36-mi325x8-sglang-001/</guid><description>mi325x ×8 · qwen3.6-plus · sglang · fp8-e4m3 · decode 3100 tok/s · TTFT p50 240ms · bottleneck 
memory-bandwidth</description><pubDate>Sun, 26 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Llama 4 Maverick on TPU Trillium (v6e) 256-chip pod</title><link>https://evokernel.dev/cases/case-llama4mvk-trillium-256-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-llama4mvk-trillium-256-001/</guid><description>trillium ×256 · llama-4-maverick · vllm · bf16 · decode 5800 tok/s · TTFT p50 180ms · bottleneck compute</description><pubDate>Sat, 25 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Llama 4 Scout on 8× Hygon DCU K100 with vLLM</title><link>https://evokernel.dev/cases/case-llama4scout-dcuk100x8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-llama4scout-dcuk100x8-001/</guid><description>dcu-k100 ×8 · llama-4-scout · vllm · bf16 · decode 850 tok/s · TTFT p50 320ms · bottleneck software</description><pubDate>Sat, 25 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Qwen3.5 397B Reasoning on 8× MI355X with FP4</title><link>https://evokernel.dev/cases/case-qwen35-397b-mi355x8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-qwen35-397b-mi355x8-001/</guid><description>mi355x ×8 · qwen3.5-397b · vllm · fp4 · decode 4500 tok/s · TTFT p50 220ms · bottleneck memory-bandwidth</description><pubDate>Fri, 24 Apr 2026 00:00:00 GMT</pubDate></item><item><title>DeepSeek V4 Flash on 16× MTT S4000 (Moore Threads KUAE)</title><link>https://evokernel.dev/cases/case-dsv4flash-mtts4000x16-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-dsv4flash-mtts4000x16-001/</guid><description>mtt-s4000 ×16 · deepseek-v4-flash · vllm · fp16 · decode 320 tok/s · TTFT p50 540ms · bottleneck software</description><pubDate>Thu, 23 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Kimi K2.6 on 16× Cambricon MLU590 (with vLLM port)</title><link>https://evokernel.dev/cases/case-kimik26-mlu590x16-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-kimik26-mlu590x16-001/</guid><description>mlu590 ×16 · kimi-k2.6 · vllm · bf16 · 
decode 480 tok/s · TTFT p50 460ms · bottleneck software</description><pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Llama 4 Scout on 8× MI300X with vLLM BF16</title><link>https://evokernel.dev/cases/case-llama4scout-mi300x8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-llama4scout-mi300x8-001/</guid><description>mi300x ×8 · llama-4-scout · vllm · bf16 · decode 2200 tok/s · TTFT p50 158ms · bottleneck memory-bandwidth</description><pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Qwen3.6 Plus on 8× Cambricon MLU590 with LMDeploy</title><link>https://evokernel.dev/cases/case-qwen36plus-mlu590x8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-qwen36plus-mlu590x8-001/</guid><description>mlu590 ×8 · qwen3.6-plus · lmdeploy · int8 · decode 380 tok/s · TTFT p50 580ms · bottleneck software</description><pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Gemma 4 26B on 4× H100 SXM with FP8</title><link>https://evokernel.dev/cases/case-gemma4-h100x4-fp8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-gemma4-h100x4-fp8-001/</guid><description>h100-sxm5 ×4 · gemma-4 · tensorrt-llm · fp8-e4m3 · decode 6800 tok/s · TTFT p50 95ms · bottleneck compute</description><pubDate>Tue, 21 Apr 2026 00:00:00 GMT</pubDate></item><item><title>GLM-5.1 on 8× Biren BR104 (export-control variant)</title><link>https://evokernel.dev/cases/case-glm51-br104x8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-glm51-br104x8-001/</guid><description>br104 ×8 · glm-5.1 · vllm · int8 · decode 240 tok/s · TTFT p50 720ms · bottleneck software</description><pubDate>Mon, 20 Apr 2026 00:00:00 GMT</pubDate></item><item><title>GPT-OSS on 8× Intel Gaudi 3 with vLLM</title><link>https://evokernel.dev/cases/case-gptoss-gaudi3x8-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-gptoss-gaudi3x8-001/</guid><description>gaudi-3 ×8 · gpt-oss · vllm · fp8-e4m3 · decode 2900 tok/s · TTFT p50 140ms · bottleneck 
memory-bandwidth</description><pubDate>Mon, 20 Apr 2026 00:00:00 GMT</pubDate></item><item><title>DeepSeek V3 on AWS Trainium 2 (64-chip Trn2 instance)</title><link>https://evokernel.dev/cases/case-dsv3-trainium2-x64-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-dsv3-trainium2-x64-001/</guid><description>trainium-2 ×64 · deepseek-r1 · vllm · bf16 · decode 3600 tok/s · TTFT p50 320ms · bottleneck memory-bandwidth</description><pubDate>Sun, 19 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Gemma 4 on 4× MetaX 曦云 C500 with INT8</title><link>https://evokernel.dev/cases/case-gemma4-c500x4-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-gemma4-c500x4-001/</guid><description>metax-c500 ×4 · gemma-4 · vllm · int8 · decode 580 tok/s · TTFT p50 420ms · bottleneck memory-bandwidth</description><pubDate>Sat, 18 Apr 2026 00:00:00 GMT</pubDate></item><item><title>DeepSeek R1 on 16× Iluvatar 天垓 100 (Iluvatar IxRT)</title><link>https://evokernel.dev/cases/case-dsr1-tianhe100x16-001/</link><guid isPermaLink="true">https://evokernel.dev/cases/case-dsr1-tianhe100x16-001/</guid><description>iluvatar-bi ×16 · deepseek-r1 · lmdeploy · int8 · decode 220 tok/s · TTFT p50 980ms · bottleneck software</description><pubDate>Wed, 15 Apr 2026 00:00:00 GMT</pubDate></item></channel></rss>