Peking University and DeepSeek jointly open-source the DSpark framework, improving generation speed by over 60% under high concurrency.

27/06/2026

DeepSeek, in collaboration with Peking University, has officially released DSpark, an inference acceleration framework aimed at the efficiency bottleneck that large language models face in high-concurrency production environments. The framework is already deployed in the serving engines behind the preview versions of DeepSeek-V4-Flash and DeepSeek-V4-Pro. Compared with MTP-1, the single-token inference baseline previously used in production, it increases per-user generation speed by 60% to 85% at the same throughput level. The related papers, training code, and model checkpoints have been open-sourced in the DeepSpec project on GitHub.
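
To make the reported 60% to 85% gain concrete, the short Python sketch below converts a relative speedup into tokens per second and per-token latency for a single user. The baseline of 50 tokens/s is a hypothetical figure chosen only for illustration, not a number from the release, and the helper names `speedup_range` and `ms_per_token` are likewise made up for this example.

```python
def speedup_range(baseline_tps, low_gain=0.60, high_gain=0.85):
    """Return (low, high) tokens/s after applying the reported relative gains.

    The default gains are the 60% and 85% figures quoted in the article;
    baseline_tps is whatever per-user speed the baseline achieves.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return baseline_tps * (1 + low_gain), baseline_tps * (1 + high_gain)


def ms_per_token(tps):
    """Convert a generation speed in tokens/s to milliseconds per token."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return 1000.0 / tps


if __name__ == "__main__":
    baseline = 50.0  # hypothetical per-user baseline speed, NOT from the article
    low, high = speedup_range(baseline)
    print(f"baseline:  {baseline:.1f} tok/s, {ms_per_token(baseline):.1f} ms/token")
    print(f"+60% gain: {low:.1f} tok/s, {ms_per_token(low):.1f} ms/token")
    print(f"+85% gain: {high:.1f} tok/s, {ms_per_token(high):.1f} ms/token")
```

Because the comparison is stated at the same throughput level, the gain applies to what each individual user sees rather than to the total number of tokens the service produces per second.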