DeepSeek Releases Its Latest Paper, With Liang Wenfeng Listed as an Author
On June 27, just over ten days after securing 50 billion in financing in mid-June, the DeepSeek team, in collaboration with Peking University, released a paper titled "DSpark: Confidence-Scheduled Speculative Decoding with Semi-Autoregressive Generation." Rather than a simple iteration of the model version, the work adds a speculative decoding module to the existing DeepSeek-V4-Pro and DeepSeek-V4-Flash models, with a focus on optimizing the engineering implementation. Alongside DSpark, the team open-sourced DeepSpec, an all-in-one codebase for training and evaluating speculative decoding draft models. DeepSpec includes data preparation tools, draft model implementations, training code, and evaluation scripts, all under the MIT license, and currently supports three implementations: DSpark, DFlash, and Eagle3. Notably, DeepSeek founder Liang Wenfeng is listed as one of the paper's authors. Even after the company completed its first round of financing, he continues to take part personally in writing technical papers, which is uncommon in the AI industry.

