Yang Zhilin discloses Kimi's technical roadmap at GTC: betting on token efficiency, long context, and agent clusters

18/03/2026

At the 2026 NVIDIA GTC conference, Yang Zhilin, founder of Moonshot AI, the company behind Kimi, gave a public speech. He argued that continuing to raise the ceiling of large-model intelligence requires rebuilding foundational components such as optimizers, attention mechanisms, and residual connections. Following the official release of Kimi K2.5 at the end of January this year, Yang used the speech to lay out systematically, for the first time, the technical roadmap behind the model. He summarized Kimi's evolution as three mutually reinforcing dimensions: token efficiency, long context, and agent clusters. "Scaling today is no longer just about stacking resources; it is also about seeking economies of scale in computing efficiency, long-term memory, and automated collaboration at the same time. If we can multiply the technical gains from these three dimensions, the model will demonstrate intelligence far beyond the current state." He also predicted that the future form of intelligence will evolve from a single agent into dynamically generated clusters of agents.