Debon Securities: A Technical Perspective on the Road to a Comeback for Domestic Large Models
05/07/2024
GMT Eight
Debon Securities released a research report stating that the overseas large-model throne has changed hands three times, with leading companies taking turns at the top of the performance rankings. They compete on overall performance and multimodal interaction, and the dimensions of comparison have been upgraded from the model alone to the model-plus-device stack and the cross-device user experience. At present, domestic large models still show a large gap in science and reasoning tasks, but by focusing on long texts they are showing signs of catching up with GPT. The three major difficulties of long texts are the computational complexity of the attention mechanism, context memory, and the maximum-context-length constraint. From a technical standpoint, falling training and inference costs are driving prices down; top companies are cutting prices and iterating to improve competitiveness, paving the way for a comeback.
The overseas large-model throne has changed hands three times, with leading companies taking turns at the top of the performance rankings, competing on overall performance and multimodal interaction.
Three changes of the throne: the first title-holder, GPT-4o, revolutionized itself and kept refreshing overall performance; the second, Google Gemini, delivered more extreme context understanding and lower latency; the current leader, Claude 3.5, focuses on visual and interactive experience.
The high ground of large-model competition: understanding and responding to multimodal interaction, with native multimodal technology as the battleground. A model's effectiveness depends on multimodal understanding and generation; millisecond-level responses demonstrate more advanced visual and audio understanding, including intelligent perception of tone of voice. End-to-end native multimodal technology built on a single unified neural network is the main point of competition.
The comparison dimensions for large models have been upgraded: from the model alone to the model-plus-device stack and the cross-device user experience. For example, Google launched its AI agent Project Astra: point a phone or smart glasses at nearby objects, ask Project Astra a question, and it returns an accurate answer almost instantly.
The road to the domestic large-model comeback: focus on long texts, cut prices, and iterate to improve competitiveness.
Text first, reasoning later: there is currently a significant gap in science and reasoning tasks, but by focusing on long texts, domestically produced large models such as Tongyi Qianwen, Kimi, and Shan Hai have shown signs of catching up with GPT.
The three major difficulties of long texts: the computational complexity of the attention mechanism, context memory, and the maximum-context-length constraint.
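To make the first of these difficulties concrete, below is a minimal NumPy sketch of standard single-head self-attention (illustrative only, not any vendor's implementation; the dimensions n and d are assumed toy values). The score matrix it builds is n by n, so compute and memory grow quadratically with context length, which is why long-text support is costly; the context-memory difficulty compounds this, since cached keys and values also accumulate with every token processed.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over n token embeddings (x has shape (n, d)).

    The score matrix is n x n, so compute and memory grow quadratically
    with sequence length -- the long-text bottleneck named above.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # each (n, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (n, n): the O(n^2) term
    scores -= scores.max(axis=-1, keepdims=True)     # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                               # (n, d)

# Toy dimensions (assumptions for illustration only):
n, d = 2048, 64
rng = np.random.default_rng(0)
x = rng.standard_normal((n, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)                          # (2048, 64)
print(f"score matrix: {n * n:,} entries")  # doubling n quadruples this
```

This quadratic cost is the main target of long-context techniques such as sparse or linearized attention; the report does not specify which approach each domestic vendor uses.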
Commercial price cuts and accelerated iteration lead the way forward. Leading companies such as Zhipu AI, ByteDance, Alibaba, Tencent, Baidu, and iFlytek keep iterating while lowering prices, whereas startups such as Baichuan Intelligence, Moonshot AI, and 01.AI have not joined the price-cutting trend. From a technical standpoint, the decline in training and inference costs is what drives prices down.
Investment advice: it is recommended to focus on (1) domestic large-model vendors: iFlytek Co., Ltd., SenseTime, CloudWalk Technology, Grin Technologies, TRS Information Technology, Kunlun Tech, Dark Horse Technology Group, etc.; and (2) application targets that integrate top large models: Beijing Kingsoft Office Software, Inc., Wondershare Technology Group, Fujian Foxit Software Development Joint Stock, ArcSoft Corporation, Richinfo Technology, Focus Technology, Shanghai Runda Medical Technology, Shenzhen Kingdom Sci-Tech, Weaver Network Technology, and KINGDEE INT'L; also watch targets related to Kimi.
Risk Alert: overseas large models are trending toward closed source, and the technology gap of domestic large models may widen; domestic large models have not yet reached the singularity of commercial usability in overall performance, and iteration may slow for lack of computing-power support; the technological roadmaps of domestic large models diverge, creating uncertainty about the future direction of development.