Sinolink: The liquid cooling era is arriving. Which beneficiary sectors are worth watching?

31/08/2025
GMT Eight
Sinolink has released a research report noting that NVIDIA will soon launch the NVIDIA DGX GB300, an AI server designed for the AI inference era. Its very high chip computing power pushes thermal design power (TDP) higher still, and traditional air cooling struggles to meet the cooling requirements at that level. The GB300 server adopts a fully liquid-cooled rack design, setting a new trend toward liquid cooling in AI servers. According to IDC's forecast, China's liquid-cooled server market is expected to grow at a compound annual growth rate of 46.8% from 2024 to 2029, reaching 16.2 billion US dollars in 2029. The report recommends focusing on the cold plate, immersion, and spray cooling links of the industrial chain.

Sinolink's main views are as follows:

What is the development path of liquid cooling technology?

As single-chip and cabinet-level power densities rise, traditional air cooling can no longer cool the equipment adequately. It is generally believed that AI clusters with a cabinet power density above 20 kW are not suited to air cooling, and NVIDIA GB300 OEM/ODM manufacturers have all adopted a fully liquid-cooled architecture. The main existing liquid cooling solutions are cold plate, immersion, and spray cooling. Cold plate technology is relatively mature and widely used, but its cooling capacity has a ceiling; immersion and spray cooling cool very effectively, but they cost more and place higher demands on materials.

What are the important industrial chain links of cold plate liquid cooling?

Cold plate liquid cooling is the mainstream solution for large-scale deployments and for retrofitting existing data centers.
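As a quick arithmetic check on the IDC figures quoted above, the 46.8% compound annual growth rate and the 16.2 billion US dollar size in 2029 can be used to back out the implied 2024 base. The sketch below assumes 2024 is the base year, so there are five compounding periods; the function names are illustrative and the year-by-year path is a smooth interpolation, not IDC data.

```python
# Back out the 2024 base implied by the IDC figures quoted in the report.
CAGR = 0.468              # 46.8% per year, 2024 to 2029 (from the article)
SIZE_2029_USD_BN = 16.2   # forecast 2029 market size, USD billions (from the article)
YEARS = 2029 - 2024       # five compounding periods (assumption: 2024 is the base year)


def implied_base(end_value: float, cagr: float, years: int) -> float:
    """Return the starting value that compounds to end_value at the given CAGR."""
    return end_value / (1.0 + cagr) ** years


def project(start_value: float, cagr: float, years: int) -> list[float]:
    """Return the year-by-year path, including the starting year."""
    return [start_value * (1.0 + cagr) ** n for n in range(years + 1)]


base_2024 = implied_base(SIZE_2029_USD_BN, CAGR, YEARS)
path = project(base_2024, CAGR, YEARS)

for year, size in zip(range(2024, 2030), path):
    print(f"{year}: {size:5.1f} bn USD")
```

Under these assumptions the implied 2024 market is roughly 2.4 billion US dollars, meaning the forecast has the market growing nearly sevenfold in five years.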
The principle of cold plate cooling is that the coolant picks up heat from the chips through the cold plate, exchanges it with the primary side in the CDU, and finally rejects it to the outside environment through cooling towers and similar facilities. A system usually consists of a coolant distribution unit (CDU), cold plates, circulation piping, universal quick disconnects (UQD), and manifolds.

1) Coolant distribution unit (CDU): the "heart" of the cold plate liquid cooling system. It transfers heat from the hot coolant returning on the secondary side to the primary-side coolant. Its performance depends mainly on the circulation pump (core parameters: flow rate, head, and power efficiency ratio) and on the heat exchanger.

2) Cold plate: the component that captures heat directly, transferring it from high heat-generating components such as server GPUs to the flowing coolant through a thermally conductive metal base. The flatness and micro-roughness of the cold plate's underside directly affect the thermal resistance at the interface with the chip, and the internal channel structure determines fluid-dynamic performance and heat capture efficiency. The core indicators of cold plate performance are pressure drop and thermal resistance; both depend on the microchannel design and can also be improved through phase change technology.

3) Cooling source: the final stage, where heat is transferred to the environment; it is the source of cooling for the whole system. Data centers generally look to natural cooling sources to save energy, but these depend heavily on local climate and resources, which introduces uncertainty.
To keep the cooling system running stably, a common approach is to switch between mechanical and natural cooling sources according to the local temperature at the data center and the required inlet temperature.

What are the important industrial chain links of immersion liquid cooling, and what changes compared with cold plate cooling?

Immersion liquid cooling fully submerges the heat-generating electronic components in a dielectric coolant, capturing and conducting heat quickly thanks to the liquid's high heat capacity and thermal conductivity. Its overall architecture is consistent with cold plate cooling; the secondary side consists of an immersion tank, coolant, a heat exchanger (CDU), and a circulation pump. Compared with cold plate cooling, immersion liquid cooling:

1) Has a simpler secondary side. A sealed immersion tank, circulation pump, and heat exchange unit are enough to cool hundreds or thousands of nodes uniformly, with no need to customize cold plates, manifolds, quick connectors, and other parts for high heat-generating components such as CPUs and GPUs.

2) Demands more care in coolant selection. Single-phase immersion requires high-boiling-point oil-based coolants so that the coolant does not vaporize as it heats up, while two-phase immersion typically uses low-boiling-point fluorinated liquids. Given international environmental trends, however, data centers may face restrictions on using fluorinated liquids in large quantities as coolants.

3) Raises compatibility requirements for data center IT equipment, including traditional mechanical hard disk drives (HDD) and solid-state drives (SSD).
Currently, HDDs remain the main storage medium in data centers, but HDDs cannot operate directly in coolant, whereas SSDs are more compatible.

What are the important industrial chain links of spray cooling, and what changes compared with cold plate cooling?

Spray cooling sprays coolant directly onto the surfaces of high heat-generating components such as server CPUs and GPUs, or onto heat-conducting materials attached to them, to capture and conduct heat efficiently at the point where it is generated. Its secondary side consists of a heat exchanger (CDU), manifold, spray liquid-cooled cabinet, and circulation pump. Compared with cold plate cooling, spray cooling:

1) Operates without phase change; the coolant stays liquid throughout the loop. Because the coolant contacts the chips directly, it must have a high boiling point and be electrically insulating, thermally conductive, and oxidation-resistant.

2) Allows precise point-to-point delivery. The liquid distribution plate can be designed around the location and heat output of server components, so that coolant is sprayed onto the heat-generating components on demand and at a controllable flow rate.

What upstream directions are worth paying attention to?

1) Electronic fluorinated liquids: In immersion and spray cooling the coolant contacts electronic components directly, so the requirements for insulation and corrosion resistance are high; the materials that currently meet them are mainly electronic fluorinated liquids. These offer high insulation, non-flammability, low toxicity, non-corrosiveness, and good thermal and chemical stability, and their boiling point can be tuned by adjusting the component ratio.
2) High-efficiency TIM material, liquid metal: For some low-power heat-generating components, immersion cooling, which cools everything at high efficiency, is not economical. The report predicts that the mainstream direction for liquid cooling will be immersion cooling for high-power CPUs and two-phase cold plates for other heat-generating components. The thermal interface material (TIM) applied between the heat sink and the heat-generating device greatly improves the heat dissipation performance of a two-phase cold plate. Because silicone-based materials are less reliable and prone to phase separation, which sharply reduces thermal conductivity, the main thermal interface material for current electronic chips has shifted to phase-change materials. Some high-end chip products use phase-change metals with the highest thermal conductivity, such as indium and gallium. Liquid metal TIMs match the thermal requirements of GPUs and CPUs well and can be used with both cold plate and immersion cooling. They are currently the most suitable thermal interface material for high-end chips and are expected to be one of the mainstream choices in the future.

What liquid cooling-related targets are worth paying attention to?

1) Liquid cooling total solution providers: Shenzhen Envicool Technology (002837.SZ), Guangdong Shenling Environmental Systems (301018.SZ), Sanhe Tongfei Refrigeration (300990.SZ), Guangzhou Goaland Energy Conservation Tech (300499.SZ), etc.

2) Segment component suppliers with high value-add or a higher localization rate: Shenzhen Cotran New Material (300731.SZ), Sichuan Chuanhuan Technology (300547.SZ), Feilong Auto Components (002536.SZ), Shenzhen FRD Science & Technology (300602.SZ), etc.
3) Upstream materials: Zhejiang Juhua (600160.SH), Shenzhen Capchem Technology (300037.SZ), Guangdong Hec Technology Holding (600673.SH), Zhejiang Yonghe Refrigerant (605020.SH), Darbond Technology (688035.SH), etc.

Risk warning

Computing power demand falls short; chip supply capacity is insufficient; international trade frictions; downstream demand growth, technological development, AI development, or project progress fall short of expectations; market competition intensifies; work safety and environmental protection risks; AI commercial value falls short of expectations; high supply chain concentration; tighter industry regulation.
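For readers who want to put rough numbers on the cold plate parameters named earlier (pump flow rate, pressure drop, and thermal resistance), the sketch below applies three standard steady-state relations: the sensible-heat balance Q = m_dot * c_p * dT, a lumped thermal resistance T = T_in + Q * R_th, and ideal hydraulic power as flow times pressure drop. It assumes a water-like single-phase coolant. Every input in the example (1,000 W chip, 10 K coolant rise, 40 °C inlet, 0.03 K/W, 50 kPa) is an illustrative assumption, not a figure from the report, and the function names are hypothetical.

```python
# Illustrative steady-state estimates for a single-phase cold plate loop.
# All numeric inputs below are assumed example values, not data from the report.

WATER_CP_J_PER_KG_K = 4186.0   # specific heat of water, J/(kg*K)
WATER_DENSITY_KG_PER_L = 1.0   # approximate density of water, kg/L


def required_flow_lpm(heat_w: float, delta_t_k: float) -> float:
    """Coolant flow (L/min) needed to carry heat_w with a coolant rise of delta_t_k.

    Uses the sensible-heat balance Q = m_dot * c_p * dT.
    """
    mass_flow_kg_s = heat_w / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_DENSITY_KG_PER_L * 60.0


def chip_temperature_c(inlet_c: float, heat_w: float, r_th_k_per_w: float) -> float:
    """Chip case temperature from a lumped thermal resistance: T = T_in + Q * R_th."""
    return inlet_c + heat_w * r_th_k_per_w


def pump_hydraulic_power_w(flow_lpm: float, pressure_drop_kpa: float) -> float:
    """Ideal hydraulic power: volumetric flow (m^3/s) times pressure drop (Pa)."""
    flow_m3_s = flow_lpm / 1000.0 / 60.0
    return flow_m3_s * pressure_drop_kpa * 1000.0


# Assumed example: one 1,000 W chip, 10 K coolant rise, 40 C inlet,
# 0.03 K/W cold plate thermal resistance, 50 kPa pressure drop.
flow = required_flow_lpm(1000.0, 10.0)
chip_temp = chip_temperature_c(40.0, 1000.0, 0.03)
pump_w = pump_hydraulic_power_w(flow, 50.0)

print(f"flow: {flow:.2f} L/min, chip: {chip_temp:.1f} C, pump: {pump_w:.2f} W")
```

The relations show why the two indicators pull against each other: a finer microchannel structure can lower thermal resistance and therefore chip temperature, but it usually raises pressure drop and with it the pumping power the CDU must supply.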