AI chip frenzy squeezes supply of supercomputing chips: US national laboratories seek alternatives, opening the door to startups.
In recent years, as artificial intelligence (AI) technology has grown explosively, mainstream chip giants such as NVIDIA Corporation (NVDA.US) and AMD (AMD.US) have shifted their research and development focus and production capacity toward lucrative low-precision AI computing. This strategic shift is setting off an unexpected chain reaction: US national laboratories are struggling to source chips that meet their high-precision scientific computing needs and are turning to emerging chip startups. According to reports, Sandia National Laboratories, located at Kirtland Air Force Base in New Mexico, is testing chips from the Israeli startup NextSilicon in search of a way around its supply chain difficulties.
Big companies are turning to AI, leaving high-precision computing needs neglected
Sandia National Laboratories is one of the three major US laboratories responsible for nuclear weapons research and maintenance. The liquid-cooled supercomputer at its facility continuously runs extremely complex simulations, from the trajectory of hypersonic nuclear weapons passing through the atmosphere to scenarios in which a nuclear warhead detonates near another warhead. Over the past decade, the chips handling these highly classified and demanding workloads have come primarily from mainstream semiconductor companies such as NVIDIA Corporation and AMD.
However, Steve Monk, head of the high-performance computing team at Sandia National Laboratories, said that as mainstream chip companies increasingly design their products for AI and face supply chain shortages, the laboratory is under growing pressure to acquire chips that meet its high-precision scientific computing needs. This dual pressure, on supply and on computing power, is raising concerns about the team's ability to deliver on future tasks.
The core issue is a technical metric called double-precision floating-point (FP64) performance. Scientific calculations such as nuclear weapons physics simulations require chips that can handle extremely large and extremely small numbers in the same computation without losing accuracy. For many years, NVIDIA Corporation and AMD competed to lead in accelerating such calculations, winning numerous supercomputing contracts from universities and government laboratories. AI training and inference, however, do not rely on double-precision calculations, which has tilted the balance of chip design.
FP64 underpins modern aircraft flight, rocket launches, vaccine development, and even the reliable operation of nuclear weapons. A 64-bit format can represent more than 18.44 quintillion unique values (2^64 bit patterns), making it the "gold standard" of scientific computing. In contrast, modern AI models typically train with FP8 precision, which can represent only 256 unique values (2^8).
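To make the precision gap concrete, here is a minimal Python sketch. It is an illustration only, not code from Sandia or any chip vendor: the helper names `bit_patterns` and `round_to_float32` are hypothetical, and FP32 stands in for the lower-precision formats because the Python standard library has no FP8 type. The counts below are raw bit patterns; a few of those patterns encode special values such as NaN, so the number of distinct numeric values is slightly smaller.

```python
import struct

def bit_patterns(width_bits: int) -> int:
    """Number of distinct bit patterns in a format of the given width."""
    return 2 ** width_bits

def round_to_float32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Figures quoted in the article: 64-bit versus 8-bit formats.
print(bit_patterns(64))  # 18446744073709551616, about 18.44 quintillion
print(bit_patterns(8))   # 256

# Adding a very small number to a much larger one.
big, tiny = 1.0, 1e-10
fp64_sum = big + tiny                    # Python floats are FP64
fp32_sum = round_to_float32(big + tiny)  # same sum, rounded to FP32

print(fp64_sum != big)  # True: FP64 keeps the small contribution
print(fp32_sum == big)  # True: FP32 rounds it away entirely
```

In a long-running physics simulation, small contributions like `tiny` are added many times over, so a format that silently drops them can drift away from the correct result. That is the kind of accuracy loss the laboratories are trying to avoid.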
While NVIDIA Corporation's latest Rubin GPU delivers a major leap in AI computing power, with inference performance of 50 petaFLOPS, 2.5 times that of the previous Blackwell generation, its peak FP64 performance is about 33 teraFLOPS, 1 teraFLOPS lower than that of the H100 released four years ago. NVIDIA Corporation has introduced software emulation of FP64 based on the Ozaki scheme, claiming up to 200 teraFLOPS of matrix performance in its CUDA libraries, 4.4 times that of hardware performance, but AMD has raised doubts. AMD researcher Nicholas Malaya pointed out that this emulation method performs well in some benchmarks, but that its reliability in real physical simulations such as materials science or combustion codes is questionable, citing issues such as incomplete IEEE compliance and doubled memory consumption.
Ian Cutress, chief analyst at chip consulting firm More Than Moore, noted that by some metrics NVIDIA Corporation's upcoming Rubin chip may show a decline in double-precision performance, which is worrying many scientists in the high-performance computing field.
Startups seizing the opportunity
The chip giants' strategic shift is opening up market space for emerging companies such as NextSilicon. Founded in 2017, the Israeli startup has spent eight years on research and development and has raised approximately $303 million across a seed round and three further funding rounds, with its valuation at one point reaching approximately $1.5 billion.
Unlike the traditional GPU- or CPU-based approaches of NVIDIA Corporation and AMD, NextSilicon's flagship chip, the Maverick-2, uses an intelligent data-flow architecture: software-defined data-flow hardware that can be dynamically reconfigured and optimized at runtime, allowing the chip to be reprogrammed in real time to run more efficiently. In terms of power efficiency, the data-flow architecture reduces the time and energy spent moving data between compute units and system memory.
James Laros, a senior scientist at Sandia National Laboratories responsible for testing new computing architectures, praised NextSilicon's performance results, saying they show real potential to increase computing capability without significant code modifications.
On Monday, Sandia National Laboratories, NextSilicon, and Penguin Solutions, which helped integrate NextSilicon chips into supercomputers, jointly announced that a supercomputer system built with NextSilicon chips has passed a series of key technical milestones in general supercomputing tests. This qualifies it for further testing this fall on more demanding computational tasks that are closer to nuclear safety work.
Laros said the laboratory actively partners with small and medium-sized chip companies such as NextSilicon with the core goal of building a diversified chip procurement pipeline, ensuring a steady supply of computing chips suited to its research tasks even as the strategic priorities of the top chip companies shift.
"We must maintain available choices to fulfill our mission, because this mission has no turning back," Laros emphasized.