China's DeepSeek AI Model Shocks the World: Should You Sell Your Nvidia Stock?

Could Nvidia's (NASDAQ: NVDA) magical two-year run be coming to an end? Up until now, there has been insatiable demand for Nvidia's latest and greatest graphics processing units (GPUs). As the artificial intelligence race heated up, big tech companies and start-ups alike rushed to buy or rent as many of Nvidia's high-performance GPUs as they could in a bid to create better and better models.

But last week, Chinese AI start-up DeepSeek released its R1 model that stunned the technology world. R1 is a "reasoning" model that has matched or exceeded OpenAI's o1 reasoning model, which was just released at the beginning of December, for a fraction of the cost.

Being able to generate leading-edge large language models (LLMs) with limited computing resources could mean that AI companies might not need to buy or rent as much high-cost computing power in the future. The consequences could be devastating for Nvidia and last year's other AI winners alike.

But as always, the truth is more complicated.

What is DeepSeek?

DeepSeek is an AI lab spun out of a quantitative hedge fund called High-Flyer. CEO Liang Wenfeng founded High-Flyer in 2015 and began the DeepSeek venture in 2023 after the earth-shaking debut of ChatGPT.

DeepSeek has been building AI models ever since, reportedly purchasing 10,000 Nvidia A100 chips, which are two generations prior to the current Blackwell chip, before they were restricted for export. DeepSeek also reportedly has a cluster of Nvidia H800s, a capped, or slowed, version of the Nvidia H100 designed for the Chinese market. Of note, the H100 was the latest generation of Nvidia GPUs prior to the recent launch of Blackwell.

R1 shocks the world

On Jan. 20, DeepSeek released R1, its first "reasoning" model based on its V3 LLM. Reasoning models are relatively new and use a technique called reinforcement learning, which essentially pushes an LLM to go down a chain of thought, backtrack when it runs into a "wall," and explore alternative approaches before arriving at a final answer. Reasoning models can therefore answer complex questions with more precision than straight question-and-answer models can.
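To make the idea concrete, the explore-hit-a-wall-backtrack loop can be sketched as a toy search. This is purely an illustration of the chain-of-thought-with-backtracking pattern, not DeepSeek's actual reinforcement-learning code; the arithmetic "steps" here are invented for the example.

```python
# Toy illustration of reasoning-style search: extend a chain of thought one
# step at a time, backtrack on a dead end ("wall"), and try an alternative
# branch until an answer is reached. The "steps" are simple arithmetic moves.

def solve(value, target, chain, max_depth=8):
    """Return a chain of steps turning `value` into `target`, or None."""
    if value == target:
        return chain                      # reached a final answer
    if len(chain) >= max_depth or value > target:
        return None                       # hit a wall: backtrack
    for label, step in (("+3", lambda v: v + 3), ("*2", lambda v: v * 2)):
        result = solve(step(value), target, chain + [label], max_depth)
        if result is not None:
            return result                 # this branch led to an answer
    return None                           # no branch worked from here

print(solve(1, 10, []))  # prints ['+3', '+3', '+3'], since 1+3+3+3 = 10
```

A real reasoning model does something loosely analogous in natural language, with reinforcement learning rewarding the chains of thought that end in correct answers.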

Incredibly, R1 has been able to meet or even exceed OpenAI's o1 on several benchmarks, while reportedly being trained at a small fraction of the cost.

Just how cheap are we talking? The R1 paper claims the model was trained on the equivalent of just $5.6 million in rented GPU hours, a small fraction of the hundreds of millions reportedly spent by OpenAI and other U.S.-based leaders. DeepSeek also charges about one-thirtieth of what it costs to run OpenAI's o1, while Liang maintains DeepSeek charges a "small profit" above its costs. Experts have estimated that Meta Platforms' (NASDAQ: META) Llama 3.1 405B model cost about $60 million in rented GPU hours to train, compared with the $6 million or so for V3, even as V3 outperformed Llama's latest model on a variety of benchmarks.
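Running the quick math on the figures quoted above (all of which are the reported estimates in this article, not audited numbers) shows just how wide the gap is:

```python
# Back-of-the-envelope comparison using the cost estimates quoted above.
# These are the article's reported figures, not verified accounting.

deepseek_v3_cost = 5.6e6   # R1 paper's claimed training cost in rented GPU hours
llama_405b_cost = 60e6     # experts' estimate for Llama 3.1 405B

ratio = llama_405b_cost / deepseek_v3_cost
print(f"Llama 3.1 405B cost roughly {ratio:.1f}x more to train than V3")
```

By these numbers, V3's claimed training bill is roughly a tenth of Llama 3.1 405B's, which is the gap that rattled markets.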