DeepSeek's R1 model training costs pour cold water on big tech's massive AI spending
Briefly

"In mid-2024, Anthropic CEO Dario Amodei projected AI training costs to soar to such an extent that building a new model could cost upwards of $100 billion. Amodei's lofty claims appear to have been completely shot down with the publication of a new research paper from DeepSeek. In a recent paper, published in the academic journal Nature, the Chinese AI developer claims it spent a paltry amount training its flagship R1 model."
"All told, training costs amounted to $294,000, with the company using 512 Nvidia H800 chips to build the model that had US companies sweating earlier this year. It's worth noting that these costs come in addition to around $6 million spent by the firm to create the base LLM R1 is built on. Regardless, the results are impressive given the far higher training costs associated with competing models."
DeepSeek reported spending $294,000 to train its flagship R1 model using 512 Nvidia H800 chips, on top of about $6 million to develop the base LLM that R1 builds on. R1 is a reasoning model optimized for mathematics and coding, released as an open-weight model, and has been downloaded over 10 million times on Hugging Face. The model was trained on real-world data using reinforcement learning that rewarded correct answers, an approach that helped keep training costs down. The reported expense is far lower than the training costs commonly associated with competing large models.
Read at IT Pro