
"R1 is a 'reasoning' large language model (LLM) that excels at solving complex tasks - such as in mathematics and coding - by breaking them down into steps. It was the first of its kind to be released as open weight, meaning that the model can be downloaded and built on for free, so has been a boon for researchers who want to adapt algorithms to their own field."
"Despite R1 having many capabilities that are on a par with the best US models, including those powering ChatGPT, its training costs were much less than those of rival companies, say AI experts. Training costs for Meta's Llama 3 405B model, for example, were more than ten times greater. DeepSeek's bid for transparency extended to publishing the details of how it built and trained R1"
DeepSeek released R1 in January, a reasoning large language model that breaks complex tasks into stepwise solutions. Liang Wenfeng, who had earned substantial returns applying AI to finance, founded DeepSeek in Hangzhou in 2023. R1 was released with open weights, allowing free download and adaptation, and has aided researchers across fields. It matches many capabilities of top US models while incurring far lower training costs. DeepSeek published detailed training information and submitted R1 to peer review, an approach that prompted other firms in China and the United States to release their own open models.
Read at Nature