DeepSeek delayed by GPU export restrictions
Briefly

DeepSeek's upcoming R2 model has been delayed by a lack of access to Nvidia GPUs. Despite months of development, CEO Liang Wenfeng remains dissatisfied with the project's progress. The company made headlines earlier in the year with its R1 reasoning model, which proved competitive with elite Western models but was itself trained on a substantial number of GPUs. Compounding the challenge, new U.S. sanctions restrict the export of certain GPUs to China, which limits DeepSeek's options because many of the available units are already in use by its clients.
Development of DeepSeek's R2 model has reportedly stalled because the company cannot get access to enough Nvidia GPUs.
According to The Information, DeepSeek trained R1 on a cluster of 50,000 Hopper GPUs, including approximately 10,000 H100s, but the company faced issues obtaining sufficient H20 GPUs.
Read at Techzine Global