Nvidia Will Spend $26 Billion to Build Open-Weight AI Models, Filings Show
Briefly

"The sizable investment could see Nvidia evolve from a chipmaker with an impressive software stack into a bona fide frontier lab capable of competing with OpenAI and DeepSeek. It's a strategic move that could further entrench Nvidia's place as the AI world's leading chip manufacturer, since the models are tuned to the company's hardware."
"Open source models are ones where the weights, or the parameters that determine a model's behavior, are released publicly, sometimes with the details of its architecture and training. This allows anyone to download and run it on their own machine or the cloud. In Nvidia's case, the company also reveals the technical innovations involved in building and training its models, making it easier for startups and researchers to modify and build upon the company's innovations."
"On Wednesday, Nvidia also released Nemotron 3 Super, its most capable open-weight AI model to date. The new model has 128 billion parameters (a measure of the model's size and complexity), making it roughly equivalent to the largest version of OpenAI's GPT-OSS, though the company claims it outperforms GPT-OSS and other models across several benchmarks."
Nvidia announced a $26 billion investment over five years to develop open-weight artificial intelligence models, marking a strategic expansion beyond its core chipmaking business. The move positions Nvidia to compete directly with frontier AI labs such as OpenAI and DeepSeek. Open-weight models release their weights, the parameters that determine a model's behavior, publicly, allowing anyone to download and run them independently. Nvidia goes further by also disclosing the technical innovations behind its models, enabling startups and researchers to build on its work. Alongside the announcement, the company released Nemotron 3 Super, its most capable open-weight model to date, with 128 billion parameters, and claims it outperforms OpenAI's GPT-OSS and other models across several benchmarks. Nvidia says it used advanced architectural and training techniques to improve the models' reasoning and long-context handling.
Read at WIRED