Nvidia unveils new GPU designed for long-context inference | TechCrunch
Briefly

"At the AI Infrastructure Summit on Tuesday, Nvidia announced a new GPU called the Rubin CPX, designed for context windows larger than 1 million tokens."
"Part of the chip giant's forthcoming Rubin series, the CPX is optimized for processing large sequences of context and is meant to be used as part of a broader "disaggregated inference" infrastructure approach."
"For users, the result will be better performance on long-context tasks like video generation or software development."
"Nvidia's relentless development cycle has resulted in enormous profits for the company, which brought in $41.1 billion in data center sales in its most recent quarter."
"The Rubin CPX is slated to be available at the end of 2026."
Nvidia introduced the Rubin CPX GPU at the AI Infrastructure Summit, engineered to handle context windows larger than one million tokens. The CPX belongs to the upcoming Rubin series and focuses on processing very large sequences of context; it is intended to operate within a disaggregated inference infrastructure to improve scalability and efficiency. End users can expect improved performance on long-context applications such as video generation and software development. Nvidia's relentless development cycle has driven enormous profits: the company reported $41.1 billion in data center sales in its latest quarter. The Rubin CPX is slated for availability at the end of 2026.
Read at TechCrunch