OpenAI swaps Nvidia for Cerebras with GPT-5.3-Codex-Spark
GPT-5.3-Codex-Spark is a Cerebras-optimized, low-latency coding model that generates over 1,000 tokens/sec, enabling immediate, real-time code adjustments for developers.
A new version of OpenAI's Codex is powered by a new dedicated chip | TechCrunch
OpenAI released GPT-5.3-Codex-Spark, a lightweight, low-latency Codex model using Cerebras WSE-3 hardware to accelerate inference and enable real-time collaboration and rapid prototyping.
OpenAI signs deal, reportedly worth $10 billion, for compute from Cerebras | TechCrunch
OpenAI signed a multi-year agreement with Cerebras for 750 megawatts of compute through 2028 to accelerate low-latency inference and speed customer-facing responses.
Shifting AI processing onto devices or the edge enables faster, lower-latency imaging, reduced bandwidth use, and improved privacy for real-time analytics and alerts.
Nvidia's Jetson Thor provides local, low-latency compute for edge AI reasoning in physical robots, though real-world deployment depends on partner implementations.