
AI demand is increasing faster than datacenter inference capacity, driven by flat-rate subscriptions and long-running AI agent workloads. These workloads require substantial computing power, often more than OpenAI and its cloud partners can provide. Planned datacenter construction is unlikely to close the gap soon. The shortage has led to rationing through usage limits, variable pricing, subscription restrictions, and token consumption arbitrage, in which expensive models are swapped for cheaper ones. OpenAI's response is OpenAI Guaranteed Capacity, which gives eligible customers a framework to align forecasted demand, commercial commitments, and guaranteed shared capacity across supported cloud providers. Customers make annual spending commitments for terms of one to three years and receive discounts that scale with the term length.
"AI is in short supply as demand - stoked by flat-rate subscriptions - races ahead of datacenter inference capacity. AI workloads require a lot of computing power, especially when they run for hours on end, a common scenario for AI agents. They require more computing power than OpenAI and its cloud partners can provide."
"The result has been thinly disguised rationing through usage limits, variable pricing, subscription restrictions, and token consumption arbitrage - swapping out expensive models for cheaper ones. OpenAI's answer is a new offering meant to ensure it can deliver on its AI promises."
"'Today we announced OpenAI Guaranteed Capacity, a new offering that helps eligible customers plan for reliable access to OpenAI compute across supported cloud providers as they scale critical workflows,' said Sachin Katti, who runs compute for OpenAI, in a LinkedIn post. 'It gives customers a clearer framework to align forecasted demand, commercial commitments, and guaranteed shared capacity over time.'"
"OpenAI says customers can make annual spending commitments ranging from one to three years, with discounts that scale according to duration. 'Guaranteed Capacity includes certainty of access to compute'"
#ai-compute-capacity #machine-learning-inference #cloud-services #pricing-and-subscriptions #data-center-infrastructure
Read at The Register