AI is getting expensive, but relief is on the way - just not for you

Generative AI apps and services face rising costs as demand for model inference strains infrastructure. New GPUs and AI accelerators are being developed to relieve that pressure, but users may not see immediate savings. Code assistants such as Claude Code, Codex, and GitHub Copilot represent major real-world adoption beyond chatbots and image generation. The datacenters built to train these models were not designed to serve them at current inference volumes, because inference and training are very different workloads. Hardware vendors are rearchitecting systems to lower cost per token, including Nvidia’s acquisition of Groq and similar efforts at AMD, AWS, Intel, and Google. Cheaper tokens could improve margins and help AI companies move toward profitability, but the new systems are not expected to be widely deployed until early to mid 2027.

"Generative AI apps and services are getting more expensive by the day as model devs grapple with surging infrastructure costs. A new generation of GPUs and AI accelerators promises relief from rising inference demand, but you won't see the savings. After years and billions spent building bigger and better models, the great AI houses are beginning to find tangible use cases for the technology beyond chatbots and image generators."

"But success is a double-edged sword. The bit barns built with borrowed money to train the Sonnets, GPTs, and Geminis at the heart of these apps and services were never meant to serve them at this scale. Inference and training are very different beasts. Those selling the shovels of the AI boom are now racing to bring new hardware better suited to serving these models."

"Cheaper tokens mean better inference economics, higher margins, and the venture capitalists fanning the flames hope that OpenAI, Anthropic, and all the others might actually drag themselves out of the red one day. Your AI addiction is their opportunity. There's just one little problem. All that AI-optimized hardware isn't quite ready yet."

"Much of it is promised for the second half of this year, but it takes time to work out the kinks and ramp supply chains, which means the bulk of these new systems won't have widespread deployments until early to mid 2027. But here lies a fleeting opportunity for the flag-bearers to see how addictive their produc"

#generative-ai #machine-learning-infrastructure #gpu-and-ai-accelerators #inference-economics #cost-per-token

Read at theregister


