Why you'll pay more for AI in 2026, and 3 money-saving tips to try
Briefly

"We're living in a token economy. Each piece of content -- words, images, sounds, etc. -- is treated by an AI model as an atomic unit of work called a token. When you type into a prompt in ChatGPT, and you receive a paragraph in response, or you call an API to do the same thing inside an app you've built, both the input and the output data are counted as tokens. As a result, the meter is always running when you use AI, racking up costs per token, and the total bill is set to go higher in aggregate."
"The most immediate reason for rising prices is the increasing cost -- incurred by OpenAI, Google, Anthropic, and other operators of AI services -- of building and running AI's underlying infrastructure. As their costs go higher, so must the price of AI. The highest cost is the DRAM memory chips used to ingest input tokens. To hold the tokens in memory and store them for later use requires an increasing amount of DRAM."
- Token usage determines AI costs: both input and output are counted as tokens.
- DRAM and HBM memory chips are the largest infrastructure expense for AI operators; a supply crunch and rising demand are pushing their prices higher year over year.
- More verbose chatbots consume more tokens, raising the bill for every interaction.
- Providers must raise prices as infrastructure costs rise; operators and developers are responding by building more efficient models and optimizing memory usage.
- Users can mitigate personal costs by prioritizing projects, reducing verbosity (see the sketch below), and using polite prompting to limit unnecessary token consumption.
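One way to put the verbosity tip into practice is to cap billable output tokens at the API level. A minimal sketch using the OpenAI Python SDK; the model name and the 120-token cap are placeholder assumptions, so check your provider's current parameters:

```python
# Minimal sketch: capping output tokens and trimming prompt verbosity
# to keep per-interaction costs down. Model name and limits are
# placeholder assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=[
        # A terse prompt spends fewer input tokens than a chatty one.
        {"role": "system", "content": "Answer in three sentences or fewer."},
        {"role": "user", "content": "Summarize why AI token prices are rising."},
    ],
    max_tokens=120,  # hard cap on billable output tokens
)

print(response.choices[0].message.content)
print("output tokens billed:", response.usage.completion_tokens)
```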
Read at ZDNET