
Google reported a major expansion in token processing capacity to meet internal and external AI inference demand. Monthly token volume rose from 9.7 trillion two years ago to 480 trillion last year, and now stands at 3.2 quadrillion. Google said 8.5 million developers build applications with the Gemini model family each month, consuming about 19 billion tokens per minute through API calls. More than 375 customers each consumed over 1 trillion tokens in the past 12 months, an indication of business demand. Google attributed this scale to large capital expenditures on datacenters, compute capacity, and TPU hardware, with annual capex rising from $31 billion in 2022 to an expected $180–190 billion this year. Demis Hassabis described Gemini Omni as a step toward AGI.
“Now some out there might call this tokenmaxxing and there's probably some truth to it,” said Pichai. “I still think it tells an important story about our products and how others are building as well, especially our developers.”
Pichai said over 8.5 million developers are building applications using Google's Gemini model family monthly, using about 19 billion tokens per minute in API calls. And over the past 12 months, more than 375 customers have consumed more than 1 trillion tokens each, an indication there's some demand for AI among businesses.
That token processing is possible because of the vast capital expenditures Google has made in datacenters and compute capacity, and TPU hardware. “Supporting all of this at scale for our users while also serving enterprises and developers around the world requires massive investments in infrastructure,” said Pichai.
“And we've been investing for today and for the future. In 2022, we were spending $31 billion annually in capex. This year, we expect that number to be about six times that, approximately 180 to 190 billion dollars.”
Read at The Register
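
As a rough sanity check on the figures quoted above, the growth multiples and the API share of total volume can be worked out in a few lines of Python. This is a sketch under stated assumptions, not Google's methodology: the 30-day month, the idea that the 19 billion tokens per minute rate holds constantly, and all variable and function names are assumptions introduced here for illustration.

```python
# Figures as quoted in the article. Names are illustrative, not Google's.
TRILLION = 10**12

tokens_two_years_ago = 9.7 * TRILLION   # tokens per month
tokens_last_year = 480 * TRILLION       # tokens per month
tokens_now = 3200 * TRILLION            # 3.2 quadrillion tokens per month


def growth_multiple(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old


# Assumption: a 30-day month and a constant per-minute API rate.
MINUTES_PER_MONTH = 60 * 24 * 30
api_tokens_per_minute = 19 * 10**9
api_tokens_per_month = api_tokens_per_minute * MINUTES_PER_MONTH
api_share_of_total = api_tokens_per_month / tokens_now

# Capex in billions of US dollars.
capex_2022 = 31
capex_expected_range = (180, 190)
capex_multiples = tuple(c / capex_2022 for c in capex_expected_range)

if __name__ == "__main__":
    print(f"Two years ago to last year: {growth_multiple(tokens_two_years_ago, tokens_last_year):.1f}x")
    print(f"Last year to now: {growth_multiple(tokens_last_year, tokens_now):.1f}x")
    print(f"API tokens per month (assumed constant rate): {api_tokens_per_month:,}")
    print(f"API share of monthly total: {api_share_of_total:.1%}")
    print(f"Capex multiple over 2022: {capex_multiples[0]:.2f}x to {capex_multiples[1]:.2f}x")
```

Under those assumptions, 19 billion tokens per minute works out to about 820.8 trillion tokens per month, roughly a quarter of the 3.2 quadrillion total. The capex figures are also consistent with the quote: six times $31 billion is $186 billion, which falls inside the stated $180–190 billion range.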