The economics of CPU-based AI aren't great
Briefly

In tests with 4th-Gen Intel Xeon processors, Google found that CPUs can efficiently handle GenAI workloads, achieving acceptably low latencies for large language models.
Google measured a time per output token of 55 milliseconds for a 7B-parameter model on C3 VMs, demonstrating that CPUs can serve models of significant size.
The benchmarks also showed fine-tuning of the RoBERTa model completing in under 25 minutes on C3 instances, indicating that CPUs can handle fine-tuning as well as inference.
Despite these results, Google's aim was primarily to highlight the acceleration that AMX (Intel's Advanced Matrix Extensions) provides over older CPU generations, not to pit CPUs against GPUs.
Read at The Register