Google introduced several open-source tools at Cloud Next, a departure from its typically closed-source offerings. MaxDiffusion is a collection of reference implementations of diffusion models that run on XLA devices, while JetStream is aimed at improving the performance of text-generating AI models.
JetStream offers up to 3x higher "performance per dollar" for models like Google's Gemma 7B and Meta's Llama 2, with the goal of providing a cost-efficient inference stack for AI workloads. The tool is currently limited to TPUs, with GPU compatibility envisioned for the future.
JetStream targets cost-efficient AI inference, with optimizations for popular open models such as Llama 2 and Gemma. However, the "3x improvement" claim does not specify the comparison baseline or the TPU generation used for the calculation.