Google intros EmbeddingGemma for on-device AI
Briefly

"With the introduction of its EmbeddingGemma, Google is providing a multilingual text embedding model designed to run directly on mobile phones, laptops, and other edge devices for mobile-first generative AI. Unveiled September 4, EmbeddingGemma features a 308 million parameter design that enables developers to build applications using techniques such as RAG ( retrieval-augmented generation) and semantic search that will run directly on the targeted hardware, Google explained."
"Based on the Gemma 3 lightweight model architecture, EmbeddingGemma is trained on more than 100 languages and is small enough to run on fewer than 200MB of RAM with quantization. Customizable output dimensions are featured, ranging from 768 dimensions to 128 dimensions via Matryoshka representation and a 2K token context window. EmbeddingGemma empowers developers to build on-device, flexible, privacy-centric applications, according to Google."
EmbeddingGemma is a 308 million parameter, multilingual text embedding model designed to run directly on mobile phones, laptops, and other edge devices for mobile-first generative AI. The model is based on the lightweight Gemma 3 architecture and is trained on more than 100 languages. With quantization, EmbeddingGemma runs in less than 200MB of RAM. It offers customizable output dimensions from 768 down to 128 via Matryoshka representation, along with a 2K token context window. Model weights are available on Hugging Face, Kaggle, and Vertex AI. EmbeddingGemma supports on-device RAG and semantic search, and integrates with many tools and frameworks.
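
To illustrate how a developer might put the model to work for on-device semantic search, here is a minimal sketch in Python. It assumes the weights are published on Hugging Face under the id google/embeddinggemma-300m and that they load through the sentence-transformers library, whose truncate_dim option provides the Matryoshka-style dimension truncation the article describes; neither detail is confirmed by the article itself.

    # Minimal on-device semantic search sketch with EmbeddingGemma.
    # Assumption: the model id "google/embeddinggemma-300m" and the
    # sentence-transformers loading path are illustrative, not
    # confirmed by the article.
    from sentence_transformers import SentenceTransformer, util

    # truncate_dim=128 keeps the first 128 of the 768 output dimensions
    # (Matryoshka representation), shrinking the embedding index
    # footprint on memory-constrained edge hardware.
    model = SentenceTransformer("google/embeddinggemma-300m",
                                truncate_dim=128)

    docs = [
        "EmbeddingGemma runs on phones, laptops, and other edge devices.",
        "With quantization the model fits in under 200MB of RAM.",
    ]
    query = "Which devices can run the model?"

    doc_emb = model.encode(docs)     # shape: (2, 128)
    query_emb = model.encode(query)  # shape: (128,)

    # Rank documents by cosine similarity; the top hit would feed a
    # RAG pipeline or be returned as a search result.
    scores = util.cos_sim(query_emb, doc_emb)
    print(docs[scores.argmax().item()])

Truncating to 128 dimensions cuts embedding storage roughly six-fold relative to the full 768, which is the point of the Matryoshka design on devices with tight memory budgets.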
Read at InfoWorld