IBM attributes these improvements over larger models to its hybrid architecture, which combines a small number of standard transformer-style attention layers with a majority of Mamba layers, specifically Mamba-2. With 9 Mamba blocks for every transformer block, Granite gets linear scaling with context length in the Mamba layers (versus quadratic scaling in transformers), while the interleaved transformer attention preserves the contextual dependencies that matter for in-context learning and few-shot prompting.
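To make the ratio and the scaling claim concrete, here is a minimal sketch (not IBM's code; the function names and the toy cost model are our own) of the 9-to-1 layer interleaving and of why the Mamba layers scale linearly with context length while attention scales quadratically:

```python
# Sketch of the hybrid layout described for Granite: 9 Mamba blocks
# followed by 1 attention block, repeated. Names are illustrative.
def hybrid_layer_plan(num_groups: int, mamba_per_attention: int = 9) -> list[str]:
    """Return the layer sequence as a list of block types."""
    layers: list[str] = []
    for _ in range(num_groups):
        layers.extend(["mamba"] * mamba_per_attention)
        layers.append("attention")
    return layers

def attention_cost(seq_len: int) -> int:
    # Self-attention compares every token with every other token: O(n^2).
    return seq_len * seq_len

def mamba_cost(seq_len: int) -> int:
    # A state-space (Mamba) layer scans the sequence once: O(n).
    return seq_len

plan = hybrid_layer_plan(num_groups=4)
print(plan.count("mamba"), plan.count("attention"))  # 36 4
print(attention_cost(8192) // mamba_cost(8192))      # 8192
```

At an 8K context, the toy model shows each attention block doing roughly 8,000 times the token-pair work of a Mamba block, which is why keeping attention to one block in ten pays off at long context lengths.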
The acquisition should help organizations build, train, and deploy domain-specific AI models and Small Language Models (SLMs) within their own infrastructure. NeuralFabric's technology is primarily meant to give the new AI Canvas an even more solid foundation. The future of AI lies at least as much in small models as in large ones: to make AI truly useful within organizations, we don't need another generic model, but rather more specialized models and SLMs.
Andrej Karpathy, a former OpenAI researcher and former director of AI at Tesla, calls his latest project the "best ChatGPT $100 can buy." The open-source project, called "nanochat" and released yesterday through his AI education startup Eureka Labs, shows how anyone with a single GPU server and about $100 can build a mini-ChatGPT of their own that answers simple questions and writes stories and poems.
During the Snapdragon Summit on Maui, Qualcomm CEO Cristiano Amon offered a glimpse of where the (mobile) ecosystem his company supplies with chips is heading. Qualcomm envisions a future in which AI moves from the cloud to your devices, taking care of everything for you in every possible way. Qualcomm invited us to attend the Snapdragon Summit, where two new chips were presented: a smartphone chip and a compute chip, the latter intended primarily for laptops and mini PCs.