IBM attributes those improved characteristics compared with larger models to its hybrid architecture, which combines a small number of standard transformer-style attention layers with a majority of Mamba layers (more specifically, Mamba-2). With nine Mamba blocks for every transformer block, Granite gets linear scaling with context length for the Mamba portion (versus quadratic scaling in transformers), plus the local contextual dependencies captured by transformer attention (important for in-context learning and few-shot prompting).
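To make the 9:1 interleaving concrete, here is a minimal PyTorch sketch of such a hybrid stack. This is an illustration only, not IBM's implementation: the class names, dimensions, and the simplified MambaBlock (a depthwise causal convolution plus gating standing in for the real Mamba-2 state-space recurrence) are assumptions; only the nine-to-one ratio comes from the description above.

```python
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    """Standard self-attention block: cost grows quadratically with sequence length."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out

class MambaBlock(nn.Module):
    """Hypothetical stand-in for a Mamba-2 block: cost grows linearly with sequence
    length. A depthwise causal convolution plus gating replaces the actual selective
    state-space recurrence for brevity."""
    def __init__(self, d_model: int, kernel_size: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size - 1, groups=d_model)
        self.gate = nn.Linear(d_model, d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        h = self.norm(x)
        # Causal depthwise conv: pad left, then trim back to the original length.
        c = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + self.proj(c * torch.sigmoid(self.gate(h)))

class HybridStack(nn.Module):
    """Repeat groups of nine Mamba blocks followed by one attention block."""
    def __init__(self, d_model: int, n_groups: int):
        super().__init__()
        blocks = []
        for _ in range(n_groups):
            blocks += [MambaBlock(d_model) for _ in range(9)]
            blocks.append(AttentionBlock(d_model))
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

# Example: a 40-layer stack (4 groups of 9 Mamba blocks + 1 attention block).
model = HybridStack(d_model=512, n_groups=4)
x = torch.randn(1, 128, 512)   # (batch, sequence length, hidden size)
print(model(x).shape)          # torch.Size([1, 128, 512])
```

The point of the sketch is the layering pattern: most of the depth is linear-time Mamba blocks, with an occasional attention block inserted to recover precise, position-sensitive local context.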
The acquisition should help organizations build, train, and deploy specialized AI models and Small Language Models (SLMs) within their own infrastructure. Above all, NeuralFabric's technology should give the new AI Canvas an even more solid foundation. The future of AI models lies at least as much in small models as in large ones: to make AI truly useful within organizations, we don't need another generic model, but rather more specialized models and SLMs.
Andrej Karpathy, a former OpenAI researcher and former director of AI at Tesla, calls his latest project the "best ChatGPT $100 can buy." The open-source project, called "nanochat" and released yesterday through his AI education startup Eureka Labs, shows how anyone with a single GPU server and about $100 can build their own mini-ChatGPT that can answer simple questions and write stories and poems.