#mixture-of-experts

Artificial intelligence
from InfoQ
1 week ago

xAI Releases Grok Code Fast 1, a New Model for Agentic Coding

grok-code-fast-1 is an agentic coding model optimized for tool usage, high throughput, long context, and seamless integration with developer workflows.
Artificial intelligence
from Medium
1 week ago

Microsoft AI Unveils MAI-Voice-1 and MAI-1-Preview to Power the Next Generation of AI

Microsoft released MAI-Voice-1 and MAI-1-preview to deliver high-speed expressive speech and an internally built foundation model for improved instruction following and text responses.
from InfoWorld
1 week ago

Microsoft signals shift from OpenAI with launch of first in-house AI models for Copilot

According to Microsoft, MAI-1-preview uses an in-house mixture-of-experts model that was pre-trained and post-trained on 15,000 Nvidia H100 GPUs, a more modest setup than the clusters of roughly 100,000 H100s reportedly used by some rivals for model development. However, with an eye to ramping up performance, Microsoft AI is now running MAI-1-preview on Nvidia's more powerful GB200 cluster, the company said.
Artificial intelligence
Scala
from Hackernoon
7 months ago

SUTRA: Decoupling Concept & Language for Multilingual LLM Excellence | HackerNoon

SUTRA is a multilingual LLM that excels in understanding and generating text efficiently across 50+ languages.
from Thegreenplace
4 months ago

Sparsely-gated Mixture Of Experts (MoE)

The feed-forward layer in transformer models carries much of the per-token computation and often houses most of the model's weights due to its larger hidden dimensionality, which makes it the natural place to introduce sparsely-gated mixture-of-experts routing.
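
To make the idea concrete, here is a rough sketch of sparsely-gated top-k routing: several expert feed-forward blocks and a gate that sends each token to only a few of them. This is not code from the linked article; the sizes (d_model, n_experts, top_k) and names are illustrative assumptions.

```python
# Minimal sparsely-gated MoE sketch in NumPy (illustrative, assumed sizes).
import numpy as np

rng = np.random.default_rng(0)

d_model, d_hidden = 16, 64   # token width and expert hidden width (assumed)
n_experts, top_k = 4, 2      # number of experts and experts used per token

# Each expert is an ordinary two-layer feed-forward block.
W1 = rng.normal(0, 0.02, (n_experts, d_model, d_hidden))
W2 = rng.normal(0, 0.02, (n_experts, d_hidden, d_model))
# The gate is a linear layer producing one score per expert.
Wg = rng.normal(0, 0.02, (d_model, n_experts))

def moe_forward(x):
    """x: (n_tokens, d_model) -> (n_tokens, d_model)"""
    logits = x @ Wg                                  # (n_tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # indices of the k best experts per token
    # Softmax over only the selected experts' logits.
    sel = np.take_along_axis(logits, top, axis=-1)
    weights = np.exp(sel - sel.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)

    out = np.zeros_like(x)
    for e in range(n_experts):
        token_idx, slot = np.nonzero(top == e)       # tokens routed to expert e
        if token_idx.size == 0:
            continue
        h = np.maximum(x[token_idx] @ W1[e], 0.0)    # ReLU feed-forward expert
        out[token_idx] += weights[token_idx, slot][:, None] * (h @ W2[e])
    return out

tokens = rng.normal(size=(8, d_model))
print(moe_forward(tokens).shape)                     # (8, 16)
```

Because only the top-k experts run for each token, total parameter count grows with n_experts while per-token compute stays close to that of a single feed-forward block.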