# Mixture of Experts (MoE) Models

InfoQ · Artificial Intelligence · 8 months ago

Mistral AI's Open-Source Mixtral 8x7B Outperforms GPT-3.5

Mistral AI released Mixtral 8x7B, a large language model (LLM) that outperforms Llama 2 70B and GPT-3.5 on most benchmarks.
Mixtral 8x7B is a sparse mixture of experts (SMoE) model with 46.7B total parameters, but because only a fraction of those parameters are active for any given token, its inference speed and cost are comparable to a much smaller dense model.
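The speed/cost claim follows from how sparse MoE layers route tokens: a gating network picks a small subset of expert feed-forward blocks per token, so compute scales with the active experts rather than the total parameter count. The snippet below is a minimal illustrative sketch of top-2 routing in PyTorch (the layer sizes, module names, and routing loop are assumptions for illustration, not Mixtral's actual implementation):

```python
# Minimal sparse MoE sketch: each token is routed to top_k of num_experts
# feed-forward blocks, so per-token compute depends on the active experts,
# not on the total parameter count. Illustrative only, not Mixtral's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        logits = self.gate(x)                    # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e            # tokens sent to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 512)
print(SparseMoE()(tokens).shape)  # torch.Size([4, 512])
```

With top_k=2 of 8 experts, each token touches only the router plus two expert MLPs, which is why a model of this design can hold far more parameters than it spends compute on per token.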