
"Perhaps more notably, Mistral didn't just release an AI model, it released a new development app called Mistral Vibe. It's a command line interface (CLI) similar to Claude Code, OpenAI Codex, and Gemini CLI that lets developers interact with the Devstral models directly in their terminal. The tool can scan file structures and Git status to maintain context across an entire project, make changes across multiple files, and execute shell commands autonomously. Mistral released the CLI under the Apache 2.0 license."
"It's always wise to take AI benchmarks with a large grain of salt, but we've heard from employees of the big AI companies that they pay very close attention to how well models do on SWE-bench Verified, which presents AI models with 500 real software engineering problems pulled from GitHub issues in popular Python repositories. The AI must read the issue description, navigate the codebase, and generate a working patch that passes unit tests."
Devstral 2 is a 123-billion-parameter open-weights coding model built to operate as part of an autonomous software engineering agent; it scores 72.2 percent on SWE-bench Verified, placing it among the top-performing open-weights coding models. Mistral also launched Mistral Vibe, an Apache 2.0-licensed command-line app that gives developers terminal access to the Devstral models, builds project-wide context from the file structure and Git status, edits multiple files, and executes shell commands autonomously. The smaller Devstral Small 2 has 24 billion parameters, scores 68 percent, and can run locally without an internet connection; both models support a 256,000-token context window. SWE-bench Verified contains 500 real GitHub issues that require navigating a codebase and producing a patch that passes unit tests, though many of the tasks are relatively simple bug fixes.
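The pass criterion is concrete: the model's patch is applied to the repository and the issue's previously failing tests are rerun. Below is a simplified, assumption-laden sketch of that check; the real SWE-bench harness also resets the repo state, pins the environment, and runs regression (PASS_TO_PASS) tests, and the paths and test ids shown are placeholders.

```python
import subprocess

def patch_passes(repo_dir: str, patch_file: str, fail_to_pass: list[str]) -> bool:
    """Apply a model-generated patch and rerun the tests it is supposed to fix.

    Simplified sketch of a SWE-bench-style check, not the official harness.
    """
    # Apply the candidate patch; a rejected diff means the model's output is invalid.
    apply = subprocess.run(["git", "apply", patch_file], cwd=repo_dir)
    if apply.returncode != 0:
        return False

    # The patch only counts if the previously failing tests now pass.
    result = subprocess.run(["python", "-m", "pytest", *fail_to_pass], cwd=repo_dir)
    return result.returncode == 0

# Example call (placeholder paths and test ids):
# patch_passes("/tmp/some-repo", "model.patch", ["tests/test_io.py::test_header"])
```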
Read at Ars Technica