
On Tuesday, French AI startup Mistral AI released Devstral 2, a 123 billion parameter open-weights coding model designed to work as part of an autonomous software engineering agent. The model achieved a score of 72.2 percent on SWE-Bench Verified, a benchmark that attempts to test whether AI systems can solve real GitHub issues, placing it among the top-performing open-weights models.
Perhaps more notably, Mistral didn’t just release an AI model; it also released a new development app called Mistral Vibe. It is a command line interface (CLI) similar to Claude Code, OpenAI Codex, and Gemini CLI that lets developers interact with the Devstral models directly in their terminal. The tool can scan file structures and Git state to maintain context across a project, make changes to multiple files, and execute shell commands autonomously. Mistral released the CLI under the Apache 2.0 license.
It’s always wise to take AI benchmarks with a grain of salt, but we’ve heard from employees at big AI companies that they pay close attention to how well models perform on SWE-Bench Verified, which presents AI models with 500 real software engineering problems pulled from GitHub issues in popular Python repositories. The AI must read the problem description, navigate the codebase, and produce a working patch that passes unit tests. While some AI researchers have noted that about 90 percent of the tasks in the benchmark involve relatively simple bug fixes that experienced engineers can complete in under an hour, it is one of the few standardized ways to compare coding models.
Alongside the larger coding model, Mistral also released Devstral Small 2, a 24 billion parameter version that scores 68 percent on the same benchmark and can run locally on consumer hardware like a laptop without an Internet connection. Both models support a 256,000 token context window, which allows them to process moderately large codebases (although whether you consider that large or small depends on the overall complexity of the project). The company released Devstral 2 under a modified MIT license and Devstral Small 2 under the more permissive Apache 2.0 license.