MiniMax's new open M2.5 and M2.5 Lightning near state-of-the-art while costing 1/20th of Claude Opus 4.6


Chinese AI startup MiniMax, headquartered in Shanghai, shook up the AI industry today by releasing its new M2.5 language model in two variants, promising to make high-end artificial intelligence so affordable that you can stop worrying about the bill altogether.

MiniMax also describes the model as "open source," though the weights (the model's trained parameters) and code have not yet been posted, nor have the exact license type or terms. But that's almost beside the point, given how affordable the service MiniMax is offering through its API and partners.

For the past few years, using the world's most powerful AI was like hiring an expensive consultant – fantastic, but you were constantly watching the clock (and counting tokens). M2.5 changes that math, with MiniMax claiming cost reductions of up to 95%.

By delivering performance that rivals Google's and Anthropic's top-tier models at a fraction of the cost, especially in agentic tool use for enterprise tasks such as creating Microsoft Word, Excel, and PowerPoint files, MiniMax is betting that the future is not just about how smart a model is, but how often you can afford to use it.

To that end, MiniMax says it worked "with senior professionals in fields such as finance, law and social sciences" to ensure the model can actually perform to those industries' specifications and standards.

This release matters because it signals a shift from AI as a "chatbot" to AI as a "worker." When intelligence becomes too cheap to meter, developers stop building simple Q&A tools and start building "agents": software that can autonomously spend hours coding, researching, and organizing complex projects without breaking the bank.

In fact, MiniMax has already deployed the model in its own operations: the company says 30% of all tasks at its headquarters are now handled by M2.5, and 80% of its newly committed code is generated by the model.

As the MiniMax team writes in its release blog post, "We believe that M2.5 offers virtually unlimited possibilities for the development and operation of agents in the economy."

Technology: Sparse power and the success of CISPO

The secret of M2.5's efficiency lies in its Mixture-of-Experts (MoE) architecture. Instead of running all 230 billion of its parameters for every token it generates, the model activates only about 10 billion. This lets it retain the reasoning depth of a huge model while running with the agility of a much smaller one.
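For readers who want a concrete picture of that sparsity, the sketch below shows the general shape of a top-k MoE layer in PyTorch. It is an illustration only, not MiniMax's implementation; the layer sizes, expert count, and top-k value are placeholders, and a production system would add load balancing and fused kernels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: a router sends each token to a few experts,
    so only a small fraction of the layer's total parameters runs per token."""

    def __init__(self, d_model=512, d_hidden=2048, num_experts=32, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                                     # x: (num_tokens, d_model)
        scores = self.router(x)                               # (num_tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)    # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                  # tokens routed to expert e
                if mask.any():
                    w = weights[mask, slot].unsqueeze(-1)     # (n_routed, 1)
                    out[mask] += w * expert(x[mask])          # run only the routed tokens
        return out

# Example: 16 tokens flow through the layer, but each one touches only 2 of 32 experts.
layer = SparseMoELayer()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```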

To train this complex system, MiniMax developed a proprietary reinforcement learning (RL) framework called Forge. MiniMax engineer Olive Song said on a Thursday YouTube podcast that the technique helped boost performance even with a relatively small number of parameters, and that the model was trained over a two-month period.

Forge is designed to help models learn in "real-world environments," essentially giving the AI practice coding and using tools across thousands of simulated workspaces.

"We realized that such a small model has a lot of potential if we train reinforcement learning on it with a large amount of environments and agents," Geet said. "But this is not a very easy task to do," This is what they spent together "too much time" But.

To keep the model stable during this intensive training, the team used a mathematical approach called CISPO (Clipping Importance Sampling Policy Optimization) and shared the formula in its blog post.

The formula ensures the model does not overcorrect during training, allowing it to develop what MiniMax calls an "architect mindset": instead of writing code directly, M2.5 learned to plan a project's structure, features, and interface first.
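MiniMax has not released M2.5's training code, but the general trick a clipped importance-sampling objective describes can be sketched in a few lines. The snippet below is an illustrative sketch of that technique, not MiniMax's published formula; the clipping thresholds and tensor shapes are assumptions.

```python
import torch

def cispo_style_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
    """Illustrative clipped importance-sampling policy loss (NOT MiniMax's published code).

    PPO-style clipping silently zeroes the gradient for tokens whose update ratio drifts
    too far; a CISPO-style objective instead clips and detaches the importance-sampling
    weight, so every token keeps contributing a bounded gradient -- the "don't
    overcorrect" behavior described above.

    Args (placeholder shapes): 1-D tensors of per-token log-probs and advantages.
    """
    ratio = torch.exp(logp_new - logp_old)                       # importance-sampling weight
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high)  # bound the correction size
    return -(clipped.detach() * advantages * logp_new).mean()    # REINFORCE-style update
```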

State-of-the-art (and near-state-of-the-art) benchmarks

The results of this architecture show up on the latest industry leaderboards. M2.5 does not take the overall crown, but it has reached the top tier of coding models, coming close to Anthropic's latest model, Claude Opus 4.6, released just a week ago, and suggesting Chinese companies now trail US labs with superior GPU resources by a matter of days.

Here are some of the new MiniMax M2.5 benchmark highlights:

  • SWE-Bench Verified: 80.2% – Matches Claude Opus 4.6.

  • BrowseComp: 76.3% – Industry-leading search and tool usage.

  • Multi-SWE-Bench: 51.3% – SOTA in multi-language coding

  • BFCL (Tool Calling): 76.8% – High-precision agentic workflows.

On the Thursday podcast, host Alex Volkov explained that MiniMax M2.5 runs much faster and uses fewer tokens to complete tasks, working out to roughly $0.15 per task compared to $3.00 for Claude Opus 4.6.

Breaking the cost barrier

MiniMax is offering two versions of the model through its API, both aimed at high-volume production use:

  • M2.5-Lightning: Optimized for speed, delivering 100 tokens per second. It costs $0.30 per 1M input tokens and $2.40 per 1M output tokens.

  • Standard M2.5: Optimized for cost, running at 50 tokens per second. It costs half that of the Lightning version ($0.15 per 1M input tokens / $1.20 per 1M output tokens).
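Assuming MiniMax exposes an OpenAI-compatible chat endpoint, calling either variant from Python would look roughly like the sketch below. The base URL and model identifiers are placeholders, not values confirmed by MiniMax.

```python
from openai import OpenAI  # pip install openai

# Hypothetical credentials and endpoint -- substitute MiniMax's real values once published.
client = OpenAI(
    api_key="YOUR_MINIMAX_API_KEY",
    base_url="https://example-minimax-api.invalid/v1",
)

response = client.chat.completions.create(
    model="minimax-m2.5-lightning",  # placeholder ID; use "minimax-m2.5" for the standard tier
    messages=[
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": "Draft a project plan, then scaffold the repo."},
    ],
)
print(response.choices[0].message.content)
```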

In plain terms: MiniMax claims you can run four "agents" (AI workers) continuously for a full year for about $10,000.
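That figure holds up to a back-of-the-envelope check, assuming the agents stream output non-stop at the standard M2.5 rate of 50 tokens per second (a simplifying assumption; real workloads mix input, output, and idle time differently):

```python
# Back-of-the-envelope check of the "four agents for ~$10,000 a year" claim.
# Assumes each agent streams output continuously at the standard M2.5 rate; input-token
# costs and idle time are ignored, so treat this as a rough sanity check only.
TOKENS_PER_SECOND = 50          # standard M2.5 throughput (per the pricing above)
OUTPUT_PRICE_PER_M = 1.20       # USD per 1M output tokens
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

tokens_per_agent = TOKENS_PER_SECOND * SECONDS_PER_YEAR              # ~1.58B tokens
cost_per_agent = tokens_per_agent / 1_000_000 * OUTPUT_PRICE_PER_M   # ~$1,892
print(f"Output tokens per agent per year: {tokens_per_agent:,}")
print(f"Output cost per agent per year:   ${cost_per_agent:,.0f}")
print(f"Four agents, output tokens only:  ${4 * cost_per_agent:,.0f}")  # ~$7,569
# Input tokens at $0.15/1M add to this, keeping the total in the ~$10,000 ballpark.
```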

For enterprise users, this price is about 1/10th to 1/20th the cost of competing proprietary models like GPT-5 or Claude Opus 4.6.

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Total | Provider |
|---|---|---|---|---|
| Qwen 3 Turbo | $0.05 | $0.20 | $0.25 | Alibaba Cloud |
| DeepSeek-Chat (V3.2-Exp) | $0.28 | $0.42 | $0.70 | DeepSeek |
| DeepSeek-Reasoner (V3.2-Exp) | $0.28 | $0.42 | $0.70 | DeepSeek |
| Grok 4.1 Fast (reasoning) | $0.20 | $0.50 | $0.70 | xAI |
| Grok 4.1 Fast (non-reasoning) | $0.20 | $0.50 | $0.70 | xAI |
| MiniMax M2.5 | $0.15 | $1.20 | $1.35 | MiniMax |
| MiniMax M2.5-Lightning | $0.30 | $2.40 | $2.70 | MiniMax |
| Gemini 3 Flash Preview | $0.50 | $3.00 | $3.50 | Google |
| Kimi K2.5 | $0.60 | $3.00 | $3.60 | Moonshot AI |
| GLM-5 | $1.00 | $3.20 | $4.20 | Z.ai |
| Ernie 5.0 | $0.85 | $3.40 | $4.25 | Baidu |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 | Anthropic |
| Qwen3-Max (2026-01-23) | $1.20 | $6.00 | $7.20 | Alibaba Cloud |
| Gemini 3 Pro (≤200K) | $2.00 | $12.00 | $14.00 | Google |
| GPT-5.2 | $1.75 | $14.00 | $15.75 | OpenAI |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 | Anthropic |
| Gemini 3 Pro (>200K) | $4.00 | $18.00 | $22.00 | Google |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 | Anthropic |
| GPT-5.2 Pro | $21.00 | $168.00 | $189.00 | OpenAI |

Strategic implications for enterprises and leaders

For technology leaders, M2.5 represents much more than a cheap API. This changes the operational playbook for enterprises right now.

The pressure to downgrade to cheaper, weaker models just to save money is gone. You can now deploy high-context, high-reasoning models for routine tasks that were previously cost-prohibitive.

MiniMax also claims a 37% improvement in end-to-end task completion speed, which means "agentic" pipelines run by AI orchestrators, where models talk to other models, finally move fast enough for real-time user applications.

Furthermore, M2.5's high score in financial modeling (74.4% on MEWC) suggests it can handle the "tacit knowledge" of specific industries like law and finance with minimal oversight.

Because M2.5 is positioned as an open-source model, organizations could potentially run deep, automated code audits at a scale previously impossible without heavy human intervention, while keeping tighter control over data privacy. Until the license terms and weights are actually posted, however, that remains a promise rather than a guarantee.

MiniMax M2.5 is a sign that the limits of AI are no longer just about who can build the biggest brain, but who can make that brain the most useful—and cost-effective—worker in the room.


