Open-weight models

MiniMax M3: the open-weight frontier, with an asterisk on 'open'

A genuinely ambitious model — frontier coding, 1M context, multimodal — shipped before its weights did. Here is what that means.

WireRead Editorial1 June 2026Verified June 2026

MiniMax Open-weight models Model launches

The answer

MiniMax M3, launched 1 June 2026, is an open-weight frontier coding model with 1M context.

MiniMax's pitch for M3 is the kind of sentence that stops a scroll: the first open-weight model to put frontier-grade coding, a million-token context window and native image-and-video understanding into a single architecture, at a fraction of the price of the closed frontier. If it holds up, it compresses years of US lab investment into something that anyone can, in principle, run themselves. The catch — and it is a genuine one — is that on 1 June 2026, the day it launched, you couldn't.

The architecture and what it claims to do

The technical spine is MiniMax Sparse Attention (MSA), the company's proprietary attention mechanism, which MiniMax says delivers meaningful speed-ups for both prefill and decoding at very long contexts. That matters practically: a 1M-token context window is only useful if retrieving from it doesn't cost you three seconds of wall-clock latency per call. On agentic workloads — where a coding agent is reading an entire repository and writing back patches — that efficiency gap is often the difference between a tool people actually use and one they benchmark and shelve.

The claimed multimodality covers images and video natively, not via a bolt-on vision encoder after the fact, which is MiniMax's stated design choice — architecture, not a feature flag.

On coding specifically, MiniMax reports 59.0% on SWE-Bench Pro, a benchmark that asks models to write real patches for real GitHub issues. That figure, if it holds, would place M3 ahead of both GPT-5.5 and Gemini 3.1 Pro on that test. But here is the first asterisk: the number is MiniMax's own, run on MiniMax's infrastructure, on a model whose weights the public did not have. SWE-Bench Pro results are not reproducible without the weights, so the figure is a claim, not a finding — a distinction that matters when you're building a system atop it.

MiniMax M3 is billed as the first open-weight model to combine frontier coding, a 1M-token context window and native multimodality — though the weights were not published at launch and the headline benchmarks are vendor-reported.

Source: Tech Times · 1 June 2026

The open-weight IOU

The bigger issue is structural. On 1 June, MiniMax offered the model through its API and on OpenRouter. What it did not offer was the weights — the files you'd need to run the model yourself, audit its behaviour, fine-tune it for your use case, or reproduce the benchmark numbers. Those were promised on Hugging Face within about ten days of launch.

'Open-weight' has a clear community meaning: the weights are available, you can self-host, and you can verify the lab's claims for yourself. On launch day, M3 didn't meet that bar. The honest framing is that MiniMax gave a dated public commitment rather than an open release. That is a different thing, and conflating the two sets a bad precedent — both for builders making stack decisions and for the broader ecosystem's trust in the open-source label.

Price: the part that actually changes builder behaviour

Set aside the open-weight question for a moment, and the pricing story is real regardless. The launch listing on OpenRouter placed M3 at roughly $0.30 input / $1.20 output per million tokens — a promotional rate, but one that undercuts the closed frontier by close to an order of magnitude. For context:

Model	Input ($/M tokens)	Output ($/M tokens)
MiniMax M3 (promo)	~$0.30	~$1.20
GPT-5.5	~$2.50–$5.00	~$10.00–$15.00
Claude Opus 4.x	~$3.00–$5.00	~$15.00
Gemini 3.1 Pro	~$1.25	~$5.00

The figures above use approximate published rates at the time of M3's launch; exact pricing varies by tier and usage. The point is the magnitude: if M3's coding performance holds at independent testing, a team running 100M tokens a month sees the bill fall from several thousand dollars to a few hundred. That doesn't just save money; it removes the incentive to gate certain requests or optimise aggressively for token count — a productivity change as much as a cost one.

The capability gap worth keeping in frame

There is a number in the M3 release that the press release did not headline, and it matters: M3 reportedly scores under 12% on ARC-AGI-2, the abstract-reasoning benchmark where Western frontier models still lead. That is not unusual for Chinese frontier models — Qwen 3.x and DeepSeek V4 Pro show a similar profile — and it is not necessarily disqualifying for coding work, which is more about pattern application and structured generation than novel abstract reasoning. But it is a real constraint, and a user planning to deploy M3 for open-ended research synthesis or complex multi-step planning (rather than pure code tasks) should weight it accordingly.

Read together, M3's profile is: strong on coding and long-context multimodal retrieval, competitive on cost, behind on raw reasoning — a coherent specialisation, not a universal crown. MiniMax's Hong Kong shares appeared to process the same ambiguity on the day: the stock reportedly swung up around 5% before closing sharply lower, which is market shorthand for 'exciting but unresolved'.

MiniMax M3 launches with frontier coding claims and a 1M context window built on MiniMax Sparse Attention — offering a low-cost API alternative while weights remain pending on Hugging Face.

Source: Apidog · 2 June 2026

What to watch and what to do now

The ten-day weight window is the first gate. If the weights land on schedule, the benchmark conversation immediately changes — independent engineers can reproduce the SWE-Bench Pro run, and the open-source ecosystem can begin fine-tuning and deployment work in earnest. If the weights slip, the open-weight marketing claim will take a credibility hit that will be hard to walk back.

For builders, the calculus is: the API is live, the pricing is real and disruptive, and the risk of building on it before the weights land is modest if your workload is standard coding or document retrieval. The risk is higher if you need to audit the model, customise weights, or stake a compliance argument on self-hosting. The sensible move is to test it on the API now and gate any production commitment on the weight release and independent benchmark confirmation.

Frequently asked questions

Can I download and run MiniMax M3 myself?

Not at launch — on 1 June 2026 only the API was live, with MiniMax saying the weights would reach Hugging Face within about ten days. Until then the 'open-weight' label is a stated commitment rather than something you could verify by self-hosting or reproducing the benchmark numbers.

Is MiniMax M3 really better than GPT-5.5 at coding?

MiniMax reports 59.0% on SWE-Bench Pro, ahead of GPT-5.5 and Gemini 3.1 Pro — but that figure is vendor-run on a model whose weights weren't out, so it was not independently verifiable at launch. It's a strong claim pending confirmation.

What is MiniMax Sparse Attention and why does it matter?

It's MiniMax's proprietary architecture for handling very long context windows at speed — the company says it improves prefill and decoding performance at the 1M-token scale that would otherwise make long-context retrieval impractically slow for real agentic work.

Why did MiniMax's stock price swing on launch day?

MiniMax's Hong Kong shares reportedly rose about 5% before closing sharply lower the same day — a pattern that reflects the market simultaneously pricing in the frontier ambition and the unresolved questions about weights and benchmark verification.

What is MiniMax M3 weak at?

Abstract reasoning: M3 reportedly scores under 12% on ARC-AGI-2, per MiniMax's own data. That's a real gap versus US frontier models and limits M3's utility for open-ended research or complex multi-step planning, even if coding and long-context retrieval are strong.

Sources

MiniMax M3 Open-Weight Coding Model: Frontier Claims, Unverified Benchmarks — Tech Times, 1 June 2026
MiniMax launches M3, an open-weight frontier model with 1M context — DataNorth, 1 June 2026
What Is MiniMax M3? The First Open-Weight Frontier Coding Model — Apidog, 2 June 2026

← All news

The architecture and what it claims to do

The claimed multimodality covers images and video natively, not via a bolt-on vision encoder after the fact, which is MiniMax's stated design choice — architecture, not a feature flag.

Source: Tech Times · 1 June 2026

The open-weight IOU

Price: the part that actually changes builder behaviour

Model	Input ($/M tokens)	Output ($/M tokens)
MiniMax M3 (promo)	~$0.30	~$1.20
GPT-5.5	~$2.50–$5.00	~$10.00–$15.00
Claude Opus 4.x	~$3.00–$5.00	~$15.00
Gemini 3.1 Pro	~$1.25	~$5.00

The capability gap worth keeping in frame

MiniMax M3 launches with frontier coding claims and a 1M context window built on MiniMax Sparse Attention — offering a low-cost API alternative while weights remain pending on Hugging Face.

Source: Apidog · 2 June 2026

What to watch and what to do now

Frequently asked questions

Can I download and run MiniMax M3 myself?

Is MiniMax M3 really better than GPT-5.5 at coding?

What is MiniMax Sparse Attention and why does it matter?

Why did MiniMax's stock price swing on launch day?

What is MiniMax M3 weak at?

MiniMax M3: the open-weight frontier, with an asterisk on 'open'

The architecture and what it claims to do

The open-weight IOU

Price: the part that actually changes builder behaviour

The capability gap worth keeping in frame

What to watch and what to do now

Frequently asked questions

Sources

Related

The 2026 open-weight surge, explained

DeepSeek V4: what the architecture shift actually signals

Qwen and the rise of open-source AI from China

MiniMax M3: the open-weight frontier, with an asterisk on 'open'

The architecture and what it claims to do

The open-weight IOU

Price: the part that actually changes builder behaviour

The capability gap worth keeping in frame

What to watch and what to do now

Frequently asked questions

Sources

Related

The 2026 open-weight surge, explained

DeepSeek V4: what the architecture shift actually signals

Qwen and the rise of open-source AI from China