Gemini 3.5 Flash ships at I/O — and the Pro that didn't is the more telling story
Google made the cheaper, faster model its default everywhere. The flagship it announced is still not out.
The answer
Google launched Gemini 3.5 Flash at I/O 2026; Gemini 3.5 Pro was announced, not released.
At Google I/O on 19 May 2026, Google did two things in a single announcement: it launched Gemini 3.5 Flash into general availability across every major Google surface, and it announced Gemini 3.5 Pro without releasing it. The blog title — 'Gemini 3.5: frontier intelligence with action' — frames both as one story. They are not. Reading them as separate signals gives a much cleaner picture of what Google is actually doing with its model strategy.
What shipped: Flash, everywhere, on day one
Gemini 3.5 Flash went live as the default model for the consumer Gemini app and for AI Mode in Search (the live AI-powered answer layer in Google Search) globally, and shipped in the Gemini API — all simultaneously. That breadth of deployment on day one is a genuine competitive advantage. OpenAI can ship a model into the API; it cannot, the same day, make it the answer engine for the world's largest search product. Google's distribution moat — AI Mode in Search alone reportedly past a billion monthly users — is the real lever here, and it activated the moment Flash went live.
On the capability side, Google's own framing is that Flash beats Gemini 3.1 Pro on coding, agentic and multimodal benchmarks. A cited data point is 76.2% on Terminal-Bench 2.1. Terminal-Bench measures long-horizon coding tasks with real shell interaction — a genuinely hard benchmark, not a multiple-choice proxy. A 'Flash' model — the fast, affordable tier — clearing the bar of last generation's 'Pro' would represent a real compression of the capability-to-cost curve. The key caveat: these figures come from Google or Google-proximate third parties; independent replication is the next step before treating them as settled. The history of benchmarking in this industry does not reward uncritical acceptance.
Google introduced Gemini 3.5 Flash at I/O 2026 as a faster and cheaper model for AI agents and coding, positioning it as beating the previous Pro tier on key agentic and multimodal benchmarks.
The reach question matters as much as the capability question. When Google makes a model the default in AI Mode in Search, it is not A/B-testing a small slice of traffic — it is routing the majority of AI-powered search interactions through that model, globally. That is a product decision with enormous downstream effects on which vendor's model shapes how hundreds of millions of people first encounter AI-generated answers. Flash's performance in that context — latency, hallucination rate, refusal behaviour — will be more consequential than any benchmark number, and it will be measured by Google's own internal metrics, not by external evaluation. What developers and users should watch for is whether the quality of AI Mode answers noticeably improves or regresses over the coming months; that is the real-world test no lab benchmark can substitute.
What didn't ship: Pro, and what that tells you
Gemini 3.5 Pro was the other half of the keynote — Google's next flagship tier, aimed at the longer-horizon, more deliberate tasks Flash isn't built for. But Google did not release it. Its own words in the announcement: 3.5 Pro is 'already being used internally, and we look forward to rolling it out next month.' So as of launch it was an internal model with a stated 'next month' target, not something a developer or user could touch. The gap between announcement and availability is real, and slipping past 'next month' would carry credibility risk — but note what Google did not publish: no context-window figure, no benchmark numbers, no pricing for 3.5 Pro. The flagship is, for now, a name and a promise.
Why announce a model that isn't ready? The most charitable read is operational: I/O is Google's annual developer showcase, and leaving the Pro narrative to a low-key API changelog would waste the platform. The less charitable read is competitive: Google I/O runs against the OpenAI spring schedule; naming 3.5 Pro — even without releasing it — anchors the conversation and gives developers a roadmap that keeps them on Google's stack. Both can be true simultaneously. What matters for builders is the distinction: Flash is available now; Pro is a roadmap item with no hard date as of this writing.
Flash vs Pro vs last gen — the numbers
Set the new Flash and the announced Pro side by side — with the honest gaps where Google published nothing:
| Gemini 3.5 Flash (shipped) | Gemini 3.5 Pro (announced) | |
|---|---|---|
| Status | GA — default for Gemini app + AI Mode in Search | Internal only; 'rolling out next month' |
| Terminal-Bench 2.1 | 76.2% (Google's figure) | Not disclosed |
| GDPval-AA / MCP Atlas | 1656 Elo / 83.6% (Google) | Not disclosed |
| Context window | ~1M tokens (1,048,576 in / 65,536 out) | Not disclosed |
| API input/output | $1.50 / $9 per M (cached in $0.15/M) | Not announced |
| App + Search default | Yes (from 19 May) | No |
The table's first lesson: the only model you can actually use is Flash. Its second lesson is the blank right-hand column — Google announced 3.5 Pro with no benchmark, no context window and no price, which is exactly why it should be read as a roadmap signal, not a shipped capability.
3.5 Flash is now the default model for the Gemini app and AI Mode in Search globally — Google's strongest agentic and coding model yet, outperforming Gemini 3.1 Pro on benchmarks like Terminal-Bench 2.1 (76.2%).
What to watch next
Three things worth tracking from here. First, independent benchmark replication of Flash's claimed Terminal-Bench 2.1 score — Artificial Analysis and LMSYS are the usual arbiters; their numbers will tell you how much of the 76.2% is signal versus staging. Second, when and whether 3.5 Pro actually ships: Google said 'next month'; if that slips to Q3 or later, that is a meaningful signal about its production pipeline for the full model — and the first hard data (benchmarks, context window, price) will only land when Pro does. Third, developer pricing response: at $1.50/M in and $9/M out, output-heavy agentic apps on Flash will feel the bill; if enterprise builders migrate to cheaper alternatives, that will show up in third-party API usage reports within a quarter. Watch those three and you'll know far more about this announcement's durability than any benchmark table can tell you today.
Frequently asked questions
Did Google release Gemini 3.5 Pro?
Is Gemini 3.5 Flash really better than Gemini 3.1 Pro?
How much does Gemini 3.5 Flash cost?
Why did Google announce Pro without releasing it?
Which Gemini model should I be using now?
Sources
- Gemini 3.5: frontier intelligence with action — Google, 19 May 2026
- Google introduces Gemini 3.5 Flash at I/O 2026 — a faster, cheaper model for AI agents and coding — MarkTechPost, 20 May 2026
- Google Search's I/O 2026 updates: AI agents and more — Google, 19 May 2026
- 100 things we announced at Google I/O 2026 — Google, 19 May 2026