# Gemini 3.5 Flash ships at I/O — and the Pro that didn't is the more telling story

> Google launched Gemini 3.5 Flash at I/O 2026; Gemini 3.5 Pro was announced, not released.

*Google made the cheaper, faster model its default everywhere. The flagship it announced is still not out.*

By WireRead Editorial · WireRead
Canonical: https://wireread.com/news/gemini-3-5-flash-ships-pro-delayed

At **Google I/O on 19 May 2026**, Google did two things in a single announcement: it launched **Gemini 3.5 Flash** into general availability across every major Google surface, and it announced **Gemini 3.5 Pro** without releasing it. The blog title, *'Gemini 3.5: frontier intelligence with action'*, frames the two as one story. They are two. Reading them as separate signals gives a much cleaner picture of Google's model strategy.

## What shipped: Flash, everywhere, on day one

**Gemini 3.5 Flash** went live as the **default model** for the consumer Gemini app and for **AI Mode in Search** (the live AI-powered answer layer in Google Search) globally, and shipped in the **Gemini API** — all simultaneously. That breadth of deployment on day one is a genuine competitive advantage. OpenAI can ship a model into the API; it cannot, the same day, make it the answer engine for the world's largest search product. Google's distribution moat — AI Mode in Search alone reportedly past a billion monthly users — is the real lever here, and it activated the moment Flash went live.

On the capability side, Google's own framing is that Flash **beats Gemini 3.1 Pro** on coding, agentic and multimodal benchmarks. A cited data point is **76.2% on Terminal-Bench 2.1**. Terminal-Bench measures long-horizon coding tasks with real shell interaction — a genuinely hard benchmark, not a multiple-choice proxy. A 'Flash' model — the fast, affordable tier — clearing the bar of last generation's 'Pro' would represent a real compression of the capability-to-cost curve. The key caveat: these figures come from Google or Google-proximate third parties; independent replication is the next step before treating them as settled. The history of benchmarking in this industry does not reward uncritical acceptance.

> Google introduced Gemini 3.5 Flash at I/O 2026 as a faster and cheaper model for AI agents and coding, positioning it as beating the previous Pro tier on key agentic and multimodal benchmarks.
> — [MarkTechPost](https://www.marktechpost.com/2026/05/20/google-introduces-gemini-3-5-flash-at-i-o-2026-a-faster-and-cheaper-model-for-ai-agents-and-coding/), 2026-05-20

The reach question matters as much as the capability question. When Google makes a model the default in AI Mode in Search, it is not A/B-testing a small slice of traffic — it is routing the majority of AI-powered search interactions through that model, globally. That is a product decision with enormous downstream effects on which vendor's model shapes how hundreds of millions of people first encounter AI-generated answers. Flash's performance in that context — latency, hallucination rate, refusal behaviour — will be more consequential than any benchmark number, and it will be measured by Google's own internal metrics, not by external evaluation. What developers and users should watch for is whether the quality of AI Mode answers noticeably improves or regresses over the coming months; that is the real-world test no lab benchmark can substitute.

> **Key:** **Mind the price.** Official API pricing for Flash is **$1.50/M input and $9/M output**, with cached input at **$0.15/M**. Google frames Flash as costing 'less than half' of other frontier models — a relative claim, not an absolute floor. 'Flash' has historically been Google's cheap-and-fast tier, so anyone who treated it as commodity-priced should re-check their unit economics against these published rates before building at scale: $9/M output in particular is a meaningful number for output-heavy agentic workloads.
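
To make the unit-economics check concrete, here is a minimal sketch of a per-call cost estimate using only the three rates quoted above. The `Call` type, the `call_cost` function and the example token counts are hypothetical, and the sketch assumes cached tokens are a subset of input tokens billed at the cached rate instead of the standard one. Confirm the billing rules against Google's pricing page before relying on it.

```python
from dataclasses import dataclass

# Rates quoted in this article, in USD per million tokens.
INPUT_PER_M = 1.50
CACHED_INPUT_PER_M = 0.15
OUTPUT_PER_M = 9.00


@dataclass
class Call:
    """Token counts for one request (hypothetical type for illustration)."""
    input_tokens: int             # all prompt tokens, cached or not
    output_tokens: int
    cached_input_tokens: int = 0  # assumed to be a subset of input_tokens


def call_cost(call: Call) -> float:
    """Estimated cost of one call in USD under the assumptions above."""
    if not 0 <= call.cached_input_tokens <= call.input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_input = call.input_tokens - call.cached_input_tokens
    total = (
        fresh_input * INPUT_PER_M
        + call.cached_input_tokens * CACHED_INPUT_PER_M
        + call.output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Illustrative agentic step: 20k tokens of context in, 4k tokens out.
uncached = Call(input_tokens=20_000, output_tokens=4_000)
cached = Call(input_tokens=20_000, output_tokens=4_000, cached_input_tokens=16_000)

print(f"no cache:   ${call_cost(uncached):.4f}")
print(f"16k cached: ${call_cost(cached):.4f}")
```

In the uncached example, output accounts for a sixth of the tokens but more than half of the estimated cost, which is why the $9/M output rate deserves the closest look for output-heavy workloads.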

## What didn't ship: Pro, and what that tells you

**Gemini 3.5 Pro** was the other half of the keynote: Google's next flagship tier, aimed at the longer-horizon, more deliberate tasks Flash isn't built for. But Google did not release it. Its own words in the announcement: 3.5 Pro is *'already being used internally, and we look forward to rolling it out next month.'* So as of launch it was an internal model with a stated 'next month' target, not something a developer or user could touch. The gap between announcement and availability is real, and slipping past 'next month' would carry credibility risk. Just as telling is what Google did *not* publish: no context-window figure, no benchmark numbers, no pricing for 3.5 Pro. The flagship is, for now, a name and a promise.

Why announce a model that isn't ready? The most charitable read is operational: I/O is Google's annual developer showcase, and leaving the Pro narrative to a low-key API changelog would waste the platform. The less charitable read is competitive: Google I/O runs against the OpenAI spring schedule; naming 3.5 Pro — even without releasing it — anchors the conversation and gives developers a roadmap that keeps them on Google's stack. Both can be true simultaneously. What matters for builders is the distinction: **Flash is available now; Pro is a roadmap item with no hard date as of this writing**.

## Flash vs Pro: the numbers

Set the new Flash and the announced Pro side by side — with the honest gaps where Google published nothing:

| | Gemini 3.5 Flash (shipped) | Gemini 3.5 Pro (announced) |
| --- | --- | --- |
| **Status** | GA — default for Gemini app + AI Mode in Search | Internal only; 'rolling out next month' |
| **Terminal-Bench 2.1** | 76.2% (Google's figure) | Not disclosed |
| **GDPval-AA / MCP Atlas** | 1656 Elo / 83.6% (Google) | Not disclosed |
| **Context window** | ~1M tokens (1,048,576 in / 65,536 out) | Not disclosed |
| **API price (input / output)** | $1.50 / $9 per M tokens (cached input $0.15/M) | Not announced |
| **App + Search default** | Yes (from 19 May) | No |

The table's first lesson: the only model you can actually use is Flash. Its second lesson is the blank right-hand column — Google announced 3.5 Pro with no benchmark, no context window and no price, which is exactly why it should be read as a roadmap signal, not a shipped capability.
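
For builders, the context-window row translates into a simple planning check. The sketch below uses only the two token limits from the table; the function names are hypothetical, token counts are assumed to come from whatever tokenizer or counting endpoint you already use, and it assumes the input and output limits are enforced independently.

```python
import math

# Limits listed for Gemini 3.5 Flash in the table above.
MAX_INPUT_TOKENS = 1_048_576   # 2**20
MAX_OUTPUT_TOKENS = 65_536     # 2**16


def fits_limits(input_tokens: int, max_output_tokens: int) -> bool:
    """True if a single request stays within both listed limits."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 < max_output_tokens <= MAX_OUTPUT_TOKENS
    )


def requests_needed(corpus_tokens: int, prompt_overhead_tokens: int = 0) -> int:
    """Minimum number of requests to pass a corpus through the input window.

    prompt_overhead_tokens is the space reserved in every request for
    instructions and other fixed context (an illustrative parameter).
    """
    usable = MAX_INPUT_TOKENS - prompt_overhead_tokens
    if usable <= 0:
        raise ValueError("prompt overhead leaves no room for content")
    return math.ceil(corpus_tokens / usable)


print(fits_limits(900_000, 8_000))
print(requests_needed(3_000_000, prompt_overhead_tokens=48_576))
```

No equivalent check can be written for 3.5 Pro yet, since Google has not disclosed its limits.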

> 3.5 Flash is now the default model for the Gemini app and AI Mode in Search globally — Google's strongest agentic and coding model yet, outperforming Gemini 3.1 Pro on benchmarks like Terminal-Bench 2.1 (76.2%).
> — [Google](https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/), 2026-05-19

## What to watch next

Three things worth tracking from here. First, **independent benchmark replication** of Flash's claimed Terminal-Bench 2.1 score — Artificial Analysis and LMSYS are the usual arbiters; their numbers will tell you how much of the 76.2% is signal versus staging. Second, **when and whether 3.5 Pro actually ships**: Google said 'next month'; if that slips to Q3 or later, that is a meaningful signal about its production pipeline for the full model — and the first hard data (benchmarks, context window, price) will only land when Pro does. Third, **developer pricing response**: at $1.50/M in and $9/M out, output-heavy agentic apps on Flash will feel the bill; if enterprise builders migrate to cheaper alternatives, that will show up in third-party API usage reports within a quarter. Watch those three and you'll know far more about this announcement's durability than any benchmark table can tell you today.

## Key takeaways

- Gemini 3.5 Flash launched at Google I/O 2026 (19 May) and immediately became the default model for the Gemini app and AI Mode in Search globally.
- Google says it beats Gemini 3.1 Pro on coding and agentic benchmarks — 76.2% on Terminal-Bench 2.1, 1656 Elo on GDPval-AA, 83.6% on MCP Atlas — but those are vendor-attributed figures awaiting independent replication.
- Official API pricing is $1.50/M input and $9/M output (cached input $0.15/M); Google frames Flash as 'less than half the cost of other frontier models'.
- Gemini 3.5 Pro was announced but not released; Google said it is 'already being used internally' and would roll out 'next month'. For now it is a roadmap item, not a shipped product.
- The two-part structure of the announcement — ship the fast model, promise the flagship — is a pattern worth understanding before you build on Google's AI stack.

## FAQ

### Did Google release Gemini 3.5 Pro?
No. Gemini 3.5 Pro was announced at I/O but not released. Google said it was 'already being used internally' and that it would roll it out 'next month'. Google published no benchmarks, context window or price for it. Only Gemini 3.5 Flash actually shipped to general availability.

### Is Gemini 3.5 Flash really better than Gemini 3.1 Pro?
Google says so, citing 76.2% on Terminal-Bench 2.1, 1656 Elo on GDPval-AA and 83.6% on MCP Atlas. Those figures are Google-attributed and have not yet been independently confirmed by third-party evaluators — treat them as credible claims under active verification.

### How much does Gemini 3.5 Flash cost?
Official API pricing is $1.50 per million input tokens, $9 per million output, and $0.15 per million cached input. Google frames Flash as costing 'less than half' of other frontier models, though it didn't publish a comparison to the prior Flash tier.

### Why did Google announce Pro without releasing it?
Google I/O is Google's annual developer conference, a natural moment to lay out the full roadmap even before every model is ready. Announcing Pro also anchors developer expectations on Google's trajectory rather than letting rivals fill the narrative vacuum. The risk is a slip: if 'next month' passes without a release, the early announcement costs Google credibility.

### Which Gemini model should I be using now?
For general use, Gemini 3.5 Flash: it is the default for the Gemini app and AI Mode in Search, and Google claims it outperforms the previous Gemini 3.1 Pro tier on key tasks. Gemini 3.5 Pro is not yet available. Check that the published pricing of $1.50 per million input tokens and $9 per million output tokens fits your budget before building at scale.

## Sources

- [Gemini 3.5: frontier intelligence with action](https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/) — Google, 2026-05-19
- [Google introduces Gemini 3.5 Flash at I/O 2026 — a faster, cheaper model for AI agents and coding](https://www.marktechpost.com/2026/05/20/google-introduces-gemini-3-5-flash-at-i-o-2026-a-faster-and-cheaper-model-for-ai-agents-and-coding/) — MarkTechPost, 2026-05-20
- [Google Search's I/O 2026 updates: AI agents and more](https://blog.google/products-and-platforms/products/search/search-io-2026/) — Google, 2026-05-19
- [100 things we announced at Google I/O 2026](https://blog.google/innovation-and-ai/technology/ai/google-io-2026-all-our-announcements/) — Google, 2026-05-19