TECH

Google I/O 2026: Gemini 3.5, Omni, Spark and the agentic Google stack

165+

Signals

Strategic Overview

01.
At Google I/O 2026 on May 19, Google released Gemini 3.5 Flash, its strongest model yet for agents and coding, generally available the same day across Antigravity, the Gemini API, the Gemini app, and AI Mode in Search.
02.
Google introduced Gemini Spark, a 24/7 cloud-resident personal AI agent that keeps working when the laptop is closed, powered by Gemini 3.5 on the Antigravity harness, with native links to Gmail, Calendar, Drive, Docs, Slides, plus MCP connections to Canva, OpenTable and Instacart.
03.
Gemini Omni Flash, a natively multimodal video-generation and conversational-editing model, started rolling out the same day to Google AI Plus, Pro and Ultra subscribers, plus free access via YouTube Shorts and the YouTube Create app.
04.
Google restructured AI subscriptions with a new $100/month AI Ultra tier (5x Pro usage limits, YouTube Premium Lite, Spark access) and cut the existing top Ultra tier from $250 to $200, while launching Antigravity 2.0 with a standalone desktop app, CLI, SDK, and Managed Agents in the Gemini API.

The 'Flash beats Pro' inversion: why a small model now runs the agents

Google's headline trick at I/O 2026 isn't that Gemini 3.5 Flash exists — it's that Flash is now stronger than the prior flagship. DeepMind reports 3.5 Flash outperforming Gemini 3.1 Pro on 'nearly all the benchmarks,' including TerminalBench 2.1 at 76.2%, MCP Atlas at 83.6% and CharXiv reasoning at 84.2% ^[1]. The reason it matters is structural: in long-horizon agentic workflows you don't make one expensive model call, you make thousands of cheap ones. Tulsee Doshi laid out the new pattern explicitly: '3.5 Pro becomes your orchestrator, your planner, and then it actually can leverage Flash to be the various sub-agents' ^[2]. So Flash isn't a cheaper consolation prize — it's the worker that fans out under a planner, the unit a developer pays per token to run a hundred copies of in parallel.

The live demo made the math concrete. Google had agents on Antigravity 2.0 and Gemini 3.5 Flash build a working operating system from scratch: 12 hours of wall time, 93 parallel sub-agents, 15,000+ model requests, 2.6 billion tokens processed, less than $1,000 in API credits. That's not a chatbot demo. It's a budget line item for an entire engineering project. According to Artificial Analysis, Flash runs at around 280 tokens/second versus roughly 60-70 tokens/second for GPT-5.5 and Claude Opus 4.7 ^[3]. Combine 4x speed with a third of the price and the unit economics for long-horizon agents tilt sharply toward Google — every parallel sub-agent gets cheaper and finishes sooner. The strategic bet behind the relabeling is that the future profitable AI workload is not 'one user one chat' but 'one user a thousand background calls,' and Google is repricing the tier its competitors haven't repriced yet.

Spark and the redesigned Search box: Google folds the assistant back into Google

Gemini Spark and the AI-powered Search redesign are the consumer face of the same agentic bet. Spark is a cloud-resident personal agent — Google's blog explicitly states 'because it is a cloud-based agent, Spark keeps working in the background even when you close your laptop or lock your phone' ^[4]. It runs on dedicated Google Cloud VMs, integrates natively with Gmail, Calendar, Drive, Docs, Slides and Maps, and adds MCP connections to Canva, OpenTable and Instacart on day one. Sundar Pichai described it as 'your personal AI agent that helps you navigate your digital life, taking action on your behalf and under your direction' ^[5]. Crucially, Spark is gated to U.S. Google AI Ultra subscribers in beta — meaning the $100/month tier is the entry ticket to the persistent-agent experience, not just to higher token limits.

The Search box redesign is the other half of the move and arguably the bigger one for Google's existing business. AI Mode is now powered by Gemini 3.5 Flash by default and expands to nearly 200 countries and 98 languages without a subscription, with what Google calls the biggest Search update in 25 years — an intelligent box that resizes for complex queries, accepts video, image, file and Chrome-tab inputs, and embeds Search agents that can complete purchases, check tickets and manage schedules in real time ^[6]. Read in isolation, Spark looks like a competitive answer to other persistent-agent products. Read alongside Search, it looks like something different: Google is collapsing the boundary between 'asking Google something' and 'asking your agent to do something,' on a surface that already has billions of users.

Follow the money: 7x token growth, a $100 Ultra tier, and the agent-tier price war

Pichai disclosed two numbers that explain the pricing math. First, Google now processes more than 3.2 quadrillion tokens per month, up roughly seven-fold year over year ^[7]. Second, the Gemini app has gone from 400 million monthly active users at I/O 2025 to over 900 million at I/O 2026, more than doubling in twelve months. That kind of demand curve makes a price cut a customer-acquisition move rather than a concession. Google introduced a new $100/month AI Ultra tier (5x the usage of Pro, Gemini 3.5 Flash, priority Antigravity access, 20TB storage, YouTube Premium Lite, and Spark for U.S. users) and cut the previous $250 top tier to $200 with 20x Pro usage limits and Project Genie access ^[8].

That $100 anchor sits directly on top of Anthropic's flagship Claude tier and roughly half of ChatGPT Pro's $200/month price ^[9]. The deeper threat is bundling: AI Ultra at $100 includes YouTube Premium Lite — meaning Google can defray subscription revenue inside the same plan that competes with Claude and ChatGPT. Markets were unimpressed in the short run — Alphabet shares moved less than a percent in pre-market on the day of the announcement ^[9]— but the strategic read is that Google is willing to subsidize the agent tier from search and YouTube cash flows in a way Anthropic and OpenAI can't match. Behind it all is the agent-economy thesis Pichai stated bluntly: 'We're firmly in our agentic Gemini era' ^[7].

The Antigravity stack: how Google is rebuilding its developer surface around agents

Antigravity 2.0 is the developer counterpart to Spark and is structured as a deliberate, four-layer agentic stack rather than a single product. The top layer is a new standalone desktop app, separate from the prior Antigravity IDE, designed around orchestrating multiple agents and running tasks in parallel ^[10]. Below that sits the Antigravity CLI, a lightweight, scriptable surface that shares the same agent harness as Antigravity 2.0 — letting developers spin up agents from a terminal without the desktop UI. The SDK provides programmatic access to that same harness, optimized for Gemini models so teams can define custom agent behavior and host on their own infrastructure. The deepest layer is Managed Agents in the Gemini API: a single API call provisions an isolated Linux environment that 'reasons, uses tools, and executes code,' with state persisting across follow-up calls so multi-turn sessions resume cleanly ^[10].

Around that core, Google also pushed the developer experience into product-building. AI Studio gained the ability to generate native Android apps from a single prompt, producing production-quality Kotlin code using the latest Jetpack Compose pattern, with an in-browser emulator, USB install via ADB, and a path to Google Play Console internal testing — all without a local SDK ^[11]. The combined effect: Google is moving the unit of developer output from 'code I write' to 'agents that write and run my code in a Google-managed sandbox.' That is also the structural reason developer communities have reacted warily — one widely upvoted megathread comment in r/google framed Antigravity 2.0 as a 'sophisticated trap built on convenience' that pulls agents, hosting and data into a single stack.

What the privacy backlash gets right

The most engaged consumer-side reactions on Reddit were not the celebratory ones. The top r/Android thread, with hundreds of upvotes, captured the dominant sentiment: skepticism about a 24/7 background agent reading email and calendar by default, and frustration that opt-in defaults feel performative. The r/google megathread surfaced the structural concern that Gemini Intelligence stitches Chrome, Gmail, Calendar, Maps and Pay into 'one seamless, full-spectrum view' — describing it as 'surveillance-by-convenience.' These reactions are not fringe. They're the second-order shape of the same architecture that makes Spark useful: a persistent agent only earns its keep when it has standing context, which means standing access.

The contrarian signal worth taking seriously: this is the first I/O where the announcement quality of the consumer story (Spark, AI Search, Omni) and the developer story (Antigravity 2.0, Managed Agents, AI Studio Android) are tightly coupled to the same identity surface. SEO and marketing voices in the Reddit threads were already reframing visibility around AI citations rather than rank — the new currency is being the source Google's AI picks, not the link ranked first. For end users the question shifts from 'do I trust this app' to 'do I trust Google's agent to act on my behalf with full inbox context'; for builders the question shifts from 'which API do I call' to 'whose harness am I deploying inside.' Google's distribution makes the answer default-yes for hundreds of millions of users — which is exactly why the design choices around consent and data scope at this layer will outlast the keynote demos.

Historical Context

2025-05

Reported 400 million monthly active users on the Gemini app at I/O 2025 — the baseline that has now more than doubled to 900 million in a year.

2026-04

At Cloud Next 2026, Ironwood (TPU v7) became generally available and Google previewed the eighth-generation split: TPU 8t (Sunfish, Broadcom) for training and TPU 8i (Zebrafish, MediaTek) for inference at TSMC 2nm, targeting late 2027.

2026-05-19

Google I/O 2026 keynote announced Gemini 3.5 Flash, Omni, Spark, Antigravity 2.0, the AI Search redesign and the restructured AI Ultra subscription tiers.

Power Map

Key Players

Subject

Google I/O 2026: Gemini 3.5, Omni, Spark and the agentic Google stack

Sundar Pichai

Alphabet/Google CEO. Framed Spark as a personal agent acting under user direction and Flash as a frontier-capable model at a third of competitor pricing, staking Google's distribution advantage on agents.

Google DeepMind

Built Gemini 3.5 Flash, Gemini Omni, and the Antigravity agent harness. Owns the strategic claim that frontier intelligence now ships in a low-latency Flash-tier model rather than only in Pro or Ultra.

OpenAI and Anthropic

Primary competitive targets. Google's $100 Ultra tier and Flash's speed-per-token advantage directly pressure ChatGPT Pro at $200 and Claude's enterprise plans, particularly on agentic and coding workloads.

Android developers and enterprise customers

Granted new build paths: native Android apps generated in AI Studio from a single prompt, Managed Agents on the Gemini API for isolated Linux execution, and Antigravity Enterprise tied directly to Google Cloud projects.

Warby Parker and Gentle Monster

Eyewear partners for Samsung/Google's Android XR intelligent glasses launching this fall, providing the consumer design wrapper around Gemini-powered live translation, navigation and notification summaries.

Fact Check

14 cited

Source Articles

Top 5

THE SIGNAL.

Analysts

"Said Gemini 3.5 Flash 'offers an incredible combination of quality and low latency' and 'outperforms our latest frontier model, 3.1 Pro, on nearly all the benchmarks,' arguing the model is 4x faster than rival frontier models."

Koray Kavukcuoglu

Chief Technology Officer, Google DeepMind

"Described the new architecture as a hierarchy: '3.5 Pro becomes your orchestrator, your planner, and then it actually can leverage Flash to be the various sub-agents' — i.e., the small fast model is now the agentic worker, not the chat model."

Tulsee Doshi

Senior Director, Google

"Said 'What's amazing about Flash is how it delivers frontier-level capabilities at less than half the price, in some cases almost a third of the price of comparable frontier models,' and on Omni, 'With world models, AI is moving from predicting text to simulating reality.'"

Sundar Pichai

CEO, Alphabet

"Placed Gemini 3.5 Flash just behind OpenAI and Anthropic's frontier models on the Intelligence index, but at roughly 280 tokens per second versus 60-70 for GPT-5.5 and Opus 4.7 — a 4x throughput advantage at lower price."

Artificial Analysis

Independent AI benchmarking firm

The Crowd

"Meet Gemini 3.5 Flash — our strongest agentic and coding model yet. It delivers frontier-level performance at 4x the speed of comparable frontier models — often at less than half the cost. Generally available, starting today. 🧵 #GoogleIO"

@@Google9518

"Introducing Gemini 3.5: our newest family of models combining frontier intelligence with real-world action. The first release is 3.5 Flash, our strongest model yet for agents and coding"

@@GoogleDeepMind3753

"Gemini Spark is your personal AI agent in the @GeminiApp that gets things done on your behalf, under your direction. It runs 24/7 (and yes - you can close your laptop). It's powered by Gemini 3.5 and built on the Google Antigravity harness so it can complete long horizon tasks. Spark will integrate seamlessly with tools, starting with ours, and soon with 3P tools with MCP. You'll also be able to work with it through email + chat. Available to trusted testers this week and next week in Beta to AI Ultra users in the US."

@@sundarpichai635

"Everything announced at Google I/O 2026... Makes me want to sell my phone."

@u/DynoMenace634

Broadcast

I/O '26 Recap: Everything You Need to Know

Everything Announced at Google I/O 2026 in 13 Minutes

Gemini 3.5 Flash: Google's Most Powerful Model Ever! Beats Opus 4.7 & GPT 5.5? (Fully Tested)