Under the Hood: A Two-Tier Agent and the Toggle That Defines the Enterprise Pitch
Google's most consequential design decision with this release isn't a new model — it's the deliberate split of its research agent into two SKUs running on the same Gemini 3.1 Pro brain. Deep Research is tuned for low-latency, real-time client experiences where a user is waiting on the other end of a chat surface. Deep Research Max spends extended test-time compute on asynchronous, long-horizon synthesis — the kind of multi-hour investigation a human analyst would otherwise schedule. Same backbone, two operating points — and that bifurcation is itself a product statement: research is no longer a single feature; it's a primitive with a latency-cost-quality dial.
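The routing logic that bifurcation implies can be sketched in a few lines. To be clear, this is a hypothetical illustration of the dial, not Google's API: the tier names, the `ResearchRequest` shape, and the deadline heuristic are all assumptions for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum


class ResearchTier(Enum):
    """Two operating points on the same backbone (names are illustrative)."""
    DEEP_RESEARCH = "deep-research"          # low-latency, interactive
    DEEP_RESEARCH_MAX = "deep-research-max"  # extended test-time compute, async


@dataclass
class ResearchRequest:
    prompt: str
    interactive: bool        # is a user waiting on a chat surface?
    deadline_minutes: float  # how long the caller can wait for an answer


def pick_tier(req: ResearchRequest,
              async_threshold_minutes: float = 10.0) -> ResearchTier:
    """Route along the latency-cost-quality dial: interactive or
    tight-deadline requests take the fast tier; long-horizon jobs
    take the Max tier."""
    if req.interactive or req.deadline_minutes < async_threshold_minutes:
        return ResearchTier.DEEP_RESEARCH
    return ResearchTier.DEEP_RESEARCH_MAX
```

The point of the sketch is that the caller, not the model, picks the operating point — the "dial" is an argument, and the same prompt can be cheap-and-fast for one workload and slow-and-thorough for another.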
The second design choice quietly does more enterprise work than the headline numbers. Both tiers ship with Model Context Protocol (MCP) support, letting developers connect proprietary or third-party data systems, and — critically — both can disable web access entirely so the agent runs only against private corpora. That's the sentence regulated industries have been waiting for. Combined with multimodal inputs (PDFs, CSVs, images, audio, video), file-store connectors, real-time streaming of intermediate reasoning steps, and a collaborative planning step before the agent executes, the platform now looks less like a chatbot and more like a programmable analyst. Google's framing — 'search the web, arbitrary remote MCPs, file uploads and connected file stores — or any subset of them' — is the unlock: the same agent runs in a public-research mode for one customer and an air-gapped private-data mode for the next, without changing model strings.
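The "any subset of them" framing amounts to a tool-surface configuration. The sketch below is a hypothetical shape for such a config — the class, field names, and example endpoints are invented for illustration and do not correspond to Google's actual SDK — but it captures the claim: the same agent, under the same model string, is air-gapped or web-connected depending purely on which tools are switched on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolConfig:
    """Hypothetical tool surface: any subset of web search, remote MCP
    servers, and connected file stores can be enabled per deployment."""
    web_search: bool = False
    mcp_servers: tuple[str, ...] = ()  # URLs of remote MCP endpoints
    file_stores: tuple[str, ...] = ()  # identifiers of connected corpora

    def is_air_gapped(self) -> bool:
        """True when the agent sees only private corpora: no public web."""
        return not self.web_search


# Same agent, two deployments: public research vs. air-gapped private data.
PUBLIC_MODE = ToolConfig(web_search=True,
                         mcp_servers=("https://mcp.example.com",))
PRIVATE_MODE = ToolConfig(web_search=False,
                          file_stores=("contracts-corpus",))
```

Making the config immutable (`frozen=True`) mirrors the compliance story: an air-gapped deployment should be provably unable to grow a web connection at runtime, only by shipping a new config.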
