TECH

Gemini Robotics-ER 1.6 powers Boston Dynamics Spot

Strategic Overview

01. Google DeepMind released Gemini Robotics-ER 1.6 on April 14, 2026, an upgraded embodied reasoning model available to developers through the Gemini API and Google AI Studio.
02. Boston Dynamics integrated the model into Spot's Orbit software stack, specifically the AI Visual Inspection (AIVI) and AIVI-Learning systems, with rollout to enrolled customers beginning April 8, 2026.
03. The integration unlocks autonomous instrument reading, agentic vision, multi-view scene understanding, and hazard identification on a quadruped already deployed at several thousand industrial sites.
04. DeepMind frames ER 1.6 as its safest robotics model to date, citing superior compliance with safety policies on adversarial spatial-reasoning evaluations.

From 23% to 93%: The Agentic Vision Loop That Cracked Instrument Reading

The headline figure in Gemini Robotics-ER 1.6 is the jump in instrument-reading accuracy from roughly 23% under ER 1.5 to 93% with agentic vision, a roughly fourfold improvement on the single capability industrial customers care about most. Crucially, the base model without agentic vision scores 86%; the remaining seven points come from letting the model iteratively re-look at the gauge, zoom, crop and re-reason before committing to a reading. That inner loop is the real delta, and it was shaped, by DeepMind's own account, through direct collaboration with Boston Dynamics on actual industrial faceplates rather than synthetic benchmarks.
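To make that loop concrete, here is a minimal Python sketch of iterative re-looking. It is an illustration under stated assumptions, not DeepMind's implementation: `look` is a hypothetical stand-in for whatever model call returns a reading, a self-reported confidence and a suggested zoom region, and the names `Reading`, `compose` and `agentic_read`, along with the step budget and acceptance threshold, are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# (left, top, right, bottom) in normalized image coordinates, 0..1
Box = Tuple[float, float, float, float]


@dataclass
class Reading:
    value: Optional[float]   # proposed gauge value, None if unreadable
    confidence: float        # self-reported confidence, 0..1 (assumed to exist)
    focus: Optional[Box]     # region to zoom into next, relative to the current view


def compose(outer: Box, inner: Box) -> Box:
    """Map a box given relative to `outer` back into full-image coordinates."""
    left, top, right, bottom = outer
    width, height = right - left, bottom - top
    il, it, ir, ib = inner
    return (left + il * width, top + it * height,
            left + ir * width, top + ib * height)


def agentic_read(look: Callable[[Box], Reading],
                 max_steps: int = 4,
                 accept: float = 0.9) -> Reading:
    """Look, zoom into the suggested region, and look again until confident.

    `look` is a hypothetical callable wrapping the model; it receives the
    crop to inspect and returns a Reading for that crop.
    """
    view: Box = (0.0, 0.0, 1.0, 1.0)   # start with the whole frame
    current = look(view)
    best = current
    for _ in range(max_steps - 1):
        # Stop once a reading is good enough or there is nowhere left to zoom.
        if best.confidence >= accept or current.focus is None:
            break
        view = compose(view, current.focus)
        current = look(view)
        if current.confidence > best.confidence:
            best = current
    return best
```

The fixed step budget is the design choice to notice: each extra look costs another inference call, so the loop trades latency for the last few points of accuracy and gives up once zooming stops helping.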

The broader scorecard tells a consistent story about where the model is strong and where it is still thin. Pointing and counting land at 80% success, single-view success detection reaches 90%, multi-view success detection sits at 84%, and hazard identification edges out the general-purpose Gemini 3.0 Flash by six points on text and ten points on video. Those are the sub-skills a roving inspection robot needs stitched together: first point at the right valve, then count the right number of them, then read the gauge, then decide whether what it saw is a hazard. ER 1.6 is the first model in the family that hits acceptable numbers on every leg of that chain simultaneously, which is why the Spot integration began rolling out within a week of the model's public release rather than months later.

The Tactile Ceiling Vision Models Can't See Through

Under the confident benchmark numbers sits a quiet admission from Carolina Parada that reframes the whole category: the internet simply does not contain enough tactile data to pretrain on. "There is lots of information on the web about how to pick up a pen," she told Spectrum, "but there is not a lot of data with touch sensing on the internet." Vision-language-action models are so strong today precisely because YouTube, image captions and instructional text give them a bottomless corpus about how the world looks. Nothing comparable exists for how the world feels, weighs, slips, or resists.

That matters for the Spot story because industrial inspection is the subset of robotics where vision-only reasoning is genuinely sufficient. Read a gauge, check a leak, count a drum: all of it lives in pixels. The Atlas humanoid roadmap announced at CES 2026 does not enjoy the same luxury. Loading laundry, opening a latched door or plugging in a cable are tactile-first tasks, and no amount of ER 1.6-style agentic re-looking will close that gap. DeepMind's decision to lead with Spot and inspection is therefore not just a product choice but a data-physics choice: they are shipping where the training distribution already supports high reliability, and staying quieter about the manipulation problems where it does not.

Strategist Plus Executor: How the Community Reverse-Engineered the Stack

Perhaps the most interesting framing of this launch came not from DeepMind's blog but from robotics-focused Reddit threads, which quickly converged on a two-brain reading of the architecture. In their decomposition, ER 1.6 functions as the strategist, an embodied reasoner that plans, points, counts and decides what success looks like, while Gemini Robotics 1.5 VLA continues to serve as the motor-level executor that turns those plans into joint trajectories on the robot. Boston Dynamics' own engineering post reinforces this read by emphasizing that Gemini operates only through predefined Spot APIs and "can't invent new capabilities or control Spot beyond what is available." The planner is big and probabilistic; the action vocabulary underneath it is small and vetted.
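The split can be sketched in a few lines of Python. This is a toy illustration of the pattern the threads describe, not Boston Dynamics' or DeepMind's code: the `Executor` class and the action names `go_to`, `capture_image` and `open_valve` are hypothetical and do not correspond to real Spot SDK calls. The point it shows is that a planner can propose anything, but only steps in the registered vocabulary ever run.

```python
from typing import Callable, Dict, List, Tuple

# A plan step is (action name, keyword arguments), as a planner might emit it.
Step = Tuple[str, dict]


class Executor:
    """Holds the small, vetted set of actions the robot can actually perform."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        """Add one vetted action to the vocabulary."""
        self._actions[name] = fn

    def run(self, plan: List[Step]) -> List[str]:
        # Validate the whole plan first, so nothing executes if any
        # step asks for an action outside the registered vocabulary.
        unknown = [name for name, _ in plan if name not in self._actions]
        if unknown:
            raise ValueError(f"plan requests unsupported actions: {unknown}")
        return [self._actions[name](**kwargs) for name, kwargs in plan]


# Hypothetical vocabulary for illustration only.
executor = Executor()
executor.register("go_to", lambda waypoint: f"at {waypoint}")
executor.register("capture_image", lambda target: f"image of {target}")

plan = [("go_to", {"waypoint": "pump-3"}),
        ("capture_image", {"target": "pressure gauge"})]
print(executor.run(plan))
```

A plan containing an unregistered step such as `("open_valve", {...})` is rejected as a whole before any motion happens, which is the predictability property the engineering post emphasizes.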

That split is a significant departure from the monolithic end-to-end VLA dream that dominated robotics discourse a year ago. It suggests the commercially shippable path is not one giant model that maps pixels directly to motor torques, but a high-capability reasoner tethered to a narrow, well-characterized execution API. The upside is predictability, which is exactly what industrial buyers need before they let a robot near a live instrument panel. The downside, which some physical-AI commentators have flagged, is that each new capability still requires plumbing on the Boston Dynamics side; the strategist can only ask for moves the executor already knows how to perform.

Why 93% Is Simultaneously a Triumph and a Showstopper

On paper, a jump from 23% to 93% instrument-reading accuracy is the kind of step-change that justifies a product launch. In robotics-focused community threads it read differently. One frequently upvoted line, "93 percent is not going to cut it in industrial inspection," captured a practitioner instinct that seven percent residual error, applied across thousands of gauges per shift at a chemical or energy site, is not an engineering rounding error but an unacceptable operational liability. Darker jokes in the same threads imagined Gemini cheerfully confirming that a sulfuric acid pump was fine, minutes before it wasn't.

Marco da Silva's own framing, that Spot reaches acceptable field performance around 80% accuracy, sits uncomfortably next to that skepticism, because the number that matters in practice is not raw accuracy but the ratio of false positives to true catches. Crying wolf on an inspection route forces humans to re-check everything and erodes the autonomy gain the robot was sold on. A separate contrarian thread, led by a commenter asking simply "why not just use digital gauges," argues the whole category is a workaround for legacy PLC-and-analog infrastructure; rebuttals pointed out that retrofit costs, especially in government and heavy-industry settings, dwarf the cost of buying a Spot. The real read is that ER 1.6 has not ended the inspection-automation debate. It has moved it from "can a robot read a dial at all" to "at what false-alarm rate does a customer renew," which is a much more commercial, and much more interesting, argument.
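The arithmetic behind that argument is easy to sketch. The Python below is a back-of-the-envelope illustration, not a published model of Spot's performance: the function names are invented, and the example inputs (1,000 readings per shift, 1% of gauges genuinely out of range, and 93% treated as both the catch rate and the correct-clear rate) are assumptions chosen for illustration, since the source reports only a single accuracy figure.

```python
def expected_misreads(readings: int, accuracy: float) -> float:
    """Expected number of wrong readings out of `readings` attempts."""
    return readings * (1.0 - accuracy)


def alarm_precision(fault_rate: float,
                    sensitivity: float,
                    false_alarm_rate: float) -> float:
    """Share of raised alarms that correspond to a real fault.

    fault_rate: fraction of gauges genuinely out of range (assumed).
    sensitivity: fraction of real faults the robot flags (assumed).
    false_alarm_rate: fraction of healthy gauges wrongly flagged (assumed).
    """
    true_alarms = fault_rate * sensitivity
    false_alarms = (1.0 - fault_rate) * false_alarm_rate
    total = true_alarms + false_alarms
    if total == 0:
        raise ValueError("no alarms are raised under these inputs")
    return true_alarms / total


# Illustrative inputs only; none of these rates come from the source.
misreads = expected_misreads(1000, 0.93)
precision = alarm_precision(fault_rate=0.01, sensitivity=0.93,
                            false_alarm_rate=0.07)
print(f"expected misreads per 1000 readings: {misreads:.0f}")
print(f"share of alarms that are real faults: {precision:.1%}")
```

Under those assumed inputs, about 70 readings per thousand are wrong and only around 12% of alarms point at a real fault, because healthy gauges vastly outnumber faulty ones. That base-rate effect is why the false-alarm rate, rather than headline accuracy, drives the renewal conversation.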

Historical Context

2026-01

At CES 2026 the two companies announced a broader AI partnership bringing Gemini Robotics foundation models to the Atlas humanoid, a reunion nearly a decade after Google sold Boston Dynamics to SoftBank.

2026

Earlier in the partnership, Boston Dynamics demonstrated Spot using the prior Gemini Robotics-ER 1.5 model to execute household-style tasks via conversational commands instead of scripted state machines.

2026-04-08

Orbit's AIVI-Learning system quietly began rolling out Gemini Robotics integration to enrolled customers, a week ahead of the public Google DeepMind announcement.

2026-04-14

DeepMind released Gemini Robotics-ER 1.6, claiming measurable gains over ER 1.5 across pointing, counting, success detection, and instrument reading, and positioned it as the safest robotics model to date.

Power Map

Key Players

Google DeepMind

Developer of Gemini Robotics-ER 1.6; controls the model weights, API access, and the safety benchmarks (ASIMOV) that gate how aggressively partners can deploy physical AI.

Boston Dynamics

Manufacturer of Spot and the Orbit/AIVI platform; provides the fleet footprint, industrial customer relationships, and the instrument-reading problem that shaped ER 1.6's most-hyped capability.

Marco da Silva

VP and General Manager of Spot at Boston Dynamics; owns the commercial trade-off between autonomy gains and the false-alarm rate that makes or breaks inspection contracts.

Carolina Parada

Senior Director of Robotics at Google DeepMind; steers how embodied reasoning is scoped given that the web has almost no tactile training data, a constraint she has publicly acknowledged.

Google Cloud

Infrastructure partner for the Spot-Gemini integration; hosts the inference and data plumbing that makes AIVI-Learning a subscription-shaped product rather than a demo.

THE SIGNAL.

Analysts

"Capabilities like instrument reading and more reliable task reasoning will enable Spot to see, understand, and react to real-world challenges completely autonomously." Da Silva notes that Spot only clears the commercial bar at roughly 80 percent accuracy in the field, where the real enemy is false alarms rather than missed detections.

Marco da Silva

VP and GM of Spot, Boston Dynamics

"There is lots of information on the web about how to pick up a pen. If we had enough data with touch information, we could easily learn it, but there is not a lot of data with touch sensing on the internet." Parada concedes that vision-only pretraining has a structural ceiling that even a better embodied reasoner cannot fully offset.

Carolina Parada

Head of Robotics, Google DeepMind

"If you ask the robot to bring you a cup of water, it will reason not to place it on the edge of a table where it could fall." She frames the ASIMOV benchmark as the way DeepMind is forcing semantic safety into the planning layer, not just the motor layer.

Carolina Parada

Head of Robotics, Google DeepMind

"It can't invent new capabilities or control Spot beyond what is available through the API. This keeps Spot's behavior predictable." The team is explicit that Gemini acts strictly as a planner on top of a fixed, vetted action vocabulary.

Boston Dynamics Engineering

Official engineering blog, Boston Dynamics

The Crowd

"We teamed up with @BostonDynamics to power their robot Spot with Gemini Robotics embodied reasoning models. This means it can better understand its surroundings, identify objects and follow simple commands - like tidying up a room."

@GoogleDeepMind

"The introduction of AI Visual Inspections expanded what Spot and Orbit could tell you about your facility - now, AIVI-Learning powered by @GoogleDeepMind Gemini Robotics unlocks a whole new level of visual intelligence for your robot. Learn more: bosdyn.co/4stB97V"

@BostonDynamics

"Google's new AI model just taught robots to read industrial gauges — jumping from 23% to 93% accuracy. #Gemini Robotics-ER 1.6 lets Boston Dynamics' Spot inspect facilities fully autonomously, interpreting pressure gauges and digital displays without any human intervention."

@FrontierbeatHQ

"Google DeepMind launches Gemini Robotics ER 1.6, a reasoning-first model that enables robots to understand environments through spatial reasoning and multi-view understanding"

u/Nunki08

Broadcast

The To Do List with Spot | Boston Dynamics

Smarter Inspections Powered by Google Gemini Robotics | Boston Dynamics

The New Brain for Robots: Gemini ER 1.6 Explained