May 29, 2026

Agentic Brew Daily

Your daily shot of what's brewing in AI

Fresh Batch

Distilled trend

Anthropic's $965B valuation and same-day Opus 4.8 launch arrive as enterprise burn complaints mount, signalling that token economics, not capability, is now the binding constraint.
Dynamic Workflows ships 1,000-subagent orchestration the same week ITBench-AA shows frontier models scoring below 50% on enterprise SRE tasks, with more turns reducing accuracy.
SQLite's no-agentic-code policy lands alongside TriMem, sleep-style recurrence, and Airtable's HNSW work, suggesting long-horizon memory is the next gating problem.

Bold Shots

Today's biggest AI stories, no chaser

Anthropic launches Claude Opus 4.8 with Dynamic Workflows

Claude Opus 4.8 shipped May 28, just 41 days after 4.7, at the same $5/$25 list price and with day-one availability on claude.ai, Claude Code, Bedrock, Copilot, and Cursor. The headline feature is Dynamic Workflows inside Claude Code, letting a JavaScript script spawn up to 16 concurrent subagents and 1,000 agents per run for codebase-scale tasks. Fast mode is 2.5x faster and 3x cheaper at $10/$50, and Anthropic disclosed a $65B Series H at a $965B post-money valuation the same day. Artificial Analysis measured 15% fewer passes and 35% fewer output tokens per task vs 4.7.

Why it matters: Dynamic Workflows turns Claude Code from a per-file assistant into a scripted engineering process that can run a codebase-scale migration in one shot. The flat list price plus 3x cheaper fast mode reads as a coordinated push to lock in developer surface area ahead of the Mythos-class models.
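
The two limits quoted in this story (16 subagents at once, 1,000 agents per run) describe a bounded fan-out pattern. Here is a minimal Python sketch of that pattern. It assumes nothing about the actual Dynamic Workflows API, which the story says is driven by a JavaScript script; `run_subagent`, `run_workflow`, and both constants are hypothetical names for illustration only.

```python
import asyncio

MAX_CONCURRENT = 16    # concurrent-subagent limit quoted in the story
MAX_PER_RUN = 1_000    # per-run agent limit quoted in the story


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control to simulate I/O
    return f"done: {task}"


async def run_workflow(tasks: list[str]) -> tuple[list[str], int]:
    """Fan tasks out under both limits; return results and peak concurrency."""
    if len(tasks) > MAX_PER_RUN:
        raise ValueError(f"{len(tasks)} tasks exceeds the per-run cap of {MAX_PER_RUN}")

    gate = asyncio.Semaphore(MAX_CONCURRENT)
    active = 0
    peak = 0

    async def guarded(task: str) -> str:
        nonlocal active, peak
        async with gate:            # at most MAX_CONCURRENT bodies run at once
            active += 1
            peak = max(peak, active)
            try:
                return await run_subagent(task)
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(t) for t in tasks))
    return list(results), peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_workflow([f"file_{i}.py" for i in range(100)]))
    print(len(results), "tasks finished; peak concurrency", peak)
```

The semaphore enforces the concurrency ceiling while the up-front length check enforces the per-run ceiling; a production orchestrator would also need retries, timeouts, and token budgeting, none of which are sketched here.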

Robinhood opens beta access to Agentic Trading and an Agentic Credit Card

On May 27 Robinhood opened beta access to Agentic Trading and an Agentic Credit Card, letting third-party AI agents execute trades and credit card purchases on a customer's behalf via Model Context Protocol endpoints. Agentic Trading runs in a separate self-directed account starting with equities, then options, crypto, event contracts, and futures. The Agentic Credit Card is a virtual card linked to the Robinhood Gold Card with 3% cash back and either per-transaction approval or a hard monthly cap. The endpoints work with Claude, Cursor, and OpenAI Codex out of the box.

Why it matters: A US brokerage publishing open MCP endpoints that live inside other vendors' agent runtimes flips the consumer-fintech default. The real prize, as analyst Richard Crone notes, is the structured pre-transaction intent data — every routed prompt is an investor reasoning step before money moves, something banks have never had.

Snowflake commits $6B to AWS for agentic AI infrastructure

Snowflake announced a five-year $6B strategic collaboration with AWS on May 27, underpinning Cortex AI with Graviton Arm CPUs and GPU-accelerated EC2 instances so enterprises can run agentic AI workloads on governed data inside Snowflake's perimeter without moving it. Q1 FY2027 came in at $1.39B revenue (+33% YoY) with full-year guidance raised to ~$5.84B, and shares jumped ~36% after-hours. Snowflake customers doubled AWS Marketplace spend to $2B in 2025, and Graviton5 ships 192 Arm Neoverse V3 cores per socket.

Why it matters: The chip story here is a CPU story, not a GPU one — agentic workloads shift the cost center from inference seconds to orchestration cycles (SQL, Python functions, vector lookups the model calls), and those run on CPU. Snowflake's $6B Graviton commitment is the first major enterprise-data-platform receipt for AWS's claim that its silicon beats Nvidia on price-performance.

Cognition raises $1B+ at $26B as Devin writes 89% of its own code

Cognition raised more than $1B at a $26B post-money valuation in a Series D announced May 27, co-led by Lux Capital, General Catalyst, and 8VC. Annualized revenue moved from $37M in May 2025 to about $492M in May 2026 — roughly 13x in twelve months — with enterprise usage up 50% MoM for six straight months. The most striking internal stat: Devin now drafts 89% of Cognition's own engineering commits, up from ~13% in December 2025. Customer list spans Goldman Sachs, Mercedes-Benz, NASA, Santander, Citi, Dell, the US Army, and the US Navy.

Why it matters: The signal isn't ARR, it's the recursive loop: Cognition using Devin to ship Devin compresses the engineering cost curve below anything copilot-style tools can match. Caveat: humans still review every Devin PR, so the 89% figure counts commits drafted by an agent and approved by a human, not code shipped without review.
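
For readers who want to sanity-check the growth figures, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted in this story; the variable names are ours, and the "implied monthly rate" assumes smooth compounding, which real revenue rarely follows.

```python
arr_start = 37e6    # annualized revenue, May 2025 (from the story)
arr_end = 492e6     # annualized revenue, May 2026 (from the story)
months = 12

multiple = arr_end / arr_start                  # total growth multiple
monthly_rate = multiple ** (1 / months) - 1     # implied compound monthly growth

usage_multiple = 1.5 ** 6                       # 50% MoM for six straight months

print(f"ARR multiple: {multiple:.1f}x")                                     # 13.3x
print(f"Implied average monthly ARR growth: {monthly_rate:.1%}")            # 24.1%
print(f"Enterprise usage multiple over six months: {usage_multiple:.1f}x")  # 11.4x
```

So "roughly 13x" checks out, and six months of 50% month-over-month usage growth compounds to about 11.4x on its own.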

Apple to overhaul Siri and preview iOS 27 AI at WWDC 2026

Apple will unveil an overhauled Siri at WWDC on June 8: a chat-style interface, a standalone app supporting voice and text, and deeper Dynamic Island integration. Siri is reportedly powered by a custom 1.2-trillion-parameter Gemini variant licensed from Google for ~$1B/year, running inside Apple's Private Cloud Compute and being distilled into smaller on-device variants. iOS 27 adds a system-wide "Search or Ask" panel, a Siri mode in Camera, and generative Photos tools. Gene Munster pegs the multi-year deal at as much as $5B total.

Why it matters: Apple has stopped pretending its in-house foundation models can carry Siri — paying Google ~$1B/yr for a 1.2T-parameter teacher quantifies the capability gap. The architectural consolation is that Gemini runs on Apple's Private Cloud Compute so no user data leaves Apple silicon, but Apple's AI roadmap is now tied to Google's release cadence.

Slow Drip

Blog reads worth savoring

Analysis · Cloudflare Engineering

How we built Cloudflare's data platform and an AI agent on top of it

Architecture-level walkthrough of Town Lake plus Skipper showing how default-deny governance, Code Mode MCP, and memory layers turn NL-to-SQL into an auditable internal tool.

Research · Latent Space

ESMFold2: The Bitter Lesson is Coming for Proteins — Alex Rives, BioHub

Named-lab interview on how a 2.8B-sequence transformer beats AlphaFold3 on antibody interactions and ships a 6.8B open protein atlas.

Research · Hugging Face Blog

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks

Hard data on why Claude Opus 4.7 tops out at 47% on Kubernetes SRE root-cause tasks, with the counterintuitive finding that more investigation turns hurt accuracy.

The Grind

Research papers, decoded

Agent Systems · 41 upvotes · alphaxiv

The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence

A 229.9B-parameter MoE that activates only 9.8B per token, built end-to-end for agentic deployment. Contributes verifiable agent-trajectory data pipelines, an RL system ("Forge") with windowed-FIFO scheduling and prefix-tree merging, and a self-evolving M2.7 checkpoint hitting 56.2 on SWE-bench Pro and 94.2 on AIME 2026.

Agent Systems · 96 upvotes · alphaxiv

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Treats an agent's natural-language skill document as the trainable external state of a frozen LLM and optimizes it with disciplined add/delete/replace edits gated by held-out validation. Lifts GPT-5.5 by +23.5 points in direct chat, +24.8 inside Codex, and +19.1 inside Claude Code; optimized skills transfer across models and harnesses. If you ship Claude Code or Codex skills, this is a recipe for validation-gated gains.
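
To make "validation-gated edits" concrete, here is a toy Python sketch of the general idea: propose add/delete/replace edits to a skill document and keep each one only if a held-out score improves. This is not the paper's algorithm or code. The `Edit` type, the `optimize` loop, and especially the `toy_validate` scorer are assumptions invented for illustration; a real validator would run the agent on held-out tasks.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Skill = list[str]  # the skill document, one instruction per line


@dataclass(frozen=True)
class Edit:
    op: str                     # "add", "delete", or "replace"
    index: int                  # line position in the current best document
    text: Optional[str] = None  # new line for add/replace


def apply_edit(skill: Skill, edit: Edit) -> Skill:
    """Return a new skill document with one edit applied."""
    new = list(skill)
    if edit.op == "add":
        new.insert(edit.index, edit.text or "")
    elif edit.op == "delete":
        del new[edit.index]
    elif edit.op == "replace":
        new[edit.index] = edit.text or ""
    else:
        raise ValueError(f"unknown op: {edit.op}")
    return new


def optimize(skill: Skill, edits: list[Edit], validate: Callable[[Skill], float]) -> Skill:
    """Keep an edit only if it strictly improves the held-out validation score."""
    best, best_score = skill, validate(skill)
    for edit in edits:
        candidate = apply_edit(best, edit)
        score = validate(candidate)
        if score > best_score:  # the validation gate
            best, best_score = candidate, score
    return best


# Toy held-out validator (assumption): reward lines that cover needed topics,
# and penalize length slightly so useless lines are not kept.
NEEDED = {"run tests", "read the README"}


def toy_validate(skill: Skill) -> float:
    covered = sum(1 for topic in NEEDED if any(topic in line for line in skill))
    return covered - 0.1 * len(skill)


start = ["Always run tests before committing.", "Use lots of emojis."]
proposed = [
    Edit("add", 0, "First, read the README."),   # covers a new topic: accepted
    Edit("delete", 2),                           # drops a useless line: accepted
    Edit("replace", 1, "Never test anything."),  # loses coverage: rejected
]
final = optimize(start, proposed, toy_validate)
print(final)
# ['First, read the README.', 'Always run tests before committing.']
```

The point of the gate is that the model itself stays frozen; only the document changes, and only when held-out evidence says the change helps.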

The Mill

Builder tools ground for action

The Counter

Voices from the AI bar today

11K views

Comprehend First, Code Later: The AI Skill I Rely On Daily

A Sentry engineer analyzed 116 of her own Claude sessions: 67% were comprehension and only 2% generation. Introduces a "Catch Me Up" skill with six exploration modes for understanding legacy code before letting the agent plan.

AI Engineer

10K views

Harness Engineering: What Separates Top Agentic Engineers Right Now

Defines "harness engineering" — the ~98% of a tool like Claude Code that isn't the model — and shows how elite agentic engineers evolve their harness layer.

Cole Medin

34K views

The acceleration is here!

Walks through Google's Co-scientist and the Robin agent system autonomously surfacing novel treatments for leukemia, liver fibrosis, macular degeneration, and antibiotic-resistant infections.

AI Search

12K likes / 1.7K RTs / 625 replies / 1.3M views

We've raised $65 billion in Series H funding at a $965 billion post-money valuation, led by @AltimeterCap, Dragoneer, @Greenoaks, and @sequoia.

Anthropic's official Series H announcement, with run-rate revenue crossing $47B.

@AnthropicAI

13K likes / 1.4K RTs / 572 replies / 2.6M views

NEW: AI consultant reveals a client accidentally spent $500,000,000.00 in a single month after failing to set employee limits on Claude usage.

The viral $500M-Claude-burn story making the rounds — fits the broader thread that token economics is the new binding constraint.

@Polymarket

2.6K upvotes · 135 comments

Anthropic officially launched 13+ FREE AI courses with certificates (Including Agentic AI and Claude Code!)

Direct, actionable list of free Anthropic training tracks — MCP, Claude Code 101, Agentic AI, Bedrock and Vertex deployment — all with certificates.

r/ClaudeAI

1.3K upvotes · 254 comments

DeepSeek just popped the American AI bubble

Side-by-side per-token pricing showing DeepSeek V4 Pro at $0.435 input / $0.87 output — roughly 11.5x cheaper than GPT-5.5 input and 34.5x cheaper on output.

r/OpenAI
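
The ratios in that post can be inverted to see what GPT-5.5 prices they imply. A short Python check using only the figures quoted above (the blurb does not state the pricing unit, so none is assumed here; variable names are ours):

```python
deepseek_input = 0.435    # input price quoted in the post
deepseek_output = 0.87    # output price quoted in the post
input_ratio = 11.5        # "11.5x cheaper" on input
output_ratio = 34.5       # "34.5x cheaper" on output

# Multiply back through the ratios to recover the comparison prices.
implied_gpt_input = deepseek_input * input_ratio
implied_gpt_output = deepseek_output * output_ratio

print(f"Implied GPT-5.5 input price:  about ${implied_gpt_input:.1f}")
print(f"Implied GPT-5.5 output price: about ${implied_gpt_output:.1f}")
```

That works out to roughly $5 on input and $30 on output, in the same unit as the DeepSeek figures.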

Roast Calendar

Your AI week, day by day

Fri 29

12:00 PM PT • San Francisco

Production AI with Metaflow Meetup at DoorDash

5:00 PM PT • Mountain View

Gemini Meetup

5:30 PM PT • Pleasanton

Code, Context, Agentic Infrastructure: Securing AI Services

Sat 30

May 30 – May 31 • San Francisco

Web Data UNLOCKED — Enterprise AI Two-Day Hackathon

8:00 AM PT • San Francisco

Autoresearch Systems Hackathon with Modal, OpenAI, Raindrop & Antler

2:00 PM PT • San Francisco

Robotics & World Models Reading Club 10

Sun 31

10:00 AM PT • Stanford

GDG Stanford Hackathon — Win up to $5M in seed funding

1:00 PM PT • San Francisco

AI Paper Reading Club

2:00 PM PT • San Jose

AI Demo Day

Mon 1

12:00 PM PT • Stanford

Stanford OpenLab Seminar: How AI Is Automating GPU Design (Voltai)

5:30 PM PT • San Francisco

MCP Connect San Francisco with Sentry, Bitmovin and Alpic

5:30 PM PT • San Francisco

Agents That Take Action (Arcade.dev)

Tue 2

All day • San Francisco

NVIDIA VIP Developer Experience

5:30 PM PT • San Francisco

On Device. Built for Real Work. An Evening at Google

6:00 PM PT • San Francisco

Strange Evals — ProgramBench

Wed 3

4:00 PM PT • Stanford

Stanford OpenLab Seminar with Eric Peter, Enterprise Agent, Google Gemini

5:15 PM PT • San Francisco

OpenClaw: After Hours @ GitHub

6:30 PM PT • San Francisco

Agentic Analytics Demo Night (dltHub + Inngest)

Thu 4

2:00 PM PT • San Francisco

Campfire x Replit x J.P.Morgan SF Finance Hack Lab

4:45 PM PT • San Francisco

Deploying Agents for Enterprises (FailSafe, NEAR AI, AWS, Coinbase, Straiker, Levro)

6:00 PM PT • San Francisco

Forward Deployed: Voice AI — The Next Frontier

Last Sip

Parting thoughts

A model release, a brokerage handing its API to other people's agents, a $6B CPU bet, a $26B coding-agent valuation, and Apple quietly outsourcing Siri's brain to Google — all in one 48-hour window. The through-line, if you squint, is that the interesting battle has moved one layer up the stack: away from raw model quality and into orchestration runtimes, MCP endpoints, harness design, and the long-horizon memory papers landing on alphaxiv. Worth keeping in mind alongside the ITBench-AA result that more agent turns can make accuracy worse. Enjoy the long weekend if you've got one — and if you're in SF, the calendar this week is genuinely stacked.
