May 24, 2026

Agentic Brew Daily

Your daily shot of what's brewing in AI

Fresh Batch

Distilled trend

The agent stack is consolidating around context, sandboxes, and harnesses, with Google shipping Managed Agents while builders publish post-mortems on tool bloat and RAG defaults.
Anthropic is simultaneously the safety story and the security risk: Glasswing claims 10,000 vulnerabilities found while X warns that public skill marketplaces are the new supply-chain attack vector.
AI demand is repricing the physical layer, with DDR5 up nearly 10x and Micron pouring $2B into Virginia DRAM as HBM steals wafer capacity from consumer memory.

Bold Shots

Today's biggest AI stories, no chaser

Google I/O 2026: Gemini becomes a six-layer agentic stack

Sundar Pichai opened I/O 2026 declaring "the agentic Gemini era," shipping Gemini 3.5 Flash (76.2% on Terminal-Bench 2.1) as the new default and reframing every Google surface as a substrate for agents. Gemini Spark is a 24/7 personal agent that runs on Google Cloud VMs (not your laptop), gets its own Gmail address, and works across Gmail/Docs/Sheets/Slides while you're offline. Developers got Managed Agents in the Gemini API where one Interactions call spins up an ephemeral Linux sandbox running Bash, Python, and Node, and Search retired the ten-blue-links layout for generative UI. Within four days Adobe, Canva and CapCut shipped native Gemini integrations and SynthID verification rolled out across Search and Chrome.

Why it matters: This is the year Google stops competing as a chatbot and starts competing as an OS for agents. Spark's cloud-resident, Gmail-addressable design directly attacks the assumption that personal AI runs on your device, and the Search redesign is the single most consequential change to the open web in a decade.

Gemini 3.5 Flash Looks Good For How Fast It Is

TheZvi Substack

We're dropping Gemini Omni: our first step towards a model that can create anything from anything - starting with video.

@GoogleDeepMind·10.5K engagements

Gemini Spark is your new 24/7 personal AI agent. Give it a task and it works autonomously in the background, even if your phone a...

@GeminiApp·3.3K engagements

Nvidia 'largely concedes' China's AI chip market to Huawei

Jensen Huang told CNBC alongside Q1 FY2027 earnings that Nvidia has "largely conceded" China's AI chip market to Huawei. Nvidia reported $0 China data-center revenue and zero Hopper shipments into China for the quarter against $4.6B in the year-ago period, even as total revenue hit $81.6B (up 85% YoY). On May 22 Taiwan's Keelung District Prosecutors Office moved to detain three men for allegedly forging documents to ship Super Micro AI servers with Nvidia chips into Hong Kong, Macau, and the mainland; investigators seized about 50 servers and NT$9M in cash. Huang publicly urged Super Micro to tighten export-control compliance and pitched a new $200B total addressable market (TAM) for Vera CPUs, with CFO Colette Kress guiding to $20B in 2026 CPU revenue.

Why it matters: Three years of US export controls have ended with Nvidia at zero China share and Huawei guiding to $12B in AI chip revenue this year. The Taiwan smuggling crackdown shows the gray market is large enough that even allies are now policing it, and Nvidia's CPU reframing is how the stock survives the China zero.

BREAKING — China Unveils Gaming GPU To Challenge NVIDIA

@PamphletsY·23.9K engagements

Nvidia says it has 'largely conceded' China's AI chip market to Huawei, per CNBC

@unusual_whales·2.5K engagements

The Death of the Nvidia Moat

Manolo Remiddi·8.6K views

Nvidia is in big trouble, as Huawei rolls out 5G and AI across the world

(unknown)·116K views

Nvidia says it has 'largely conceded' China's AI chip market to Huawei

r/technology·1.4K upvotes

Jensen Huang says Nvidia has "largely conceded" China's AI chip market to Huawei, yet zero H200 chips have actually shipped

r/investing·333 upvotes

Trump pulls AI safety executive order hours before signing

President Trump postponed the signing of an AI safety executive order on Thursday, May 21, hours before the planned Oval Office ceremony, after last-minute lobbying from tech leaders. The shelved order would have set up a voluntary framework under which AI developers submit frontier models for federal security review up to 90 days before release. Reporting since then points to David Sacks leading the lobbying; OpenAI and Anthropic backed the order while Meta and xAI led the push to kill it. Trump framed the reversal around not slowing the AI race against China. FLI polling cited by Fortune shows 79% of Republican voters favor pre-release government testing.

Why it matters: A frontier-lab fissure is now visible in policy. Incumbents that already do internal red-teaming wanted the order; challengers without that overhead killed it. The Silicon Valley veto power on display also runs straight against the MAGA base's own polling on AI oversight.

The White House is considering a slate of executive actions to address escalating security risks from advanced AI models, per 7 pp...

@jacob_wendler·50 engagements

NEW: President Trump abruptly delays the signing of a landmark executive order on AI, telling reporters that he had pulled the ord...

@NBCNews·43 engagements

Trump Kills AI Executive Order at the Last Minute: I Did not Like It

(unknown)·116.1K views

Here is why Trump postponed signing an executive order on AI

(unknown)·97.3K views

Donald Trump abruptly postpones AI order after White House infighting

r/singularity·116 upvotes

He just hates regulation Trump delays AI executive order that might hinder progress

r/accelerate·111 upvotes

Anthropic Project Glasswing finds 10,000+ high- and critical-severity vulns in one month

Anthropic launched Project Glasswing, powered by its unreleased Claude Mythos Preview model, which uncovered more than 10,000 high- or critical-severity vulnerabilities in essential software in its first month, with 90.6% of triaged findings (1,587 of 1,752) confirmed as true positives. The bottleneck has shifted from discovery to verification and patching: high-severity bugs average a two-week patch time and more than 99% of Mythos-found vulnerabilities remain unpatched. Launch partners span AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, backed by $100M in Mythos credits and $4M in OSS security donations. Notable discoveries include critical wolfSSL CVE-2026-5194 (5B+ devices), a 27-year-old OpenBSD bug, and a 16-year-old FFmpeg flaw.

Why it matters: Glasswing inverts the cybersecurity economy. Bug discovery is now infinite; remediation is the new scarce resource. Anthropic gating Mythos behind a 50-partner cartel also raises real antitrust and equity questions about who gets to defend critical software.
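The triage numbers in the Glasswing story are easy to sanity-check. The short sketch below recomputes the confirmation rate from the two counts quoted above; the "share triaged" line is our own back-of-envelope figure and assumes a total of exactly 10,000 findings (the story says "more than 10,000"), so it should be read as an upper bound.

```python
# Triage figures as quoted in the story above.
confirmed = 1_587
triaged = 1_752

true_positive_rate = confirmed / triaged
false_positives = triaged - confirmed

# Assumption: total findings taken as exactly 10,000. The story says
# "more than 10,000", so the real triaged share is at most this value.
assumed_total = 10_000
triaged_share_upper_bound = triaged / assumed_total

print(f"true-positive rate: {true_positive_rate:.1%}")
print(f"unconfirmed triaged findings: {false_positives}")
print(f"share of findings triaged (upper bound): {triaged_share_upper_bound:.1%}")
```

On these numbers fewer than one in five findings had been triaged at all, which is consistent with the story's point that verification, not discovery, is now the slow step.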

Anthropic just published the first Project Glasswing update. In one month, their unreleased AI found 10,000 critical security hole...

@VaibhavSisinty·2.1K engagements

Introducing Project Glasswing: an urgent initiative to help secure the world's most critical software. It's powered by our newest...

@AnthropicAI·46.1K engagements

An initiative to secure the world's software | Project Glasswing

(unknown)·383.4K views

Project Glasswing Explained: What Mythos Means for Cybersecurity

(unknown)·4.9K views

Project Glasswing: what Mythos showed us (Cloudflare)

r/ClaudeAI·271 upvotes

Glasswing gives 50 companies a 3-month head start on Mythos-class vulnerabilities. What does everyone else do?

r/cybersecurity·191 upvotes

NTSB locks down its docket after AI reconstructs UPS crash cockpit audio

The NTSB temporarily suspended public access to its entire online docket system on May 21 after internet users used AI to reconstruct cockpit voice recorder audio from a spectrogram image released in the UPS Flight 2976 investigation. Federal law bars NTSB from releasing raw audio, but the spectrogram (a visual frequency/time image) contained enough data for reconstruction. The pipeline combined the Griffin-Lim phase-recovery algorithm (published 1984) with modern AI tools including OpenAI's Codex, using the publicly available transcript as a prior. NTSB restored most of the docket on Friday but kept 42 investigations sealed pending review.

Why it matters: Decades-old data-release policies assumed the line between "visual" and "audio" was a real privacy boundary. AI coding agents have erased it. A graduate-level signal-processing pipeline that took weeks now takes hours, and every agency that ever published a spectrogram or frequency plot of restricted data has the same problem.
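For readers who have not met Griffin-Lim, here is a minimal pure-Python sketch of the classical algorithm on a synthetic sine tone. It covers only the phase-recovery step: it assumes the spectrogram magnitudes are already available as numbers (recovering them from the pixels of a published image is a separate step not shown), and it uses no transcript prior or AI tooling. The window length, hop size, iteration count, and test tone are illustrative choices, not details from the reported pipeline.

```python
import cmath
import math
import random

W, H = 32, 8  # window length and hop size (illustrative choices)
WINDOW = [0.5 - 0.5 * math.cos(2 * math.pi * n / W) for n in range(W)]
TWIDDLE = [[cmath.exp(-2j * math.pi * k * n / W) for n in range(W)] for k in range(W)]

def stft(x):
    """Windowed DFT of each frame; returns a list of W-bin complex spectra."""
    frames = []
    for t in range(0, len(x) - W + 1, H):
        seg = [WINDOW[n] * x[t + n] for n in range(W)]
        frames.append([sum(s * tw for s, tw in zip(seg, row)) for row in TWIDDLE])
    return frames

def istft(frames, length):
    """Least-squares inverse: windowed overlap-add over the summed squared window."""
    num, den = [0.0] * length, [0.0] * length
    for i, spec in enumerate(frames):
        t = i * H
        for n in range(W):
            # Inverse DFT for one sample; conjugated twiddles undo the forward DFT.
            y = sum(s * TWIDDLE[k][n].conjugate() for k, s in enumerate(spec)) / W
            num[t + n] += WINDOW[n] * y.real
            den[t + n] += WINDOW[n] ** 2
    return [a / b if b > 1e-12 else 0.0 for a, b in zip(num, den)]

def griffin_lim(mag, length, iters=30, seed=0):
    """Estimate a signal whose STFT magnitudes match `mag`; phase starts random."""
    rng = random.Random(seed)
    spec = [[m * cmath.exp(2j * math.pi * rng.random()) for m in fm] for fm in mag]
    x = istft(spec, length)
    errors = []
    for _ in range(iters):
        est = stft(x)
        # Squared mismatch between the estimate's magnitudes and the target's.
        errors.append(sum((abs(s) - m) ** 2
                          for fs, fm in zip(est, mag) for s, m in zip(fs, fm)))
        # Keep the estimated phase, but force the known magnitudes back in.
        spec = [[m * (s / abs(s)) if abs(s) > 1e-12 else m
                 for s, m in zip(fs, fm)] for fs, fm in zip(est, mag)]
        x = istft(spec, length)
    return x, errors

signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(256)]  # synthetic tone
magnitude = [[abs(s) for s in frame] for frame in stft(signal)]
recovered, errors = griffin_lim(magnitude, len(signal))
print(f"spectral error: first iteration {errors[0]:.4f}, last {errors[-1]:.4f}")
```

The magnitude mismatch never increases from one iteration to the next, which is the property the 1984 paper proves. The result is an approximation whose spectrogram matches the target, not a guaranteed copy of the original waveform.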

The NTSB is aware that advances in image recognition and computational methods have enabled individuals to reconstruct approximati...

@NTSB_Newsroom·1.8K engagements

Cockpit audio reconstructions of the Nov. 2025 UPS MD-11 crash have surfaced on sites like Reddit after a spectrogram file of the...

@willguisbond·200 engagements

Why The NTSB Shut Down Their Plane Crash Report Archive

(unknown)·120.6K views

UPS Flight 2976 Crash: New Footage and Transcripts Released

(unknown)·364.8K views

NTSB removes UPS Flight 2976 Spectrogram

r/aviation·921 upvotes

AI Was Used to Recreate the Voices of Dead Pilots. The NTSB Responded by Locking Down Its Database.

r/ArtificialInteligence·7 upvotes

Slow Drip

Blog reads worth savoring

Analysis · Machine Learning at Scale Substack

A 0.44 Recall Collapse That Looked Like 0.81 Global Success

Per-segment eval gating beats one aggregate number because in-batch negatives can make embeddings anisotropic and halve recall for a single client while global metrics look fine.
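A minimal sketch of what per-segment gating could look like, assuming each eval record is a (segment, hit) pair. The client names, query counts, and the 0.70 threshold are hypothetical, chosen only so the totals land on the 0.81 global and 0.44 single-client recall from the headline; none of it is taken from the post itself.

```python
from collections import defaultdict

def recall_by_segment(results):
    """results: list of (segment, hit) pairs; hit is True when retrieval succeeded."""
    hits, totals = defaultdict(int), defaultdict(int)
    for segment, hit in results:
        totals[segment] += 1
        hits[segment] += int(hit)
    return {seg: hits[seg] / totals[seg] for seg in totals}

def gate(results, min_segment_recall=0.70):
    """Return overall recall, per-segment recall, and segments below the threshold."""
    per_segment = recall_by_segment(results)
    overall = sum(hit for _, hit in results) / len(results)
    failing = sorted(seg for seg, r in per_segment.items() if r < min_segment_recall)
    return overall, per_segment, failing

def make(segment, n_queries, n_hits):
    # Hypothetical eval records: the first n_hits queries succeed, the rest miss.
    return [(segment, i < n_hits) for i in range(n_queries)]

results = (make("client_a", 100, 44)
           + make("client_b", 400, 340)
           + make("client_c", 500, 426))

overall, per_segment, failing = gate(results)
print(f"overall recall: {overall:.2f}")
for seg, r in sorted(per_segment.items()):
    print(f"  {seg}: {r:.2f}")
print("release blocked by:", failing)
```

A gate on the aggregate alone would pass this run; the per-segment check flags the one small client whose recall collapsed.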

Analysis · Latent Space

[AINews] All Model Labs are now Agent Labs

OpenAI, DeepSeek, and AI21 are reorganizing around agents this week, with DeepSeek's 75% price cut and managed sandboxes pointing at the new infra layer.

Tutorial · Product Growth

How to Run Evals in Claude Code with Aparna Dhinakaran

Install Arize skills via `npx skills add Arize-ai/arize-skills` and let Claude propose, group, and iteratively repair failed evals as a self-improving loop.

Research · Hugging Face Blog

Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models

Open-weight 3B/8B/14B diffusion LMs flip between autoregressive, diffusion, and self-speculation modes in one architecture for 6.4x faster decoding without retraining.

The Grind

Research papers, decoded

Reasoning models / evaluation · 8,707 upvotes · arxiv · X

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Apple's team stress-tests Claude 3.7 Sonnet Thinking, DeepSeek-R1, and o3-mini on four controllable puzzle environments and finds three regimes: standard LLMs win at low complexity, LRMs win at medium, and both collapse to near-zero at high complexity. More damning, models reduce reasoning tokens as they approach failure, and handing them the explicit algorithm does not improve performance.
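To make "controllable complexity" concrete: Tower of Hanoi is one of the paper's four puzzle environments, and its difficulty is set by a single knob, the number of disks. The sketch below is the textbook recursive solver, not the paper's evaluation harness; the peg names are arbitrary.

```python
def hanoi(n, src="A", aux="B", dst="C"):
    """Yield the optimal move sequence for n disks as (from_peg, to_peg) pairs."""
    if n == 0:
        return
    # Move the top n-1 disks out of the way, onto the spare peg.
    yield from hanoi(n - 1, src, dst, aux)
    # Move the largest disk to its destination.
    yield (src, dst)
    # Move the n-1 disks from the spare peg onto the largest disk.
    yield from hanoi(n - 1, aux, src, dst)

for disks in (3, 7, 10, 15):
    moves = sum(1 for _ in hanoi(disks))
    print(f"{disks} disks -> {moves} moves")
```

The optimal solution length is 2^n - 1, so each extra disk doubles the number of moves a model has to emit without a single mistake. That gives a clean complexity axis while the underlying rule stays the same few lines.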

Inference-time scaling / recursive reasoning · 133 upvotes · alphaxiv

Generative Recursive Reasoning (GRAM)

GRAM injects stochasticity into recursive reasoning so models sample multiple latent trajectories in parallel instead of one deterministic path. Trained via amortized variational inference, it hits 97.0% on Sudoku-Extreme (vs 87.4% deterministic baseline) and 99.7% on N-Queens 8x8 while covering 90.3% of valid solutions, establishing 'width' as a new inference-time scaling axis.
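The "width" idea can be illustrated without any neural network. The sketch below is not GRAM: there are no latent trajectories and no variational training. It is only a randomized backtracking solver for 8x8 N-Queens, run several times, with coverage measured against the 92 known valid solutions. The function names and the seed are our own.

```python
import random

N = 8

def safe(cols, col):
    """True if a queen in the next row at `col` attacks none of the queens in `cols`."""
    row = len(cols)
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(cols))

def all_solutions(cols=()):
    """Deterministic enumeration of every valid placement (92 for N = 8)."""
    if len(cols) == N:
        yield cols
        return
    for col in range(N):
        if safe(cols, col):
            yield from all_solutions(cols + (col,))

def sample_solution(rng, cols=()):
    """One stochastic trajectory: random column order at every row, with backtracking."""
    if len(cols) == N:
        return cols
    order = list(range(N))
    rng.shuffle(order)
    for col in order:
        if safe(cols, col):
            found = sample_solution(rng, cols + (col,))
            if found is not None:
                return found
    return None

def coverage(width, seed=0):
    """Fraction of all valid solutions reached by `width` independent samples."""
    rng = random.Random(seed)
    valid = set(all_solutions())
    found = {sample_solution(rng) for _ in range(width)}
    return len(found & valid) / len(valid)

for width in (1, 8, 64, 512):
    print(f"width {width:>3}: coverage {coverage(width):.1%}")
```

A single deterministic path can only ever return one solution. Sampling more trajectories raises coverage at the cost of extra compute, which is the trade-off the blurb describes as a scaling axis.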

Agent architecture / survey · 77 upvotes · alphaxiv

Code as Agent Harness

Reframes code as the operational substrate for agents with a three-layer taxonomy: Harness Interface (code as reasoning, action, environment model), Harness Mechanisms (planning, memory, tools, verification, self-optimization), and Scaling (multi-agent coordination via shared code artifacts). Useful design checklist for what to externalize as code vs keep in prompts.

The Mill

Builder tools ground for action

26K stars, +2.2K today

anthropics/claude-plugins-official

Anthropic-managed directory of vetted Claude Code plugins; the official answer to the security mess around third-party skill marketplaces.

18K stars, +2.4K today

colbymchenry/codegraph

Pre-indexed code knowledge graph for Claude Code, Codex, Cursor, OpenCode, and Hermes Agent that cuts tokens and tool calls by giving agents a structured map of the codebase.

20K stars, +2.3K today

Lum1104/Understand-Anything

Turns any code repo into an interactive, searchable knowledge graph — 'graphs that teach > graphs that impress.'

149K stars, +3.2K today

multica-ai/andrej-karpathy-skills

A single CLAUDE.md distilled from Karpathy's published observations on LLM coding pitfalls; viral one-file drop-in for Claude Code.

41K stars, +437 today

ChromeDevTools/chrome-devtools-mcp

Official Chrome DevTools MCP server letting coding agents drive a real browser for inspection and debugging.

420 votes · Product Hunt

TestSprite 3.0

Fleet of parallel agents that test your app in minutes; riding the agentic-QA wave.

Developer Tools / Artificial Intelligence

305 votes · Product Hunt

General Compute

AI models on an inference cloud optimized for speed; pitched as a Cerebras/Groq-style alternative for builders.

API / Software Engineering

The Counter

Voices from the AI bar today

33K views

Chip design from the bottom up – Reiner Pope

MatX CEO and ex-Google TPU architect walks from basic logic gates up through full chip architecture: rare technical depth on what goes into a competitive AI accelerator.

Dwarkesh Patel

23K views

AI Just Crossed The Line We Were Afraid Of: Continual Harness

Breakdown of Princeton's Continual Harness, an agent that self-improves during live execution by rewriting its own instructions, building tools, and storing memories.

AI Revolution

8.1K engagements

Stanford LLM-architecture lecture rec

Tonight's most-shared learning thread: one Stanford lecture teaches you more about how ChatGPT/Claude actually work than most engineers ever learn.

@Tabbu_ai

2K engagements

Open-source city-to-3D-model tool

Tool that takes a city name and outputs a 3D model with buildings and streets built from OpenStreetMap data; 100% open source.

@HowToAI_

1.8K upvotes · 147 comments

11 Claude things I wish someone had told me 12 months ago

Practitioner cheatsheet of Claude Code workflow tips: compounding tribal knowledge readers actually save.

r/ClaudeAI

1.4K upvotes · 487 comments

$300M on Anthropic tokens, zero new engineers hired — Salesforce is the clearest case study of where this is going

Heated thread treating Salesforce's token spend as the canary for AI-replaces-headcount; the 487 comments show the nerve it hit.

r/ArtificialInteligence

Roast Calendar

Your AI week, day by day

Sun 24

11:00 AM PT•San Francisco, CA

Inference Mode #2: DeepSeek-V4

3:00 PM PT•Los Altos Hills, CA

The Age of Agents: How AI x Web3 Is Reshaping Payments and Wealth

2:00 PM PT•Palo Alto, CA

The 0th World

Mon 25

5:30 PM PT•Sunnyvale, CA

How an Agent-Native Language Can Make Agents More Reliable in Production

7:00 PM PT•San Francisco, CA

90/30 Club (ML reading) #54: TPU Performance

7:00 PM PT•San Francisco, CA

Robots, AMRs & Autonomous Systems Night

Tue 26

5:00 PM PT•San Francisco, CA

Codex Community Hackathon - San Francisco #5

5:30 PM PT•San Francisco, CA

Operationalizing Agents with Google DeepMind, Snowflake & Google Research

6:00 PM PT•San Francisco, CA

Hard Problems Night for Agent Builders

Wed 27

May 27 - May 29•Hackathon

Cursor Hackathon Toronto Tech Week

12:00 PM PT•Stanford, CA

Stanford OpenLab Seminar with Guido Appenzeller (a16z AI Infra GP)

5:30 PM PT•San Francisco, CA

AI Journal Club ft. Song Bian (NVIDIA)

Thu 28

May 28 - May 30•Hackathon

HackCafe : May Edition

5:30 PM PT•San Francisco, CA

Founder's Hour @OpenAI

6:00 PM PT•Redwood City, CA

Excellence in Tech: AI Agents ft. Gabor Cselle (ex-OpenAI, Google)

Fri 29

May 29 - May 30•Hackathon

Kane CLI Hack Day

5:00 PM PT•Mountain View, CA

Gemini Meetup

12:00 PM PT•San Francisco, CA

Production AI with Metaflow Meetup at DoorDash

Sat 30

10:00 AM PT•San Francisco, CA

OpenAI Business Hackathon (w/ The AI Collective)

11:30 AM PT•San Francisco, CA

Kernel Camp Showcase

2:00 PM PT•Stanford, CA

Building & Investing in Fintech in the AI Wave: Fireside Chat with Jefferson Chen

Last Sip

Parting thoughts

The through-line of the day is the harness, not the model. Apple's reasoning-models paper says current LRMs collapse beyond a complexity cliff and spend fewer tokens as they approach it. Google answered the same problem at I/O by externalizing reasoning into a per-call Linux sandbox. Anthropic's Glasswing did it by letting Mythos churn through real codebases for a month, surfacing more bugs than humans can patch in a year. Meanwhile a 7M-param recursive model with a noise injector and a recycled Q-head beat frontier ensembles on PPBench for a tenth of a cent. The bet that's winning right now isn't a smarter model. It's an opinionated structure around an okay model that lets you compound effort. If you're picking what to invest in this week, that's probably the lens to use.
