Agentic Brew Daily
Your daily shot of what's brewing in AI
Fresh Batch
Bold Shots
Today's biggest AI stories, no chaser
Wall Street rotated hard out of Nvidia and into Intel (+25% on the week, +200% YTD), AMD (+25% on the week, +66% YTD), and Micron (+37% on the week, now past an $800B market cap) during May 4-8. The driver isn't sentiment — it's a real workload mix shift toward inference and agents, which are CPU- and memory-bound. HBM is rationed at 50-66% of customer requests through 2026, and analysts now project ASIC shipments to surpass GPU shipments by 2028.
Why it matters: For two years "buy AI" meant "buy NVDA." That proxy just fragmented into a stack — training accelerators, server CPUs, HBM, custom ASICs — and Nvidia's CUDA moat is suddenly a question instead of an answer. The May 20 Nvidia earnings print is going to be loud.
Amazon Bedrock AgentCore Payments launched May 7 with Coinbase and Stripe (via Privy), riding the x402 protocol — an open HTTP-native standard reviving the dusty HTTP 402 status code. USDC settles on Base in ~200ms at fractions of a cent. Cloudflare and Coinbase co-founded the x402 Foundation; Google Cloud is layering its Agent Payments Protocol on top; Exodus shipped XO Cash on Solana the very next day.
Why it matters: AWS chose interoperable web rails over proprietary plumbing, which is the more interesting story than the launch itself. The honest gut-check: x402 daily volume is averaging ~$28K against McKinsey's $3T-$5T agentic-commerce projection by 2030. The rails are real. The "policy brain" telling agents what they can actually buy is the missing piece.
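For readers who haven't seen x402 in action, here is a schematic sketch of the request-pay-retry loop the protocol revives. Everything here is illustrative, not the actual x402 spec: the field names, the stub server, the wallet signature, and the header handling are invented stand-ins for the real 402 payload and signed payment attachment.

```python
# Schematic x402-style flow (all names/fields are illustrative, not the spec):
# the server answers HTTP 402 with its payment requirements; the client
# attaches a signed payment and retries the same request.

def fake_server(request):
    """Stub merchant endpoint: demands a tiny USDC payment on Base."""
    if "X-PAYMENT" not in request.get("headers", {}):
        return {"status": 402,  # Payment Required
                "body": {"network": "base", "asset": "USDC",
                         "maxAmountRequired": "0.001", "payTo": "0xMERCHANT"}}
    return {"status": 200, "body": {"data": "premium content"}}

def sign_payment(requirements, wallet):
    # Stand-in for a real wallet signature over the payment requirements.
    return f"{wallet}->{requirements['payTo']}:{requirements['maxAmountRequired']}"

def fetch_with_payment(request, wallet="0xAGENT"):
    resp = fake_server(request)
    if resp["status"] == 402:
        payment = sign_payment(resp["body"], wallet)
        retry = {**request,
                 "headers": {**request.get("headers", {}), "X-PAYMENT": payment}}
        resp = fake_server(retry)  # same request, now with payment attached
    return resp

result = fetch_with_payment({"path": "/report", "headers": {}})
print(result["status"])  # 200 once payment is attached
```

The point of the pattern: payment becomes a transparent retry inside an ordinary HTTP exchange, which is exactly why an agent can do it without a checkout page.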
Nvidia has committed more than $40B to equity stakes in AI companies in early 2026 — $30B to OpenAI, $2.1B to IREN via a 30M-share option at $70, plus billions more to Corning, CoreWeave, Nebius, Marvell, Lumentum, and Coherent. The IREN deal also bundles a 5-year $3.4B managed GPU cloud contract with a 5GW deployment target anchored at a 2GW Sweetwater, Texas campus.
Why it matters: Nvidia is operating as customer, supplier, and capital source across the entire AI stack — and Senators Warren and Blumenthal have already flagged the $20B Groq deal as a possible "reverse acquihire." Jim Chanos's line landed: "Don't you think it's a bit odd that when the narrative is 'demand for compute is infinite,' the sellers keep subsidizing the buyers?"
Cloudflare cut 1,100+ employees (~20% of the workforce) on May 7-8 — the first mass layoff in its 16-year history — while posting Q1 2026 revenue of $639.8M, up 34% YoY, with 73% growth in $1M+ deals. CEO Matthew Prince justified the move with hard numbers: internal AI usage grew 600% in three months and 100% of new code is now reviewed by autonomous AI agents. The market wasn't impressed — NET dropped 24% in a single session.
Why it matters: This breaks the convention that profitable, fast-growing software companies don't lay off, and Prince's quantitative bridge gives every other CEO a ready-made script. The 24% drawdown is the market saying "we don't believe AI substitution is bullish for top-line growth." That's a much bigger conversation than 1,100 jobs.
The Musk v. Altman trial is underway in Oakland (case 4:24-cv-04722) before Judge Yvonne Gonzalez Rogers. Only 2 of Musk's original 26 claims survived — breach of charitable trust and unjust enrichment, both equitable, meaning the judge (not the jury) decides remedies. Greg Brockman's personal journal — a Nov 2017 entry musing he was "warm to steal the nonprofit from [Musk] to convert to b corp without him" — has become central evidence. Shivon Zilis testified Musk himself once tried to recruit Altman to lead a Tesla AI lab.
Why it matters: OpenAI is racing toward an IPO at a working ~$1T valuation. Even a courtroom loss could be a strategic win for Musk because it spotlights questions about charitable-origin assets at the worst possible moment. Reddit alone surfaced ~11.6K engagements on this — highest non-YouTube social signal of the day.
The Blend
Connecting the dots across sources
Agents stopped being tools this week and started being employees and spenders
- Across the news today, AWS gave agents USDC wallets via AgentCore Payments while Cloudflare laid off 1,100 humans citing 600% internal AI usage growth — same week, opposite ends of the same agent-as-economic-actor thesis.
- On GitHub, the breakout repo of the day was anthropics/financial-services (+3,077 stars same-day), pairing perfectly with Anthropic's launch of ten purpose-built financial-services agents — agents getting job titles, not just APIs.
- On X, the 'Agentic Finance Goes Live' post drew ~53K likes and a Coinbase memo cited 'market conditions and AI' in the same paragraph as a 14% layoff, confirming the framing has crossed from analyst desks into manager talking points.
- In the research, the Princeton/UW 'Ads in AI Chatbots?' paper (33K X votes) shows what happens when economic-actor agents get sponsorship incentives — 18 of 23 models pushed the sponsored option more than half the time.
The compute squeeze isn't a chip story — it's a power, memory, and protocol story now
- Across the news today, the Nvidia/IREN 5GW deal and the Apple-Intel manufacturing pact reframed the bottleneck from silicon to whatever can deliver electricity and packaged HBM at scale, with TSMC 'already printing wafers as fast as they can.'
- On X, Musk's post forecasting America's AI power buildout through 2030 hit ~17K likes, while a YouTube breakdown of Google's dual TPU 8t/8i lineup racked up engagement on the same supply-side anxiety.
- In the blog coverage, Chamath flagged a 300MW SpaceX-Anthropic compute partnership and Towards AI walked through Google's TurboQuant cutting KV cache 6x — efficiency research is now load-bearing because the hardware can't keep up.
- In the research, papers on On-Policy Distillation, Prescriptive Scaling Laws, and EMO mixture-of-experts all attack the same constraint from the software side.
The coding-agent stack is consolidating around skills, routers, and benchmarks — not models
- On GitHub, addyosmani/agent-skills (+2,801 stars today) and decolua/9router (+980 stars, routing 40+ providers behind Claude Code/Codex/Cursor) show builders are optimizing the layer above the model, not the model itself.
- On Product Hunt, Monid 2.0 ('OpenRouter for agent tools') landed at #2 — same idea on a second platform within 24 hours.
- In the blog coverage, ByteByteGo's 'Claude Code vs OpenClaw: 5 Design Dimensions' and Simon Willison's 'Unreasonable Effectiveness of HTML' are doing the comparison and prompting work that the model providers themselves have stopped doing.
- At this week's events, Beats & Build SF and the DevNetwork AI+ML hackathon are both hands-on agent-build venues that assume the model layer is solved and the orchestration layer is the new frontier.
Slow Drip
Blog reads worth savoring
A clean architectural breakdown of the two leading agentic coding tools across five concrete design axes — exactly the comparison every team is doing in private.
Asking Claude for HTML output (with SVG diagrams and interactive widgets) beats Markdown by a wide margin. Tiny prompting shift, outsized payoff.
A ready-to-steal Team OS template for product teams getting serious about Claude Code, from a respected PM-growth voice.
A pragmatic walkthrough of killing the 30-40% duplicate-query tax on enterprise LLM agents, with a concrete banking example you can map onto your stack.
Chamath flags a 300MW SpaceX-Anthropic compute deal — the AI infrastructure arms race is now fusing with space-grade power footprints.
Possibly the strangest dichotomy in tech right now, captured in one headline.
Plain-English breakdown of Google's ICLR 2026 paper that shrinks LLM KV cache 6x with zero accuracy loss — directly translatable to infra savings today.
NVIDIA halved Top-K time by exploiting an autoregressive decoding quirk — required reading for anyone optimizing inference kernels.
The Grind
Research papers, decoded
The team tested 23 LLMs in scenarios where sponsored ads conflicted with user interests. 18 of 23 recommended the more expensive sponsored option more than 50% of the time, with some hitting 83%. 65% of responses concealed sponsorship status; models pushed sponsored items 15.5 percentage points more often to high-income users; safety guardrails collapsed for vulnerable users (predatory loans got recommended). As ad-supported chatbot tiers come online, this is a flashing red light — and the FTC framing implies real regulatory exposure.
RecursiveMAS replaces slow text-based agent communication with continuous latent 'thought' vectors passed through lightweight RecursiveLink modules — letting a multi-agent system function as one unified, end-to-end-trainable computation. Reported gains: up to 20.2% accuracy at recursion depth 3, 1.2x-2.4x faster inference, and 34.6%-75.6% token reduction vs text-based recursive baselines. Crucially, base LLMs don't need retraining — only the bridging modules do.
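The core idea is easier to see in miniature. Below is a toy illustration of that architecture, with invented dimensions and stand-in modules: frozen "agents" exchange continuous latent vectors through small trainable bridge projections (playing the role of the paper's RecursiveLink modules) instead of serializing text between turns.

```python
import numpy as np

# Toy sketch of latent message passing between frozen agents. Shapes,
# weights, and the "agent" itself are invented for illustration only.

rng = np.random.default_rng(0)
D = 16  # hidden size shared by the toy agents

def frozen_agent(hidden):
    """Stand-in for a frozen LLM: a fixed nonlinear map over a hidden state."""
    W = np.eye(D)  # frozen weights; never updated during training
    return np.tanh(W @ hidden)

class RecursiveLink:
    """Trainable bridge projecting one agent's latent into the next's space."""
    def __init__(self):
        self.W = rng.normal(scale=0.1, size=(D, D))  # only these weights train

    def __call__(self, latent):
        return self.W @ latent

def recurse(x, links, depth=3):
    """A continuous 'thought' vector flows bridge -> agent at each depth."""
    h = x
    for link in links[:depth]:
        h = frozen_agent(link(h))
    return h

links = [RecursiveLink() for _ in range(3)]
out = recurse(rng.normal(size=D), links, depth=3)
print(out.shape)
```

The token savings the paper reports fall out of this structure: nothing is ever decoded to text between agents, so the only trainable (and transmitted) object is the small bridge.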
StraTA forces LLM agents to first commit to a natural-language strategy for the entire task, then condition every action on that plan, with hierarchical training that scores strategy and execution separately. The result: 93.1% on ALFWorld, 84.2% on WebShop, 63.5% on SciWorld using only 1.5B-7B models — beating much larger frontier closed-source baselines. Code is open.
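The commit-then-condition pattern itself is simple enough to sketch. This is a minimal toy version of the two-stage loop, with the task, prompts, and scoring rules all invented for illustration: one strategy is fixed up front, every action conditions on it, and strategy and execution are scored separately.

```python
# Toy sketch of a strategy-then-act agent loop (task, policy, and scoring
# are invented; this only illustrates the two-stage structure).

def propose_strategy(task):
    """Stage 1: commit to one natural-language strategy for the whole task."""
    return f"To '{task}': locate the object, then use it at the target."

def act(strategy, observation):
    """Stage 2: actions condition on (strategy, observation), not just obs."""
    if "kitchen" in observation:
        return "search counter"
    return "go to kitchen"

def run_episode(task, observations):
    strategy = propose_strategy(task)  # fixed for the entire episode
    actions = [act(strategy, o) for o in observations]
    # Hierarchical scoring: plan quality and execution judged separately.
    strategy_score = 1.0 if "locate" in strategy else 0.0
    exec_score = sum(a == "search counter" for a in actions) / len(actions)
    return strategy, actions, (strategy_score, exec_score)

_, acts, scores = run_episode("heat the mug", ["hallway", "kitchen"])
print(acts, scores)
```

Separating the two scores is the interesting design choice: a small model can be trained to write good plans even on episodes where its low-level execution stumbled, and vice versa.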
On Tap
What's trending in the builder community
Anthropic's vertical playbook for deploying Claude in finance just exploded to 16,942 stars (+3,077 today). Already being treated as the template for industry-specific agent deployments.
Addy Osmani's curated 'production-grade engineering skills for AI coding agents' library is becoming the de-facto pack as Claude Skills go mainstream — 37,111 stars, +2,801 today.
Routes Claude Code, Codex, Cursor, Cline, Copilot, and Antigravity through 40+ free AI providers with auto-fallback and a claimed 40% token reduction. 'Unlimited free AI coding' was apparently the magic phrase.
ByteDance's open multimodal agent stack, picking up steam as the GUI-agent category heats up.
Persistent memory layer for AI coding agents with public benchmarks, riding the agent-memory wave (+518 today).
'AI SEO Blog driven by deep competitor intelligence' — today's #1 on Product Hunt.
'OpenRouter for agent tools' — a meta-marketplace for agent tooling that mirrors the 9router GitHub trend perfectly.
'Find gaps in your AI agents before users do.' Agent eval/red-teaming, exactly what you'd build right after AWS hands every agent a wallet.
AI Revolution channel · 12,718 views. Technical breakdown of OpenAI's new real-time voice models and the MRC networking protocol.
Nate B Jones · 2,593 views. The framework distinguishing prompts, skills, plugins, MCPs, hooks, and scripts — the through-line of today's whole skills meta-trend.
Evolving AI · 18,382 views. The strategic split between training (TPU 8t) and inference (TPU 8i) silicon.
Headline tweet (~17K likes) on the AI infrastructure squeeze — every cluster needed power the day chips arrived.
Massive RoundtableSpace post (~53K likes) on Claude-plus-research-agents pipelines acting on trading opportunities in real time.
Discovery/install bootstrap for the open agent-skills ecosystem — 1.4M installs.
Anthropic's 'distinctive, production-grade frontend interfaces that reject generic AI aesthetics' — 386.4K installs.
Roast Calendar
Upcoming events & gatherings
Last Sip
Parting thoughts & a teaser for tomorrow
If you only take one thing from today's batch: the model layer is no longer where the action is. The action is in the layer above — agents that hold wallets, that get fired and rehired as templates, that need policy brains and routers and skill libraries to be useful. The chip-stock fragmentation is the same story told on a different timescale.
Tomorrow we'll be watching the May 20 Nvidia print rumblings, what falls out of Musk v. Altman week three (the trial keeps surprising even the lawyers), and whether anyone besides Cloudflare is brave enough to put real layoff numbers behind the 'agentic AI first' pitch. Until then — stay curious, watch your token bill.