Agentic Brew Daily
Your daily shot of what's brewing in AI
Fresh Batch
Bold Shots
Today's biggest AI stories, no chaser
Cerebras priced its Nasdaq debut at $185 a share, above the $150-$160 marketed range, and sold 30M Class A shares for $5.55B. The ticker CBRS started trading May 14 at a fully diluted valuation of about $56.4B, more than double the $26.6B implied just five weeks earlier. Demand exceeded the shares on offer by more than 20x, and the deal is the largest US tech IPO since Snowflake's $3.8B debut in 2020. The whole pitch is anchored on a $20B+ master relationship agreement with OpenAI for 750 MW of inference capacity, scaling toward 2 GW by 2030.
Why it matters: This is the opening salvo of an AI IPO wave (OpenAI and SpaceX are rumored next). If CBRS trades cleanly, the cost of capital drops across AI infra; if it breaks issue, the pipeline thins out fast. The wafer-scale chip story is a real shot at "end of GPU homogeneity" in inference — but P/S north of 110x means the market is paying tomorrow's price today.
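For readers who want to sanity-check the headline numbers, here is a minimal back-of-envelope sketch. Every input comes from the figures quoted above; the variable names are ours, and the last line assumes the "north of 110x" P/S multiple is measured against the fully diluted valuation, which the reporting does not spell out.

```python
# Back-of-envelope check of the Cerebras IPO figures quoted above.
PRICE_PER_SHARE = 185            # USD, IPO price
SHARES_SOLD = 30_000_000         # Class A shares sold
FD_VALUATION = 56.4e9            # USD, fully diluted valuation at pricing
PRIOR_VALUATION = 26.6e9         # USD, implied five weeks earlier
PS_MULTIPLE_FLOOR = 110          # "P/S north of 110x"

# Gross proceeds: shares sold times price.
gross_proceeds = PRICE_PER_SHARE * SHARES_SOLD

# Step-up versus the valuation implied five weeks earlier.
step_up = FD_VALUATION / PRIOR_VALUATION

# Fully diluted share count implied by the valuation and the IPO price.
implied_fd_shares = FD_VALUATION / PRICE_PER_SHARE

# ASSUMPTION: the P/S multiple uses the fully diluted valuation.
# If so, a multiple above 110x caps the sales figure it was based on.
implied_sales_ceiling = FD_VALUATION / PS_MULTIPLE_FLOOR

print(f"gross proceeds: ${gross_proceeds / 1e9:.2f}B")
print(f"step-up: {step_up:.2f}x")
print(f"implied fully diluted shares: {implied_fd_shares / 1e6:.0f}M")
print(f"implied sales ceiling: ${implied_sales_ceiling / 1e6:.0f}M")
```

The first two lines simply confirm the quoted $5.55B raise and the "more than double" step-up; the last two are derived figures, not reported ones.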
Ramp's April 2026 AI Index put Anthropic at 34.4% of paying businesses versus OpenAI's 32.3% — the first crossover ever. Anthropic gained 3.8 points month-over-month while OpenAI lost 2.9, and Anthropic has quadrupled in a year while OpenAI added 0.3 points. OpenAI's chief application officer Fidji Simo told staff the company is in "code red" mode. Claude Code is the wedge — Anthropic's fastest-growing product ever — and a freshly expanded PwC alliance will certify 30,000 professionals on Claude.
Why it matters: This is a year-long slope, not a one-month blip. Claude Code is the Trojan horse into finance (Citadel, BNY, FIS, Mizuho), legal (Freshfields, Quinn Emanuel, Holland & Knight), and consulting (PwC). The enterprise AI competitive map has been quietly re-rated.
The Musk-vs-Altman trial finished its liability phase in Oakland after 11 days. Musk is seeking $134B in damages, the removal of Sam Altman and Greg Brockman, and an unwind of OpenAI's for-profit conversion. The witness list reads like an AI history podcast — Altman, Brockman, Sutskever, Murati, Helen Toner, Shivon Zilis, Joshua Achiam, and Satya Nadella. Sutskever, Murati, and Toner all testified about a "consistent pattern of lying" memo that preceded Altman's brief 2023 ouster. Judge Yvonne Gonzalez Rogers retains discretion to overturn the jury verdict.
Why it matters: If the charitable-trust framing sticks, every mission-driven nonprofit-turned-for-profit suddenly has legal exposure. Nadella's testimony also confirmed Microsoft is rewiring its AI dependence toward multi-vendor independence — Azure has been hosting xAI since 2024. Sworn allegations of dishonesty are an obvious overhang on any near-term OpenAI IPO.
Cisco reported Q3 FY2026 revenue of $15.84B (+12% YoY) and non-GAAP EPS of $1.06. The headline: it doubled its full-year AI infrastructure order forecast from $5B to $9B, with $5.3B already booked YTD and networking product orders up over 50% YoY. The company is cutting fewer than 4,000 jobs (<5% of workforce) with up to $1B in pre-tax restructuring charges, and the stock surged ~14% — its best single-day move in more than 20 years.
Why it matters: Investors rewarded the layoffs because the math told a coherent silicon-thesis story: every freed-up dollar is being redeployed into the bucket whose order book just doubled. Cisco's Silicon One inside Nvidia Spectrum-X is the wedge into hyperscale AI Ethernet long held by Broadcom and Arista. As CEO Chuck Robbins put it: "If you don't have silicon you're going to struggle to be relevant to the hyperscalers."
Notion launched its Developer Platform on May 13: Workers (a hosted runtime on Vercel Sandbox for deploying custom code with zero server provisioning), Database Sync in beta, an External Agents API in alpha that lets Claude Code, Cursor, Codex, and Decagon operate inside Notion as native workspace participants, and a new CLI called ntn. Workers are free during public beta, with credit-based billing kicking in August 11. Customers have already built more than a million agents in the three months since Custom Agents shipped in February.
Why it matters: This reframes the workspace as the AI agent orchestration layer and collapses the Zapier/Pipedream middleware tax into one sandbox deploy. It also lowers switching costs across coding agents while raising the stakes for glue vendors. If Notion executes, it goes head-to-head with Microsoft Power Platform.
Meta launched Incognito Chat for WhatsApp and the standalone Meta AI app on May 13, calling it "the first major AI product where there is no log of conversations stored on servers." Inference runs inside a Trusted Execution Environment built on AMD SEV-SNP confidential VMs and NVIDIA H100s in confidential computing mode, fronted by Oblivious HTTP relays with remote attestation. Conversations vanish when the app closes or the phone locks. Text-only at launch, with voice, image, and a branching "Side Chat" feature signaled for later.
Why it matters: This shifts AI privacy from retention policy to hardware guarantee — a structural threat to OpenAI, Gemini, and Claude's policy-based "temporary chat." It also lands during OpenAI lawsuits where preserved logs are being used as evidence. The cheapest log to defend is the one that never existed.
The Blend
Connecting the dots across sources
The coding agent is now the wedge for every other enterprise AI decision
- Anthropic's first-ever crossover past OpenAI in paid business share lines up exactly with Claude Code's rise as the company's fastest-growing product, and walks straight into finance, legal, and consulting wins like the PwC deal to certify 30,000 professionals.
- OpenAI's response on the same day — Codex shipping to the ChatGPT mobile app, Hooks for programmatic customization, and two months free for switchers — reads like a defensive playbook against exactly this dynamic.
- On GitHub, the day's trending lists are dominated by Claude Code skill repos, and on Product Hunt the second-place launch is literally an observability tool for Claude Code token spend.
- Princeton's τ-bench research, going viral on X with 34K+ votes, shows even GPT-4o passes only about 61% of retail tasks on a single try and drops below 25% when it has to succeed on all eight attempts, a sobering counter-narrative to all this coding-agent enthusiasm.
Inference economics, not model quality, are the binding constraint now
- Cerebras priced 2026's largest tech IPO almost entirely on a single $20B+ OpenAI contract for 750 MW of inference capacity, with investors paying around 110x sales for the wafer-scale bet.
- Cisco doubled its AI infrastructure order forecast to $9B and saw its biggest single-day stock move in more than 20 years, with the CEO openly saying companies without silicon are about to be irrelevant to hyperscalers.
- Anthropic's quiet move to meter programmatic Claude usage and 3x its image prompt pricing — the move developers are loudly complaining about — is the same token-economics squeeze expressed from the model-vendor side.
- A trending Semianalysis deep dive on Cerebras's WSE-3 architecture and a Google Cloud paper on proxy models that cut LLM-powered SQL cost 100x are both reading the same room: speed and cost-per-token are now the products.
AI trust is moving from policy promises to hardware attestation
- Meta's Incognito Chat on WhatsApp claims zero server-side logs by running inference inside AMD confidential VMs and H100s in confidential computing mode — a hardware guarantee, not a privacy policy.
- The launch lands squarely against the Musk vs. Altman backdrop, where preserved ChatGPT logs are being used as courtroom evidence and Microsoft's CEO testified about needing real agency at every layer of the stack.
- An Indie Hackers post about red-teaming tool PromptBrake going on-prem is the SMB version of the same story: keep prompts inside customer infrastructure.
- Princeton's τ-bench attributes 25% of agent failures to policy-violating decisions, empirical evidence that policy-based trust is no longer enough.
Slow Drip
Blog reads worth savoring
A meticulously sourced deep dive into Cerebras's $24.6B OpenAI deal, WSE-3 architecture tradeoffs, and the economics of speed-optimized inference. Essential if you want to actually understand what investors paid for today.
A sharp a16z thesis on how reasoning layers are eating CRMs and reshaping where enterprise software value accrues. Required framing if you're building or buying agentic tools.
A practical shortlist of open-weight SLMs (SmolLM3, Qwen3-4B, Phi-3-mini, Gemma-4, Mistral-7B) that actually do structured tool calls — perfect for engineers shipping agents on edge or budget hardware.
A real production blueprint — three-tier memory, debounce batching, confidence scoring, crash-safe writes — for builders tired of agents that forget everything between sessions.
The clearest single read on Anthropic's metered-credit 'rug pull,' OpenAI's two-months-free Codex push, and why developer loyalty is suddenly up for grabs.
Google's SIGMOD paper on proxy models cuts LLM-powered SQL cost and latency by 100x, and the technique is already live in BigQuery and AlloyDB. A glimpse at how semantic analytics actually becomes affordable.
A CVPR 2026 Highlight showing that an 8B model with expert-curated cinematic captions beats GPT-5 at video generation control — a vivid case study in data quality beating scale.
The Grind
Research papers, decoded
τ-bench stress-tests agents in multi-turn conversations with simulated users, real APIs, and strict policy documents across retail and airline domains. The kicker: even GPT-4o passes only about 61% of retail tasks on a single try, and drops below 25% when you require success across 8 independent attempts (a new 'pass^k' reliability metric). Failures are dominated by wrong API arguments (33%) and policy-violating decisions (25%) — exactly the things single-run benchmarks hide.
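To make the pass^k idea concrete, here is a minimal sketch of how such a metric can be estimated, assuming each task is run n independent times and c of those runs succeed. Our reading of the paper is that pass^k is the chance that k sampled trials all succeed, estimated per task as C(c, k) / C(n, k) and then averaged over tasks; the function name and the toy data below are our own, not the authors' code.

```python
from math import comb

def pass_hat_k(successes_per_task, n_trials, k):
    """Estimate pass^k: the chance that k sampled trials ALL succeed.

    successes_per_task: for each task, how many of n_trials runs succeeded.
    Per-task estimate is C(c, k) / C(n_trials, k), averaged over tasks.
    """
    if not 1 <= k <= n_trials:
        raise ValueError("k must be between 1 and n_trials")
    if not successes_per_task:
        raise ValueError("need at least one task")
    denom = comb(n_trials, k)
    # comb(c, k) is 0 when c < k, so tasks with too few successes add nothing.
    return sum(comb(c, k) for c in successes_per_task) / (denom * len(successes_per_task))

# Toy data (hypothetical): four tasks, eight runs each.
toy_successes = [8, 4, 0, 8]

single_try = pass_hat_k(toy_successes, n_trials=8, k=1)   # average success rate
all_eight = pass_hat_k(toy_successes, n_trials=8, k=8)    # only always-solved tasks count

print(single_try, all_eight)
```

With k=1 the metric is just the average success rate; as k grows, any task the agent solves only some of the time drags the score down, which is why a headline single-try number can hide a much weaker reliability figure.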
ELF brings continuous-space diffusion to language modeling by running Flow Matching almost entirely in embedding space and only snapping to discrete tokens at the very last step. Competitive quality with just 45B training tokens vs 500B+ for comparable models, classifier-free guidance out of the box, and beats leading discrete diffusion models (MDLM, Duo) with fewer sampling steps. A credible path to faster, more controllable non-autoregressive text generation.
FeatCal tackles model merging's known pain — the merged checkpoint trailing the individual task experts — by tracing the gap to 'feature drift' and fixing it with a tiny calibration set applied layer-by-layer via a closed-form (no gradient descent) update. Beats Surgery and ProbSurgery on CLIP-ViT-B/32 (85.5% vs 77-78%) and FLAN-T5-base GLUE, runs about 4x faster, and reaches strong accuracy with as few as 8 examples per task.
On Tap
What's trending in the builder community
A privacy-first local AI super-assistant pitched as 'your personal AI super intelligence'. It is written in Rust and gained 3,476 stars today.
Matt Pocock open-sourced his entire .claude directory of skills and engineers are eating it up — 2,971 new stars today.
'#1 Persistent memory for AI coding agents based on real-world benchmarks' — the agent memory problem keeps generating top-tier tools.
An agentic skills framework and software development methodology that's quietly racked up 190K+ stars total.
Turns commodity WiFi signals into real-time spatial intelligence and vital-sign monitoring — no camera required.
An AI wearable that remembers your conversations all day — memory keeps showing up as the universal pain point.
See where Claude Code burns tokens and hit your limits less — Anthropic's new metering already has a thriving observability layer.
Grow your own software that's 'alive' — UI that mutates and reshapes itself as you use it.
AI agents autonomously fine-tuning vision-language models with full-weight access, inference routing, benchmark filtering, MCPs, and a live training demo.
MAMMAL, a foundation biology model that outperforms AlphaFold 3 on toxicity, antibody design, and cancer drug development.
The bandwidth, networking, optics, and power thesis behind today's Cisco pop, explained in 20 minutes.
OpenAI's announcement that the Codex coding agent is now in preview inside the ChatGPT mobile app, with desktop/devbox continuity.
An agentic CLI for coding, building apps, and automating workflows is now available for SuperGrok Heavy subscribers at x.ai/cli.
Captures learnings, errors, and corrections to enable continuous improvement — the most-installed skill on Clawhub today.
Security-first skill vetting for AI agents — the supply-chain layer for skills is starting to form.
Roast Calendar
Upcoming events & gatherings
Freestyle.sh-hosted dev night on wrangling AI-generated code in production with Ben Werner and James Tan — useful if you ship Claude or Codex output to real users.
SupportVectors' applied-AI reading group on closing the requirements-to-code loop — strong companion to today's τ-bench discussion.
Eight piloted humanoid fight matches plus robot dance battles at Temple Nightclub — pure spectacle, also a recruiting moment for Nebius.
Intimate AI research meetup with Sophia Han and Calvin Chen, hosted by GC-backed Proximal — good for serious researchers.
Frontier Tower's Human+Tech Week session with Chelsea Borruano on the human side of strategy work — useful counterweight if your week was all silicon and IPOs.
Multi-day forum on the Valley's economic and tech future hosted by AW3 Technology — 195+ interested attendees and the right audience for today's macro AI threads.
Chief of Staff Network's flagship SF gathering for operators and CoSes at fast-moving AI companies — solid networking if you operate behind a high-velocity founder.
Last Sip
Parting thoughts & a teaser for tomorrow
If today felt like the day enterprise AI's pecking order got reshuffled, that's because it kind of was. Anthropic is in the lead on paid business share, OpenAI is in defensive mode with mobile Codex and discount switching, Cerebras just printed the biggest tech IPO since Snowflake, Cisco is making its silicon thesis stick, Notion put the agent orchestration layer inside its workspace, and Meta moved AI privacy from a policy promise to a hardware guarantee. Underneath all of it, τ-bench keeps quietly insisting that the agents driving all this enthusiasm still fail roughly four in ten real tasks on the first try, and far more once you ask them to succeed on every attempt.
Tomorrow we'll watch CBRS's first full trading day for the real verdict on the IPO appetite, see whether Anthropic's metered programmatic pricing turns the developer grumbles into a measurable migration, and check whether Notion's External Agents API starts showing up in third-party benchmarks. See you in the next brew.