Jun 9, 2026

Agentic Brew Daily

Your daily shot of what's brewing in AI

Fresh Batch

Distilled trend

OpenAI and Anthropic are racing toward IPOs at near-trillion-dollar valuations the same week a benchmark built with 250+ experts showed agents passing just 2.6% of its hardest real-world economic tasks.
Compute is fanning out on several fronts at once: SpaceX's orbital AI1 satellite, NVIDIA's South Korea buildout, and Google's 3M-TPU order with Intel loosen the single-datacenter chokehold.
Verification, not raw capability, is the agent bottleneck builders are funding now: developer threads about unverifiable agent code and SF's run of agent-security and formal-methods meetups point the same direction.

Bold Shots

Today's biggest AI stories, no chaser

OpenAI submitted a confidential draft S-1 to the SEC on Monday, June 8, its first formal step toward a public offering. The filing lands one week after Anthropic's own confidential filing at a roughly $965B valuation, higher than OpenAI's last private mark of about $852B. Once the S-1 becomes public, OpenAI will have to disclose the financials behind capital it has so far raised on narrative alone, and under FASB ASC Topic 730 frontier-training costs must be expensed as incurred, reframing the spend from "building infrastructure" to "burning cash now." The numbers: ~$9B net loss on $13.1B revenue (2025), a projected ~$14B loss in 2026, no profitability forecast before ~2030, and a reported ~$207B funding gap by 2030. Filing second and at a lower valuation than Anthropic inverts the pecking order and turns a research rivalry into a Wall Street footrace.

Why it matters: Public paperwork forces the first real look at OpenAI's economics, and it drops OpenAI into a direct, lower-valuation footrace with Anthropic for the same investor dollars.

Breaking: OpenAI filed for an IPO, setting it up to potentially go public as soon as this fall. Exclusive | OpenAI Kicks Off IPO Process in Test of Investor Appetite for Top AI Labs

@WSJ·100 engagements

BREAKING: WSJ reports OpenAI just made its first formal move toward IPO. It has confidentially filed draft paperwork for an IPO. A confidential S-1 lets OpenAI start SEC review without immediately exposing financials.

@rohanpaul_ai

NVIDIA AI infrastructure partnerships across South Korea

During Jensen Huang's Seoul visit, NVIDIA announced AI-infrastructure and physical-AI partnerships spanning NAVER, SK Group, LG, Doosan, and Hyundai, plus a national GPU-operator selection. NAVER is building a full-stack AI factory on DSX, expanding GAK Sejong to 55MW toward gigawatt scale; SK signed a multi-year next-gen-memory deal across four NVIDIA platforms; LG is building humanoid robots on Isaac GR00T. The deals form a coordinated national division of labor, from memory (SK) to compute and models (NAVER) to robots and power (LG, Doosan, Hyundai). Because SK hynix and Samsung dominate HBM, the lock-in runs both ways: Korea needs the GPUs, NVIDIA needs the HBM. Energy and thermal capacity, not GPU counts, emerge as the real bottleneck.

Why it matters: It turns a whole country into a vertically integrated AI supply chain and makes the NVIDIA-Korea tie mutual lock-in, with power and cooling, not GPU counts, as the binding constraint.

OpenAI's ChatGPT 'superapp' overhaul ahead of IPO

OpenAI is planning its largest ChatGPT overhaul since launch — a unified superapp merging Codex, AI agents, image generation, the Atlas browser, and third-party partner apps, rolling out first on web and mobile in the coming weeks. A senior employee told the Financial Times that "chat is dead," signaling a pivot from Q&A toward autonomous multi-step agents. The redesign reads as reverse-engineered from IPO economics: ~2M business customers already contribute ~40% of revenue (projected ~50% by year-end), and Codex weekly active users grew ~6x to 5M+. OpenAI is becoming an enterprise software company that happens to own the world's largest consumer AI app.

Why it matters: If chat really is dead, OpenAI is repositioning as enterprise agent software right before an IPO, where recurring business revenue is worth far more than consumer chat traffic.

NotebookLM upgraded to Gemini 3.5 with agentic research

Google upgraded NotebookLM to run on Gemini 3.5 and Antigravity, giving each notebook a secure cloud computer that can write and run code for agentic, multi-step research. Users can start from loose ideas and let NotebookLM use Google Search to discover and add high-quality sources instead of uploading documents first, then generate outputs in PDF, DOCX, Markdown, XLSX, PPTX, charts, images, and CSV/JSON. It rolls out globally on web starting June 8 to AI Ultra subscribers and Workspace business customers. The upgrade repositions NotebookLM from a passive document reader into a unified research workspace — though critics caution that web-discovery results remain hit or miss.

Why it matters: Google is pushing agentic research, including code execution and live source discovery, straight into everyday Workspace workflows, raising the bar for what a research tool is expected to do.

SpaceX orbital AI compute and the AI1 satellite

SpaceX unveiled AI1, its first-generation AI satellite — a 150kW peak (120kW average) compute payload with a 70-meter deployed wingspan, roughly the compute of one GB300-class server rack in orbit. A May 20 S-1 recast SpaceX as a vertically integrated AI infrastructure company trading under ticker SPCX, targeting orbital compute as early as 2028; it already rents terrestrial GPU capacity at scale (~$1.25B/month from Anthropic, ~$920M/month from Google). Musk announced TeraFab, a ~$20B fab for space-hardened D3 chips, and filed with the FCC for up to one million orbital data-center satellites. Skeptics question why a flop should be worth more 250 miles up; Musk dismisses the heat-rejection worry as "a bizarre debate about radiators in space."

Why it matters: SpaceX is staking a record-setting IPO on owning the entire AI stack end to end, from chip fab to orbit, a thesis skeptics argue the physics and economics do not yet support.

SpaceX CEO Elon Musk unveiled a more detailed look at an initial version of an AI data center satellite that the company plans to build, providing more insight into the project driving SpaceX's highly anticipated IPO.

@business·26 engagements

Slow Drip

Blog reads worth savoring

Analysis · Alibaba Cloud · Tokenmaxxing Dilemma: Are There Immediate Solutions for Improvement?

Shows that input (not output) tokens dominate agent costs and that ontology-based knowledge graphs cut token usage ~90% across 31 repos, a concrete lever for slashing agent bills.
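One plausible reason input tokens dominate is that an agent loop typically re-sends its whole history on every step, so billed input grows roughly quadratically while output grows linearly. The sketch below counts tokens under that assumption. The function name and every number in it are illustrative, not figures from the Alibaba Cloud post.

```python
def agent_token_totals(steps, system_tokens, tokens_added_per_step, output_tokens_per_step):
    """Return (total_input, total_output) tokens for an agent loop.

    Assumption: every step re-sends the full accumulated context as input.
    """
    total_input = 0
    total_output = 0
    context = system_tokens  # prompt and tool definitions sent on the first step
    for _ in range(steps):
        total_input += context                  # the whole context is billed as input again
        total_output += output_tokens_per_step
        # The model's output plus new tool results join the context for the next step.
        context += output_tokens_per_step + tokens_added_per_step
    return total_input, total_output


# Hypothetical 20-step run: 2,000-token prompt, 1,500 tokens of tool results
# and 300 tokens of model output per step.
total_in, total_out = agent_token_totals(20, 2000, 1500, 300)
print(total_in, total_out)  # 382000 6000
```

In this made-up run input outweighs output by more than 60 to 1, which is why trimming what goes into the context is the bigger lever than shortening what the model writes.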

Tutorial · KDnuggets · Anthropic's Complete Guide to Claude Skills Building

Walks through the exact Skill file structure, naming rules, and reliability patterns so you can ship a working, distributable Claude Skill end-to-end.

Research · Hugging Face · The crash that vanished: control and emergence in a five-model economy

Demonstrates that emergent multi-agent behavior is fragile across model populations and that reliable outcomes require authoring deterministic events at post-decision seams, a hard-won lesson for anyone designing agent simulations.

News · simonwillison.net · datasette-agent-edit 0.1a0

Distills the Claude text-editor tool pattern (view / str_replace / insert) into a reusable design you can copy for any agentic text-editing feature.
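For a feel of the view / str_replace / insert pattern, here is a minimal in-memory sketch. It is not the code from datasette-agent-edit or from Anthropic's tool: the `TextBuffer` class, the rule that `str_replace` must match exactly once, and the "insert after line N" semantics are assumptions chosen for illustration.

```python
class TextBuffer:
    """Hypothetical in-memory target for three agent editing commands."""

    def __init__(self, text=""):
        self.text = text

    def view(self, start=1, end=None):
        """Return lines start..end (1-indexed, inclusive) prefixed with line numbers."""
        lines = self.text.splitlines()
        end = len(lines) if end is None else end
        selected = lines[start - 1:end]
        return "\n".join(f"{n}: {line}" for n, line in enumerate(selected, start=start))

    def str_replace(self, old_str, new_str):
        """Replace old_str with new_str, refusing missing or ambiguous matches."""
        if not old_str:
            raise ValueError("old_str must not be empty")
        count = self.text.count(old_str)
        if count != 1:
            raise ValueError(f"expected exactly one match, found {count}")
        self.text = self.text.replace(old_str, new_str, 1)

    def insert(self, insert_line, new_str):
        """Insert new_str after line insert_line (0 means the top of the text)."""
        lines = self.text.splitlines()
        if not 0 <= insert_line <= len(lines):
            raise ValueError("insert_line out of range")
        lines[insert_line:insert_line] = new_str.splitlines()
        # Simplification: a trailing newline in the original text is not preserved.
        self.text = "\n".join(lines)


buf = TextBuffer("alpha\nbeta\ngamma")
buf.str_replace("beta", "BETA")
buf.insert(1, "inserted")
print(buf.view(2, 3))
```

Requiring a unique match is the design choice that matters here: the agent has to quote enough surrounding text to pin down one location, which turns a vague edit into an error instead of a silent wrong change.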

The Grind

Research papers, decoded

Sequence Modeling / Architecture · 6,703 upvotes · arxiv · X

Memory Caching: RNNs with Growing Memory

Memory Caching lets recurrent models grow their effective memory with sequence length by caching checkpoints of hidden states at regular intervals, interpolating between the O(L) cost of RNNs and the O(L^2) cost of Transformers. Bolting MC onto an RNN like Titans yields perfect needle-in-haystack retrieval at 4K/8K context and closes much of the recall gap to Transformers while staying far cheaper. If your recurrent backbone loses on recall-heavy long context, MC is a drop-in module that buys Transformer-like recall at near-linear cost without retraining.
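The interpolation between O(L) and O(L^2) can be seen with a back-of-envelope operation count. The sketch below is not the paper's algorithm: it assumes one recurrent update per token, one hidden-state checkpoint every `interval` tokens, and that each token reads every checkpoint stored so far. The function name is made up for this example.

```python
def cached_recurrence_cost(seq_len, interval):
    """Count state updates plus cache reads for one pass over seq_len tokens."""
    updates = seq_len  # one recurrent state update per token
    cache_reads = 0
    cached = 0         # number of checkpoints stored so far
    for t in range(1, seq_len + 1):
        cache_reads += cached      # assumed: each token reads all stored checkpoints
        if t % interval == 0:
            cached += 1            # snapshot the hidden state at regular intervals
    return updates + cache_reads


print(cached_recurrence_cost(8, 8))  # 8: no cache reads, plain RNN cost
print(cached_recurrence_cost(8, 4))  # 12
print(cached_recurrence_cost(8, 1))  # 36: a checkpoint per token, quadratic growth
```

Under these assumptions the checkpoint interval is the dial: a longer interval moves toward linear cost, a shorter one toward attention-like cost and recall.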

World Models / Physical AI · 163 upvotes · alphaxiv

Cosmos 3: Omnimodal World Models for Physical AI

Cosmos 3 is a single Mixture-of-Transformers model that jointly handles language, image, video, audio, and action, subsuming VLMs, video generators, world simulators, and robot policies into one backbone via a dual-tower Reasoner/Generator design with 3D Multimodal RoPE. Post-trained variants ranked #1 open-weight Text-to-Image and Image-to-Video on Artificial Analysis, topped Physics-IQ, and set RoboArena manipulation records. Weights, checkpoints, datasets, and the eval ship under an OpenMDW license, so robotics teams get a genuinely open SOTA foundation to fine-tune.

Agent Evaluation · 104 upvotes · alphaxiv

Agents' Last Exam (ALE)

Agents' Last Exam is a living benchmark built with 250+ industry experts to test agents on long-horizon, economically valuable real-world tasks: 1,490 task instances across 55 professional subfields grounded in the U.S. O*NET/SOC taxonomy. The hardest tier sits at just a 2.6% average pass rate, and the strongest config (GPT-5.5) hits only 26.2% overall; 47% of failures trace to wrong strategy and 31% to understanding errors. Backbone model choice swings results ~18 points versus only 5-6 for the harness, so invest in reasoning capacity over tooling.

The Mill

Builder tools ground for action

34K stars

CopilotKit/CopilotKit

The Frontend Stack for Agents & Generative UI. React, Angular, Mobile, Slack, and more. Makers of the AG-UI Protocol

GitHub

47.9K stars

aaif-goose/goose

an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

GitHub

The Counter

Voices from the AI bar today

19K views

Every AI Assistant Has the Same Flaw, and It Can't Be Fixed

Technical deep dive on prompt injection as a structural, unfixable vulnerability — the "lethal trifecta" (private-data access + untrusted input + outbound channel) exploited across OpenAI, Anthropic, Microsoft, Google.

Addie LaMarr
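The "lethal trifecta" is easy to turn into a configuration check. The sketch below flags an agent setup that combines all three capabilities named above. The `AgentCapabilities` type and its field names are hypothetical, and a check like this only detects the risky combination; it does not prevent prompt injection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical summary of what one agent deployment is allowed to do."""
    reads_private_data: bool       # e.g. email, files, internal databases
    ingests_untrusted_input: bool  # e.g. web pages, inbound messages
    can_send_outbound: bool        # e.g. HTTP requests, outgoing email


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three risk factors are present together."""
    return (
        caps.reads_private_data
        and caps.ingests_untrusted_input
        and caps.can_send_outbound
    )


browsing_assistant = AgentCapabilities(True, True, True)
offline_summarizer = AgentCapabilities(True, True, False)
print(has_lethal_trifecta(browsing_assistant))  # True
print(has_lethal_trifecta(offline_summarizer))  # False
```

Removing any one leg, most often the outbound channel, breaks the combination, which is why the framing is useful for reviewing agent permissions.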

22K views

Harness Engineering Is AI's New Gold Rush

Frames harness engineering — the tools, memory, permissions, and feedback loops around a model — as the discipline that determines agent reliability beyond raw model quality.

AI Revolution

17K views

You Own the Server Now - Abacus AI SuperComputer Review

Hands-on review of a platform giving AI agents persistent Linux environments, databases, and public deployment so they can build and host full apps autonomously.

Shark Numbers

22K engagements

Google just shipped a free dictation app for Mac and iPhone called AI Edge Eloquent... the model running it is Gemma 4 12B, entirely on your device.

@adityarao310

10K engagements

Intel's pre-market gains have expanded to 10%... Google placed an order with Intel for over 3 million TPU chips.

@fxtrader

3.2K upvotes · 320 comments

I had Opus 4.8 build Temu League of Legends in under a day - I call it LMAO

A maker shows off a full League-of-Legends-style game built in under a day with Opus 4.8 — a viral showcase of coding-agent throughput.

r/ClaudeAI

1K upvotes · 323 comments

google/gemma-4-12B · Hugging Face

The Gemma 4 12B release lands on Hugging Face; the local-LLM crowd dissects weights, licensing, and on-device performance.

r/LocalLLaMA

Roast Calendar

Your AI week, day by day

Tue 9

6:00 PM PT•Redwood City

Excellence in Tech: AI Agents ft. Gabor Cselle (leads AI at Google Workspace, ex-OpenAI)

6:00 PM PT•San Francisco

GTM Eng SF: Claude Code + GTM Lightning Talks (Exa, LangChain, Composio, AssemblyAI)

4:30 PM PT•San Francisco

[Fireside] From Tokens to Robots

Wed 10

9:30 AM PT•San Francisco

AI Inference Hack Day

2:00 PM PT•San Francisco

Humans in the Loop 2026

10:00 AM PT•San Francisco

AI Breakfast @ Tesla hosted by Founders Bay x You.com

Thu 11

9:30 AM PT•San Francisco

ClickHouse + Hex AI Hackathon

8:30 AM PT•San Francisco

Vector Space Day San Francisco

6:00 PM PT•San Francisco

Artificial Analysis Coding Agent Benchmarks

Fri 12

9:30 AM PT•San Francisco

Harness Engineering Hack

6:00 PM PT•Palo Alto

Agentic GTM: How a16z, Khosla & HF0 Builders Automate Customer Acquisition

5:00 PM PT•Mountain View

Gemini Meetup

Sat 13

10:00 AM PT•San Francisco

Autonomous Healthcare Hackathon | xAI · Cursor · Vercel (Legion Health x Atlas)

All weekend•Santa Cruz

AINative Startup Camp — Santa Cruz

1:00 PM PT•San Francisco

Smoothies & Strategies on AI Marketing & Growth

Sun 14

9:00 AM PT•San Francisco

BuilderShip — Yacht Hackathon by Composio, Nebius, Tavily

4:00 PM PT•Mountain View

Pick Anything Challenge #1 · Bothaus Demo Day

9:30 AM PT•Burlingame

AI Founders Morning at Top Golf

Mon 15

5:00 PM PT•Hackathon

Databricks Apps & Agents for Good Hackathon 2026

6:00 PM PT•San Francisco

Build with Claude AI. Lead Without Burning Out.

6:00 PM PT•Sunnyvale

Real-Time Lakehouse & Agentic AI

Last Sip

Parting thoughts

The labs are filing for the public markets the same week an independent benchmark says agents clear just 2.6% of its hardest economically valuable tasks. The gap between the pitch deck and the pass rate is the whole story right now, and the builders quietly funding verification, not capability, may be reading the room more clearly than the bankers.