Jun 25, 2026

Agentic Brew Daily

Your daily shot of what's brewing in AI

Fresh Batch

Distilled trend
  • The same memory shortage powering Micron's 346% revenue jump justifies SpaceX leasing out $6.3B of compute and China rushing to break the chokehold.
  • Vendors are integrating both ends of the stack at once — OpenAI taped out its own Jalapeño chip while Qualcomm paid $3.92B for Modular's software layer.
  • Open weights have closed the gap on price, not just benchmarks — GLM 5.2 did a full coding day for $3.36 and Qwen-AgentWorld beat frontier models.

Bold Shots

Today's biggest AI stories, no chaser

On June 23, Anthropic launched Claude Tag, a collaborative tool that lets a team work with Claude inside Slack by tagging @Claude. Unlike the old private 1:1 assistant, it's multiplayer: one Claude per channel that everyone can see and steer, with access to selected channels, tools, and data, running on Claude Opus 4.8. Assigned work is broken into stages and carried out asynchronously in a thread, and it accumulates memory over time — plus an "ambient" mode that acts without being tagged. It's in beta for Claude Enterprise and Team under an admin-provisioned org identity.

Why it matters: Claude Tag inverts the private 1:1 Slack assistant into a shared coordination surface with its own identity and persistent memory. It's a land-grab for the enterprise "context layer," going up against Salesforce's Slackbot, Microsoft, Glean, Databricks, and Snowflake.

SpaceX signed a compute deal worth up to $6.3B with open-source lab Reflection AI ($150M/month, July 2026-2029) for Nvidia GB300 chips at Colossus 2 near Memphis. That's on top of Anthropic leasing the full ~300 MW of Colossus 1 at ~$1.25B/month through May 2029, and Google committing $920M/month for ~110,000 GPUs. Musk also confirmed an orbital AI constellation, Starmind — up to a million solar-powered satellites, with two AI1 prototypes slated for early 2027.

Why it matters: SpaceX has positioned itself as neutral AI infrastructure that even its fiercest model-layer rivals (Anthropic, Google) now depend on — wrapped in a circular Nvidia financing loop. The leasing business now rivals SpaceX's launch revenue.

On June 24, OpenAI and Broadcom unveiled Jalapeño, OpenAI's first "Intelligence Processor" — a custom accelerator for LLM inference. It went from initial design to tape-out in nine months, possibly the fastest ASIC cycle ever, with OpenAI's own models helping accelerate the design. Engineering samples are already running ML workloads including GPT-5.3-Codex-Spark, with initial deployment targeted for the end of 2026 at gigawatt scale.

Why it matters: Jalapeño targets inference — the permanent, compounding cost of every ChatGPT, Codex, and API call. It's leverage rather than a divorce from Nvidia (still needed for training). Worth noting the headline performance numbers are all first-party for now.

Micron reported record fiscal Q3 2026 revenue of $41.46B, up 74% sequentially and 346% YoY, with adjusted EPS of $25.11 against ~$21 expected. Its company-record 84.9% gross margin edged past both Nvidia (~75%) and Meta (~82%). Micron's entire 2026 HBM supply is sold out under multi-year contracts — 16 strategic customer agreements, 14 of them carrying ~$100B in cumulative minimum-price revenue. It guided Q4 to ~$50B revenue at ~86% margin, and shares rose ~13% after hours.

Why it matters: A historically cyclical commodity maker briefly became the highest-margin name in big tech, out-earning Nvidia and Meta — because HBM is now the gating part of the AI stack. Read it as a live test of whether AI demand is structural or a bubble.

Meta paused its Model Capability Initiative (MCI), an internal program launched in April 2026 that logged U.S. employees' keystrokes, mouse movements, clicks, and screen content to gather AI-agent training data. A security misconfiguration exposed the collected data company-wide, including full prompts, transcriptions, and private conversations. The flaw was found June 18; a fix deployed within four hours failed, and an internal notice went to U.S. employees June 22. More than 1,600 employees signed a petition opposing MCI.

Why it matters: Meta logged how its own engineers use computers to build agents that do those same tasks, backstopped by a ~$140B 2026 AI spend. The leak (~45,000 internal tables reportedly exposed) turned an internal surveillance experiment into a lesson in how not to collect training data.

Slow Drip

Blog reads worth savoring

Analysis · ByteByteGoAn Ex-Meta L8's Agentic Engineering Setup

Steal a principal engineer's full agentic toolkit: voice-driven prompting, overnight autonomous task runners, and a validation pipeline that catches bugs in 68% of changes.

Analysis · SemiAnalysisChina's CXMT Is Set to Challenge DRAM Incumbents

How a state-backed, Qimonda-seeded memory maker hit $8.6B revenue and 70% margins, and why it's still betting on commodity DRAM over HBM.

Tutorial · Amazon EngineeringBuild a protein research copilot with Amazon Bedrock AgentCore

A working multi-tool agent pattern (NL parser + ESM-C 300M embeddings + pgvector search + LLM summarizer) you can replicate for any domain-specific RAG copilot.

News · Cloudflare BlogUnlocking the Cloudflare app ecosystem with OAuth for all

A real post-mortem of a zero-downtime OAuth engine migration that cut API latency 45%.

The Grind

Research papers, decoded

Industry / Multi-Agent Systems45,295 upvotes · alphaxiv · X
Sakana Fugu — Multi-Agent System as a Model

Sakana's Fugu packages an entire multi-agent system behind a single OpenAI-compatible API endpoint, so a coordinated team of specialized agents is callable as if it were one model. Strongest community signal by a wide margin, though it's a vendor announcement, not peer-reviewed.

Agents / Reinforcement Learning124 upvotes · arxiv
Tmax: A Simple Recipe for Terminal Agents

A fully open recipe (dataset, code, checkpoints) for turning small open-weight models into terminal agents via RL. TMAX-9B hits 27.2% on Terminal-Bench 2.0 — best open model under 10B params — and gains generalize to SWE-Bench Verified (44.0%->53.5%) and AIME (73.3%->91.1%). Weights on HF.

Document AI / Efficiency47 upvotes · alphaxiv
Unlimited OCR Works

Fixes the KV-cache blowup in long-document OCR via Reference Sliding Window Attention (R-SWA), holding KV cache constant regardless of output length. A 3B MoE model transcribes 40+ pages in a single 32K-token pass at 93.23% on OmniDocBench v1.5 (+6.22 pts) and 12.7% higher throughput. Code and weights released.

The Mill

Builder tools ground for action

16.9K stars

A format specification for describing a visual identity to coding agents. DESIGN.md gives agents a persistent, structured understanding of a design system.

GitHub
16.6K likesHF

Generate any application by Vibe Coding it DeepSite is a Vibe Coding Platform designed to make coding smarter and more efficient. Tailored for developers, data scientists, and AI engineers, it integrates generative AI into your coding projects to enhance creativity and productivity. DeepSite v4 is a Hugging Face Space tagged with docker, region:us. It has 16617 likes on Hugging Face.

HF Spaces
3.8K likesHF

Apply the motion of a video on a portrait Live Portrait is a Hugging Face Space tagged with gradio, Multimodal, Motion control, Image-to-Video, Video-to-Video. It has 3752 likes on Hugging Face.

HF Spaces
3.4K likesHF

Z Image Turbo is a Hugging Face Space tagged with gradio, mcp-server, region:us. It has 3437 likes on Hugging Face.

HF Spaces
3.3K likesHF

Text-to-3D and Image-to-3D Generation > Join our Wechat and Discord group to discuss and find help from us. “ Living out everyone’s imagination on creating and manipulating 3D assets.” Jan 21, 2025: 💬 Enjoy exciting 3D generation on our website Hunyuan3D Studio! Jan 21, 2025: 💬 Release inference code and pretrained models of Hunyuan3D 2.0. Jan 21, 2025: 💬 Release Hunyuan3D 2.0. Please give it a try via huggingface space our official site! We present Hunyuan3D 2.0, an advanced large-scale 3D s...

HF Spaces

The Counter

Voices from the AI bar today

26K views

A DeepMind researcher and Hannah Fry sketch an "agentic economy" where millions of AI agents transact and delegate autonomously.

Google DeepMind
14K views

A former Google Brain leader argues today's models still have "baby vision" for visual-spatial reasoning.

Inside the Silicon Mind with Firas Sozan
28K views

Breaks down OpenAI's specialized GPT Cyber security model, reportedly outperforming Anthropic's Mythos 5.

AI Revolution
6,110 engagements

Launch of "Aside," an AI browser with vertical tabs and Liquid Glass, claiming SOTA on agentic-browsing benchmarks plus on-device privacy.

@hyojun_at
846 upvotes · 252 comments

A crowd-sourced map of Chinese domestic AI-accelerator vendors reaching H100/H200-class performance.

r/LocalLLaMA
332 upvotes · 185 comments

A builder describes selling a fully Claude-generated app, sparking an argument about agent-built software economics.

r/AI_Agents

Last Sip

Parting thoughts

If there's a single thread today, it's that the AI race quietly became a memory and silicon race. Micron out-earning Nvidia and Meta on margin, OpenAI taping out Jalapeño in nine months, SpaceX renting compute to the same labs it competes with — the value is pooling in whoever controls the parts everyone else has to buy. Meanwhile the layer on top keeps getting cheaper, with GLM 5.2 doing a coding day for $3.36. Worth chewing on which end of that stack you're building on.