Jun 23, 2026

Agentic Brew Daily

Your daily shot of what's brewing in AI

Fresh Batch

Distilled trend

OpenAI claims GPT-5.5-Cyber beats Anthropic's Mythos on CyberGym just 10 days after Washington barred foreign nationals from Anthropic's models, leaving the rival squeezed on two fronts at once.
Anthropic loses the ability to serve foreign nationals yet simultaneously gains DeepMind's John Jumper and a Micron memory deal, trading reach for talent and supply.
Sakana's cheap Fugu router and Chinese labs' price cuts of up to 99% undercut frontier pricing, even as SpaceX's $6.3B Reflection lease shows infrastructure costs exploding.

Bold Shots

Today's biggest AI stories, no chaser

OpenAI ships GPT-5.5-Cyber and "Patch the Planet"

OpenAI expanded its Daybreak cybersecurity push on June 22, releasing the full GPT-5.5-Cyber model, an updated Codex Security plugin, a Cyber Partner Program, and a Patch the Planet initiative founded with Trail of Bits and HackerOne. The model stays gated to vetted defenders through the Trusted Access for Cyber program, and OpenAI is pointing at CyberGym scores (85.6%) to claim it edges out Anthropic's Mythos. The UK AI Security Institute found GPT-5.5 solved a 20-hour, 32-step expert network attack end to end. Patch the Planet funds researchers to fix bugs in 30+ critical open-source projects including cURL, Go, and Python.

Why it matters: OpenAI's tiered gating is a deliberate counter to Anthropic's full withholding of Mythos - a bet that calibrated distribution to defenders beats locking capability away. The dual-use stakes are concrete: if the access controls fail, the same capability that helps defenders lowers the barrier for attackers.

We want to help all companies be secure, working with the USG and the security ecosystem. The full version of GPT-5.5-Cyber is here; state of the art performance on CyberGym. Patch The Planet and Codex Security will help solve security problems instead of just finding them.

@sama·4.2K engagements

JUST IN: OpenAI's new GPT-Cyber model beat Mythos on the CyberGym benchmark.

@Polymarket·2.3K engagements

Sakana AI launches Fugu, an orchestration model

Tokyo-based Sakana AI launched Sakana Fugu, a multi-agent orchestration system delivered as a single foundation model that dynamically coordinates frontier LLMs from a swappable pool via one OpenAI-compatible API. Fugu is itself a language model trained to call, delegate to, verify, and synthesize other LLMs. The Ultra variant reportedly matches or beats Anthropic's Fable 5 and Mythos Preview, and outperformed individual models on 10 of 11 benchmarks. It landed 10 days after the Fable 5 / Mythos export controls.

Why it matters: Sakana pitches Fugu as redundancy-by-design that routes around single-vendor restrictions, betting the orchestration layer is the real product. Skeptics counter that the scores belong to the models Fugu calls, that it's a closed orchestrator on top of closed models, and that one test found it roughly 17x more expensive than GLM-5.2.

Introducing Sakana Fugu: A full multi-agent orchestration system accessible via a single model API. Our 'Fugu Ultra' model matches the performance of Fable and Mythos, delivering frontier capability without the risk of export controls. Try it: sakana.ai/fugu

@SakanaAILabs·26.7K engagements

SAKANA FUGU ULTRA vs. CLAUDE OPUS 4.8 RESULTS. Prompt: 'build a really high quality single html file crossy road game with three.js'. Sakana Fugu Ultra: Tokens Used ~89k ($7.32), Time Elapsed 22 minutes. Issues: inverted directional turn, wonky camera, no sfx.

@markksantos·2K engagements

US restricts Anthropic's Fable 5 and Mythos 5

On June 12, a US export-control directive ordered Anthropic to suspend all access to Fable 5 and Mythos 5 for any foreign national. Unable to verify nationality in shared cloud environments, Anthropic disabled both models worldwide. The trigger: Amazon researchers found Fable 5 refused to "review the code for security issues" but produced patches when asked to "fix this code." By June 19, Trump said he no longer views Anthropic as a national security threat, but the formal Commerce order and a Pentagon supply-chain designation stayed in force.

Why it matters: This is the first use of export-control authority to disable a live, commercial frontier model - a "kill switch" precedent that now hangs over every US lab. Because the same code-review capability exists in uncontrolled rival models, it read as selective enforcement, and it became a sovereignty shock in Europe.

SpaceX leases Colossus 2 compute to Reflection AI

SpaceX signed a compute lease worth up to $6.3B with open-source startup Reflection AI, giving immediate access to Nvidia GB300 chips at Colossus 2 near Memphis. The deal runs at $150M/month from July 1, 2026 through 2029, with either party able to terminate on 90 days' notice after the first three months. Reflection is Colossus's third external tenant after Anthropic and Google.

Why it matters: The bigger story is SpaceX becoming a hyperscale compute landlord, turning infrastructure built for xAI's Grok into a commercial GPU-rental business. The deal drew heavy "circular financing" criticism - Nvidia invested $800M in Reflection, which now pays SpaceX to rent Nvidia chips SpaceX bought from Nvidia. And the $6.3B headline is a conditional ceiling, not a committed floor.
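
The ceiling-versus-floor point is easy to check with arithmetic. A minimal sketch, assuming "through 2029" means through December 2029 and that the earliest exit is three locked-in months plus 90 days' notice (counted here as three more monthly payments); both readings are assumptions, not confirmed deal terms.

```python
from datetime import date

MONTHLY_RATE_USD = 150_000_000  # $150M/month, from the reported terms


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start's month through end's month."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# Assumption: "through 2029" is read as through December 2029.
term_months = months_inclusive(date(2026, 7, 1), date(2029, 12, 31))
ceiling_usd = term_months * MONTHLY_RATE_USD

# Assumption: earliest exit = 3 locked-in months + 90 days' notice (~3 months).
earliest_exit_months = 3 + 3
floor_usd = earliest_exit_months * MONTHLY_RATE_USD

print(f"term: {term_months} months")
print(f"ceiling: ${ceiling_usd / 1e9:.1f}B")
print(f"earliest-exit exposure: ${floor_usd / 1e9:.1f}B")
```

Under those assumptions the full term reproduces the $6.3B headline, while the amount locked in before either side can walk away is a small fraction of it.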

Micron and Anthropic sign a strategic agreement

Micron and Anthropic announced a strategic agreement on June 22: memory and storage co-design, a multi-year supply deal, enterprise Claude adoption at Micron, and a Micron investment in Anthropic's Series H. That round closed May 28, raising $65B at a $965B post-money valuation, with Micron, Samsung, and SK hynix as strategic infrastructure partners. Micron stock rose around 5.5% to a record close, up 300%+ YTD.

Why it matters: The supplier now owns equity in its customer - a circular deal where money Micron invests can flow back as Anthropic memory purchases while the stake captures upside on demand Micron helps supply. The thesis underneath: memory (HBM), not the GPU, is the AI bottleneck.

Slow Drip

Blog reads worth savoring

Analysis · Lenny's Newsletter

How Claude Mythos found a 15-year-old bug in Mozilla Firefox

Mozilla's goal-loop harness (LLM scorer ranks risky files, verifier subagent catches false positives, humans approve every patch) shipped 423 security fixes in a month, showing the agent infra matters as much as the model.
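
The three stages in that harness (rank, verify, human approval) can be sketched in a few lines. This is an illustrative sketch only, not Mozilla's code: every name here (`Finding`, `run_harness`, `toy_scorer`, `toy_verifier`) is hypothetical, and the keyword heuristics stand in for what would be LLM calls and real reviewers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    path: str
    risk: float
    verified: bool = False
    approved: bool = False


def rank_files(files: dict[str, str], scorer: Callable[[str], float]) -> list[Finding]:
    """Stage 1: score every file and sort riskiest first."""
    findings = [Finding(path, scorer(src)) for path, src in files.items()]
    return sorted(findings, key=lambda f: f.risk, reverse=True)


def run_harness(files, scorer, verifier, human_approves, threshold=0.5):
    """Rank, verify, then gate on human approval; return approved findings."""
    shipped = []
    for finding in rank_files(files, scorer):
        if finding.risk < threshold:
            break  # sorted descending, so everything after is lower risk
        finding.verified = verifier(files[finding.path])  # Stage 2
        if not finding.verified:
            continue  # verifier judged this a false positive
        finding.approved = human_approves(finding)  # Stage 3
        if finding.approved:
            shipped.append(finding)
    return shipped


# Toy stand-ins (hypothetical): a real harness would call models and reviewers.
def toy_scorer(src: str) -> float:
    return min(1.0, 0.6 * src.count("memcpy") + 0.3 * src.count("free("))


def toy_verifier(src: str) -> bool:
    return "bounds_checked" not in src


files = {
    "parser.c": "memcpy(dst, src, n); free(p);",
    "safe.c": "bounds_checked(); memcpy(dst, src, n);",
    "util.c": "return a + b;",
}
shipped = run_harness(files, toy_scorer, toy_verifier, human_approves=lambda f: True)
print([f.path for f in shipped])
```

The point of the structure is that nothing ships on the scorer's word alone: a finding has to survive the verifier and a human decision before it counts.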

Research · Hugging Face Blog

PP-OCRv6 on Hugging Face: 50-Language OCR from 1.5M to 34.5M Parameters

A three-tier OCR family (1.5M/7.7M/34.5M params) covering 50 languages with the new RepLKFPN detector hits 86.2% detection Hmean / 83.2% recognition, runnable via Paddle, Transformers, or ONNX.

News · OpenAI Research

Samsung Electronics brings ChatGPT and Codex to employees

One of OpenAI's largest enterprise rollouts puts ChatGPT Enterprise and Codex in front of all Samsung Electronics staff, with weekly Codex users in Korea up ~800% since February.

Tutorial · The AI Corner

Ship your first AI agent in a day

A 7-step build path with a ~15-line copy-paste Agent SDK starter, a build-vs-script decision rule that saves two weeks, and a fully-wired inbox-triage example to clone.

The Grind

Research papers, decoded

Models & Agentic Coding · 332 upvotes · alphaxiv

GLM-5.2: Built for Long-Horizon Tasks

Z.ai's new MIT-licensed open-source flagship targets long-horizon agentic coding with a stable 1M-token context. Its headline trick is IndexShare, which reuses a single indexer across every four sparse-attention layers to cut per-token FLOPs by 2.9x at 1M tokens, plus an upgraded MTP layer for speculative decoding. It tops the open-source rankings on FrontierSWE, PostTrainBench, and SWE-Marathon. A genuinely open coding model with production-grade long-context efficiency, worth swapping in for agent loops that previously needed a frontier closed model.
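
The indexer-sharing idea can be pictured with a toy. The sketch below is not GLM-5.2's implementation: it only assumes, from the summary above, that one indexer picks which tokens to attend to and that the next layers in a group of four reuse that selection. The function names and the dot-product indexer are hypothetical, and real layers would have their own weights.

```python
import math

GROUP_SIZE = 4  # from the summary: one indexer per four sparse-attention layers


def index_topk(query, keys, k):
    """Toy indexer: pick the k key positions with the highest dot product."""
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    return sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]


def sparse_attend(query, keys, values, selected):
    """Softmax attention restricted to the selected positions."""
    logits = [sum(q * x for q, x in zip(query, keys[i])) for i in selected]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w / total * values[i][d] for w, i in zip(weights, selected))
        for d in range(dim)
    ]


def forward(query, keys, values, num_layers, k):
    """Run the toy stack, re-indexing only at the start of each group."""
    indexer_calls = 0
    selected = []
    for layer in range(num_layers):
        if layer % GROUP_SIZE == 0:
            selected = index_topk(query, keys, k)
            indexer_calls += 1
        # Layers inside the group reuse the same selection.
        query = sparse_attend(query, keys, values, selected)
    return query, indexer_calls


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
values = keys
out, calls = forward([1.0, 0.5], keys, values, num_layers=8, k=2)
print(calls)
```

In this toy the indexer runs once per group rather than once per layer; the 2.9x FLOPs figure is the paper's claim and is not something this sketch reproduces.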

Architecture & Efficiency · 191 upvotes · alphaxiv

Looped World Models

LoopWM is the first looped architecture for world modeling: instead of stacking deeper feed-forward layers, it iteratively refines latent environment states through one parameter-shared transformer block, with spectral constraints keeping long-horizon rollouts stable. The result is up to 100x parameter efficiency over conventional world models, with adaptive compute that scales depth to each step's difficulty. Iterative latent depth is a new scaling axis, relevant if you're running world models on edge hardware without ballooning parameter counts.
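
To make "one shared block, looped until stable" concrete, here is a toy sketch. It is not the LoopWM architecture: the single tanh layer, the `make_contractive_block` and `looped_refine` names, and the Frobenius-norm rescaling (a cheap upper bound on the spectral norm, used here in place of whatever spectral constraint the paper applies) are all assumptions for illustration.

```python
import math
import random


def make_contractive_block(dim, rho=0.5, seed=0):
    """Random shared weights rescaled so the spectral norm is at most rho.

    The Frobenius norm upper-bounds the spectral norm, so dividing by it
    is a conservative stand-in for a real spectral constraint.
    """
    rng = random.Random(seed)
    w = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]
    frob = math.sqrt(sum(x * x for row in w for x in row))
    return [[rho * x / frob for x in row] for row in w]


def looped_refine(w, obs, tol=1e-6, max_loops=200):
    """Apply the same block repeatedly until the latent state stops moving."""
    state = [0.0] * len(obs)
    for loop in range(1, max_loops + 1):
        new_state = [
            math.tanh(sum(a * s for a, s in zip(row, state)) + o)
            for row, o in zip(w, obs)
        ]
        delta = max(abs(n - s) for n, s in zip(new_state, state))
        state = new_state
        if delta < tol:  # adaptive depth: stop early on easy inputs
            return state, loop
    return state, max_loops


block = make_contractive_block(dim=4)
_, easy_loops = looped_refine(block, [0.0, 0.0, 0.0, 0.0])
_, hard_loops = looped_refine(block, [1.0, -1.0, 0.5, 2.0])
print(easy_loops, hard_loops)
```

Because the block is a contraction, repeated application settles to a fixed point instead of drifting, and the loop count varies with the input while the parameter count stays fixed.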

Architecture & Efficiency · 114 upvotes · alphaxiv

Variable-Width Transformers

This challenges the convention that every transformer layer should be the same width. An inverted-hourglass / bowtie design (wider at the input and output layers, narrower in the middle, glued by a parameter-free residual-resizing mechanism) consistently beats parameter-matched uniform baselines on language-modeling loss across 200M-3B models, dense and MoE. It also cuts FLOPs (~22% under loss-matched scaling) and shrinks KV-cache memory/IO (~15%). A drop-in architectural change that buys real inference savings.

The Mill

Builder tools ground for action

136.8K stars

firecrawl/firecrawl

The API to search, scrape, and interact with the web at scale. 🔥

GitHub

73K stars

bytedance/deer-flow

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of tasks that could take minutes to hours.

GitHub

29.7K stars

heygen-com/hyperframes

Write HTML. Render video. Built for agents.

GitHub

11.1K stars

DeusData/codebase-memory-mcp

High-performance code intelligence MCP server. Indexes codebases into a persistent knowledge graph — average repo in milliseconds. 158 languages, sub-ms queries, 99% fewer tokens. Single static binary, zero dependencies.

GitHub

The Counter

Voices from the AI bar today

The AI Bubble is Bleeding Cash, Here Are The Receipts

A data-driven critique of AI economics, citing OpenAI's $34B spend and $21B loss in 2025 and questioning agentic-model profitability.

The Monetary Matters Network

NVIDIA's Dominance Is Already Over?

Argues NVIDIA's moat is CUDA, system integration, and TSMC access, examining why Amazon Trainium and Google TPU haven't dented its lead.

ByteMonk

Claude Opus caught malware hidden in my repo, then reverse engineered the whole [thing]

A real-world case where Claude Opus detected and reverse-engineered hidden malware in a code repo.

r/ClaudeAI

The human brain runs on 15W. Simulating it in real time would need 2.7 billion w[atts]

A debate over the staggering energy gap between biological cognition and digital brain simulation.

r/ArtificialInteligence

Roast Calendar

Your AI week, day by day

Tue 23

12:00 PM PT•Stanford

Panel Discussion: Path to General Embodied AI

5:30 PM PT•San Jose

PayPal AI Innovation Meetup

Jun 23 (kickoff)•Hackathon

DSH Pitch

Wed 24

6:30 PM PT•San Francisco

Agent Evals: The Truth Machine w/ Composio, Braintrust, Fireworks, and Replit

4:30 PM PT•San Francisco

AWS Agentic AI Showcase - Accelerate Agentic AI in Production

Jun 25 (kickoff)•Hackathon

Dev Clash

Thu 25

6:00 PM PT•San Francisco

How We Built It: LangSmith Engine (San Francisco)

6:00 PM PT•Menlo Park

Self-Improving Agents | ODSC AI and Snowflake

Jun 25 (kickoff)•Hackathon

Healthcare x AI Hackathon | HAIG Media

Fri 26

3:00 PM PT•San Francisco

The New AI Scaling Axis: Neuro-Inspired Test-Time Cognition

6:00 PM PT•San Francisco

Agentic Engineering, Fancy Pizza, Wine

Jun 26 (kickoff)•Hackathon

$1,000 Industrial AI Hackathon

Sat 27

Jun 26 - Jun 27•San Francisco

AINative Startup Camp - Build an AI-Native Startup in a Weekend

10:30 AM PT•San Francisco

Hackathon: Humanizing The Prototype

2:00 PM PT•Los Altos

Robotics & World Models Reading Club 15: Continuous Learning in Robotics

Sun 28

9:00 AM PT•San Francisco

Wizard Hackathon

5:00 PM PT•San Francisco

AI Engineer World's Fair - New Engineer Orientation (IRL)

2:00 PM PT•San Francisco

A Floot Founders Hackathon @ Corgi Cafe

Mon 29

6:00 PM PT•San Francisco

Artificial Analysis Intelligence Index

5:30 PM PT•San Francisco

Harness Engineering: State of the Art in Agent Harnesses

4:00 PM PT•San Francisco

Reading Group: Agents' Last Exam

Last Sip

Parting thoughts

A lot of today came down to one question asked five different ways: who gets to use the best models, and on whose terms? Whether it's gating defenders, banning foreign logins, routing around export controls, or buying a stake in your own supplier, the frontier fight is now about access and leverage as much as raw capability. Worth keeping in mind the next time a benchmark chart tries to tell you the whole story. Thanks for sharing a cup with us.
