Apr 26, 2026

Agentic Brew Daily

Your daily shot of what's brewing in AI

Fresh Batch

Bold Shots

Today's biggest AI stories, no chaser

DeepSeek released V4 Preview as two MoE models, V4-Pro (1.6T total / 49B active) and V4-Flash (284B / 13B active), both with 1M-token context and a new Hybrid Attention scheme that cuts inference FLOPs to 27% and KV cache to 10% of V3.2. It is MIT-licensed on Hugging Face and ModelScope, and Huawei Ascend plus Cambricon shipped same-day inference support. Pricing lands at $1.74/$3.48 per million tokens for Pro and $0.14/$0.28 for Flash, undercutting GPT-5.5's $5/$30 by roughly 85%.

Why it matters: This is the first frontier-class open model that doesn't need NVIDIA to serve. SMIC popped ~10% on the read-through, and the entire China-on-Chinese-silicon stack just became production-viable.
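The ~85% figure checks out if you take a simple average of input and output list prices; a quick back-of-envelope sketch (the averaging method is my assumption, not DeepSeek's):

```python
# Back-of-envelope check of the "~85% cheaper" claim, using a simple
# average of input/output list prices (the averaging choice is illustrative).
v4_pro = (1.74 + 3.48) / 2      # $/1M tokens, V4-Pro input/output averaged
gpt_55 = (5.00 + 30.00) / 2     # $/1M tokens, GPT-5.5 input/output averaged
discount = 1 - v4_pro / gpt_55
print(f"V4-Pro undercuts GPT-5.5 by {discount:.0%}")  # 85%
```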

OpenAI launched GPT-5.5 and GPT-5.5 Pro on April 23 with a 1.05M-token context, 128K max output, and full MCP plus computer-use support. Artificial Analysis says GPT-5.5 burns about 40% fewer output tokens than 5.4, but list pricing doubled to $5/$30 per million, so net effective cost is roughly 20% above the prior gen. TechCrunch's framing nailed it: OpenAI has stopped selling a chat completion API and started selling an agent.

Why it matters: The coding crown is genuinely contested — GPT-5.5 hits 58.6% on SWE-Bench Pro versus Claude Opus 4.7's 64.3% — and the API surface shift signals that 'agent' is now the default unit OpenAI sells.
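The "roughly 20% above the prior gen" line is just two multipliers; a minimal check (assuming the prior gen listed at half of 5.5's price, per the "pricing doubled" claim):

```python
# Net effective output cost: list price doubled, but ~40% fewer tokens per task.
price_multiplier = 2.0       # $5/$30 vs. the prior gen's list price
token_multiplier = 1 - 0.40  # ~40% fewer output tokens, per Artificial Analysis
net_change = price_multiplier * token_multiplier - 1
print(f"net effective cost: {net_change:+.0%} vs. prior gen")  # +20%
```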

Google is putting $10B in cash into Anthropic now at a $350B valuation, with another $30B contingent (cap $40B), plus 5 GW of Google Cloud compute over five years and up to a million seventh-gen Ironwood TPUs. Days earlier, Amazon expanded its own Anthropic commitment by $5B (cap $25B). Anthropic ARR has jumped from ~$9B at end of 2025 to north of $30B in April 2026, with more than 1,000 customers spending over $1M a year.

Why it matters: This is the Google-Anthropic answer to Microsoft-OpenAI, and it validates Ironwood TPUs as a real NVIDIA alternative at frontier scale. As eMarketer's Gadjo Sevilla put it, Google isn't trying to win the model race — it's locking in infrastructure dominance.

Maine Governor Janet Mills vetoed L.D. 307 on April 24, citing a $550M data center redevelopment at the former Androscoggin Mill in Jay. She did sign L.D. 713, which blocks data centers from accessing the state's business development tax incentives. Good Jobs First counted more than 300 data center bills across 30 states in the first six weeks of 2026, and at least a dozen in-session moratorium proposals were modeled on Maine's.

Why it matters: L.D. 307 was the most procedurally advanced of the bunch, so the veto removes the precedent that sponsors elsewhere were planning to cite. Override pressure is organizing fast, and Mills is running for U.S. Senate, which makes this a national story dressed up as a state veto.

Intel closed just under $83 on April 24, up about 24%, its best single-day gain since October 1987 and YTD gains north of 100%. Q1 2026 revenue came in at $13.58B (up ~7% YoY and ~$1.2B above consensus), Data Center & AI revenue grew 22% to $5.1B, and non-GAAP EPS hit $0.29 against a $0.01 estimate. Intel Xeon 6 was named host CPU for NVIDIA's DGX Rubin NVL8, Tesla's Austin 'Terafab' became Intel's first announced major external 14A customer, and the U.S. government's 9.9% stake is now worth ~$36B against an ~$8.9B cost basis.

Why it matters: AI-driven businesses are now ~60% of Intel's revenue. CEO Lip-Bu Tan put it bluntly: 'the CPU is reasserting itself as the indispensable foundation of the AI era.' The taxpayer is sitting on a ~$26.5B unrealized gain on the CHIPS-to-equity conversion.

The Blend

Connecting the dots across sources

The unit of work in AI is now 'the agent' — not the chat

  • OpenAI shipped GPT-5.5 with computer-use and MCP baked in, and TechCrunch literally wrote that OpenAI stopped selling a chat completion API and started selling an agent.
  • Anthropic's Project Deal experiment had Claude run 186 deals worth $4,000 across 69 employees in their SF office, which is what 'agent as marketplace participant' looks like in production.
  • A 38-author red-team study found today's agents operate at 'Level 4 autonomy' with only 'Level 2 comprehension,' which both validates the shift and warns about it.
  • Two of this Sunday's SF meetups are branded 'Agent Builders' and 'Agent-Whispering,' which is the in-person mirror of the same trend.

The compute stack is being un-NVIDIA-fied in slow motion

  • Google promised Anthropic up to a million seventh-gen Ironwood TPUs alongside its $40B commitment, which is the largest single bet on a non-NVIDIA frontier-training stack ever made.
  • Meta signed a multibillion-dollar AWS Graviton5 deal for tens of millions of Arm CPU cores, with Counterpoint projecting Arm at 90% of AI ASIC server CPU share by 2029.
  • DeepSeek V4 launched with same-day inference support on Huawei Ascend and Cambricon, and SMIC popped ~10% on the read-through — the first frontier-class open model that doesn't need NVIDIA to serve.
  • Intel had its best stock day since 1987 with Xeon 6 named host CPU for NVIDIA's own DGX Rubin NVL8, signaling the CPU layer is being rebalanced even inside NVIDIA's flagship platform.

The agent productivity story is louder than the agent productivity evidence

  • Salesforce reportedly cut 4,000 reps as ServiceNow pitched a control layer for agents — that's the bullish enterprise framing in raw form.
  • A widely shared empirical study showed experienced developers were 19% slower with AI assistants while believing they were 24% faster, a self-perception gap that's hard to wave away.
  • The most-cited red-team paper of the day catalogued 11 distinct failure modes including agents leaking email threads they had just refused to summarize and propagating malicious instructions to other agents.
  • Even inside Anthropic's own user base, top community posts this week were revolt-flavored about Claude Code being pulled from the Pro plan, while the company was simultaneously closing a $40B deal.

Slow Drip

Blog reads worth savoring

Analysis · Semianalysis
The Coding Assistant Breakdown: More Tokens Please

A hands-on shootout of GPT-5.5, Opus 4.7, and DeepSeek V4 that cuts through benchmark theater to call who actually wins the coding-assistant wars.

Analysis · The Neuron AI
DeepSeek V4 admits it's 3-6 months behind. The architecture says otherwise.

DeepSeek's own 50-page paper concedes V4 trails on intelligence, but the long-context race just got won at one-seventh the cost.

Tutorial · Simon Willison
GPT-5.5 prompting guide

A tight distillation of OpenAI's new prompting playbook, including the 'acknowledge before tool calls' trick that makes long-running agents feel alive instead of crashed.
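For flavor, that pattern reduces to surfacing a short status line before each slow call; a toy sketch (function and tool names are hypothetical, not OpenAI's API):

```python
# Toy sketch of the 'acknowledge before tool calls' pattern: emit a short
# status message before each potentially slow tool call so the user sees
# progress instead of silence. All names here are illustrative.
def run_agent(steps):
    transcript = []
    for tool_name, tool_fn, args in steps:
        transcript.append(f"On it: calling {tool_name}...")  # acknowledge first
        transcript.append(f"{tool_name}: {tool_fn(*args)}")  # then the slow work
    return transcript

demo = run_agent([("web_search", lambda q: f"3 hits for {q!r}", ("MoE pricing",))])
for line in demo:
    print(line)
```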

Tutorial · Data Science Collective
How to Build Self-Healing AI Agents with Monocle, Okahu MCP and OpenCode

A hands-on walkthrough for giving coding agents access to their own telemetry so they can debug and fix themselves without you reading the logs.

News · The Neuron AI
Around the Horn Digest: Everything That Happened in AI Today (Friday, April 24, 2026)

One-stop catch-up on V4, Google→Anthropic, and Meta→AWS — all three moves on April 24 in one tab.

News · Latent Space
[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips

The most engaged-with breakdown (43 reactions) of DeepSeek's surprise drop.

The Grind

Research papers, decoded

Agent Safety · 3,466 upvotes · arxiv
Agents of Chaos

Twenty scientists spent two weeks red-teaming LLM-powered agents that had real email, messaging, file system, and shell access — and catalogued 11 distinct failure modes including leaking email threads they had just refused to summarize and propagating malicious instructions to other agents. The headline: today's agents act at 'Level 4 autonomy' with only 'Level 2 comprehension.' If you're shipping anything agentic into production, treat this as required reading.

Long Context · 2,450 upvotes · arxiv
Recursive Language Models

Instead of stuffing entire documents into a context window, MIT CSAIL's RLM treats them as external data the LLM queries programmatically — slicing, recursing, and writing code to inspect itself. The result: 100x larger inputs at roughly the same cost, and 91%+ accuracy on multi-doc tasks where standard models score 0%.
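A toy illustration of the idea (not CSAIL's implementation): the document lives outside the context window, and the model issues programmatic queries against it instead of reading it whole.

```python
# Toy illustration of the RLM idea (not MIT CSAIL's code): keep the long
# document outside the context window and let the model query it with code.
class DocStore:
    def __init__(self, text):
        self.lines = text.splitlines()

    def grep(self, needle):
        # Return (line_number, line) pairs containing the needle.
        return [(i, l) for i, l in enumerate(self.lines) if needle in l]

    def window(self, start, end):
        # Return only the slice the model asked to inspect.
        return "\n".join(self.lines[start:end])

doc = DocStore("intro\nQ1 revenue grew 22% to $5.1B\nconclusion")
print(doc.grep("revenue"))  # [(1, 'Q1 revenue grew 22% to $5.1B')]
```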

Agent Training · 160 upvotes · alphaxiv
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Tackles the agent-training data drought by autonomously mining MCP servers and tool docs to synthesize ~2,000 executable environments, then training agents via RL with a diagnostic arena that targets weaknesses. Agent-World-14B hits 65.4% on τ²-Bench and 55.8% on BFCL V4, and performance more than doubles (18.4% → 38.5%) as environments scale from zero to 2K.

Frontier Architecture · 68 upvotes · alphaxiv
DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

Two MoE models (1.6T Pro / 49B active and 284B Flash / 13B active) with a Compressed Sparse + Heavily Compressed Attention scheme that cuts KV cache to ~2% of baseline at million-token context. Reports 73% fewer FLOPs per token at 1M context, 90% memory cache reduction, and a Codeforces rating of 3206.
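For a sense of what a ~2% KV cache means at million-token context, here's the generic KV-cache sizing formula with illustrative dimensions (not V4's actual config):

```python
# Generic KV-cache sizing: layers * kv_heads * head_dim * 2 (K and V)
# * sequence length * bytes per element. Dimensions below are illustrative,
# not DeepSeek's published architecture.
def kv_cache_gb(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    return layers * kv_heads * head_dim * 2 * seq_len * bytes_per_elem / 1e9

baseline = kv_cache_gb(layers=60, kv_heads=8, head_dim=128, seq_len=1_000_000)
compressed = baseline * 0.02  # ~2% of baseline, per the paper's headline claim
print(f"{baseline:.0f} GB -> {compressed:.1f} GB per sequence")  # 246 GB -> 4.9 GB
```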

Time Series Reasoning · 79 upvotes · huggingface
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics

A 4-level taxonomy for time-series reasoning, an 83K-sample HiTSR benchmark, and a dual-view vision-language model that ingests both plots and numeric tables — fixing the trade-off where vision models nail patterns but botch numbers, while text models do the opposite. Hits 86.8% on numerical tasks and 97.5% on global pattern recognition.

On Tap

What's trending in the builder community

Ask Product Hunt AI

Product Hunt's own 'just ask' discovery surface, sitting at the top of the board.

Spira AI

An AI influencer that's always on trend: create and grow your brand.

DeepSeek-V4

The open-source era of 1M context intelligence — the V4-Pro/Flash MoE preview series.

Beezi AI

Make AI development structured, secure, and cost-efficient.

Codex 3.0 by OpenAI

Cross-app coding agent that builds, tests, and debugs across browsers, Office, and Drive.

BAND

Coordinate and govern multi-agent work in a single chat.

GPT 5.5 Arrives, DeepSeek V4 Drops, and the Compute War Intensifies

AI Explained's data-rich benchmark side-by-side that everyone is sharing this week.

Google Cloud CEO: Anthropic, TPUs, Mythos, NVIDIA and more

Matthew Berman's interview giving cloud-side context for the $40B Anthropic deal.

AI Is Quietly Trying To Escape

Documentary-style synthesis of recent self-preservation, deception, and sandbox-escape findings.

Experienced developers took 19% longer with AI #aicoding #study #reality

Nate B Jones's empirical pushback on the agent-coding hype cycle.

ChatGPT Images Just Replaced Three People on Your Team.

Argues GPT-Image 2's 93% win rate marks a structural shift in the creative stack.

AI agents don't just need better prompts; they need 'Context Engineering.' It's about signal density.

@saudaziz crystallizing the new-skill discourse around context engineering.

find-skills

Discovery skill for the open agent skills ecosystem, ranked #1 on skills.sh with 1.2M installs.

self-improving-agent

Top of clawhub.ai right now (3,348 stars, 6,383 installs); a useful template for agents that rewrite their own prompts.

Roast Calendar

Upcoming events & gatherings

AI Native Camp - Santa Cruz & Online · Sun, April 26, 9 AM PT | Santa Cruz, CA
Agent Builders Meetup | Founder Park Global Series @ SF · Sun, April 26, 3 PM PT | San Francisco, CA
BETA 2026 Hackathon - Open Registration · Sun, April 26, 9 AM PT | San Francisco, CA
Product Salon: Breakfast with a Venture Capitalist · Sun, April 26, 8 AM PT | Menlo Park, CA
Agent-Whispering [Based entirely -via- L.O.T.R.] 3rd SESSION · Sun, April 26, 9:30 AM PT | San Francisco, CA
Light DAO San Francisco Spring Social · Sun, April 26, 2 PM PT | San Francisco, CA

Last Sip

Parting thoughts & a teaser for tomorrow

The wildest part of this week isn't any single launch — it's that DeepSeek, OpenAI, Google, and Intel all moved within 48 hours, and NVIDIA's stock barely flinched. Either the market thinks the diversification is priced in, or it hasn't fully woken up yet. I'd watch the override fight on Maine's L.D. 307 next, plus whether any U.S. hyperscaler quietly ports a workload to Ascend now that there's a frontier model that runs there. Tomorrow we'll see what the weekend's analyses make of the GPT-5.5 vs. Opus 4.7 vs. V4 coding-assistant gauntlet — the early shootouts are spicy. Drink it slow.