Agentic Brew Daily
Your daily shot of what's brewing in AI
Fresh Batch
- As Nvidia, Meta, and SoftBank all rush to rent out AI compute, CoreWeave and Nebius fell 14-17% and analysts openly question whether the financing is circular.
- Token pricing is hitting a wall from both ends: Tesla capped engineers' AI spend at $200 a week while Coinbase cut AI costs 80% by routing routine work to open models.
- Alibaba banning Claude Code and Godot barring AI coding agents show that enterprises and open-source projects are pulling back on trust as agents gain deeper system access.
Bold Shots
Today's biggest AI stories, no chaser
On July 1, Palantir CEO Alex Karp went on CNBC's Squawk Box and torched OpenAI's and Anthropic's per-token pricing, saying "something has gone completely wrong" with how AI gets sold. His pitch: enterprises pay for tokens that create no value while frontier labs quietly absorb their proprietary data and IP — a "wealth tax" on American business. The timing wasn't an accident; it followed Palantir's June 29 deal with Nvidia to ship Nemotron open models via a Sovereign AI Operating System that lets customers run and own model weights air-gapped. PLTR jumped more than 9% on the day.
Why it matters: Karp is the loudest voice yet in a real enterprise backlash against usage-based pricing, and his "tokenmaxxing" framing shifts the debate from model capability to who owns the data and the value. It's also a sales pitch for Palantir's open-weight stack — but the data-ownership argument is clearly landing with buyers who feel they're paying without ROI.
Palantir CEO Alex Karp on what customers actually want, the real business of frontier labs, and the importance of open source models: What the technical customers want is control over their compute, their models, their data stack, and their alpha.
Palantir's CEO just exposed Sam Altman and Dario Amodei for robbing every Fortune 500 company. Within two minutes, Alex Karp took the entire frontier AI industry apart on national television. His exact words: Every single enterprise in this country, these people are LIVID.
OpenAI reportedly proposed handing the US government a roughly 5% equity stake, held through a sovereign wealth fund modeled on Alaska's 1976 Permanent Fund. The vision goes further: other leading US labs would grant matching 5% stakes. At OpenAI's ~$852B post-money valuation, that slice is worth around $42.6B. Talks are still preliminary, and any real transfer would likely need congressional sign-off.
Why it matters: Framed as "sharing AI's upside," this reads more like a $42.6B regulatory put — equity traded for political goodwill to clear regulatory and IPO hurdles. It also creates a structural conflict: a government that's both shareholder and regulator may be less eager to enforce costly safety rules.
Meta is building a two-track cloud operation, "Meta Compute": one lane sells hosted access to its own AI models (think AWS Bedrock), the other sells raw compute capacity (think CoreWeave and Nebius). The market read it instantly — CoreWeave fell about 13.9% and Nebius sank about 17% in a single session, while Meta rose as much as 8.6% premarket. The twist: Meta is the largest anchor customer of both neoclouds, with combined commitments near $48B ($21B to CoreWeave, up to $27B to Nebius), and now a potential rival.
Why it matters: Meta going from customer to competitor lays bare, in a single trading day, the customer-concentration risk CoreWeave and Nebius were built on. Anthropic is reportedly in final talks to run private Claude on Meta's infrastructure, which tells you where this is heading.
CoreWeave and Nebius are two of the most undervalued stocks in the entire AI infrastructure space. And the Meta cloud announcement this week just made the case more obvious than ever.
NEW: Meta is reportedly developing a cloud infrastructure business to sell AI computing power, setting up competition with AWS, Azure, and Google Cloud, per Bloomberg.
Nvidia rolled out an optional financing model for neocloud providers that pairs revenue-sharing with credit support: Nvidia collects its usual hardware revenue plus a recurring cut of the cloud revenue on supported capacity. Token credits let startups grab compute without upfront capital, and Nvidia agrees to rent back unused GPUs at a fixed rate. First named partners are Sharon AI (up to 40,000 GB300 GPUs in Australia) and Firmus Technologies (a 170,000-GPU, 360 MW campus in Batam, Indonesia). Separately, SoftBank stood up SB Neo to run a US GPU-rental business at up to 10 GW scale.
Why it matters: The model converts one-off sales into recurring rent and deepens Nvidia's lock-in — but analysts warn it looks like circular financing, with Nvidia effectively pre-funding purchases of its own GPUs and masking whether AI demand is truly organic.
Woah. Nvidia $NVDA just created a new line of business for themselves. So, all those neoclouds like $CRWV $NBIS $IREN $APLD $SPCX that have been getting deals with hyperscalers worth billions? It's because demand for compute, according to Jensen, is growing at a level that...
$NVDA is reportedly offering financial backstops to smaller GPU cloud providers in exchange for a cut of their cloud revenue, per The Information.
Microsoft launched Microsoft Frontier Company, a new operating business backed by a $2.5B investment that drops roughly 6,000 industry and engineering experts directly inside customer organizations. Microsoft frames it as going beyond standard forward-deployed engineering. Crucially, the unit is model-neutral — OpenAI, Anthropic, Microsoft AI, open source — with a commitment not to train commoditizing models on customer data or IP. Early named customers span finance, agriculture, consumer goods, and pharma: LSEG, Land O'Lakes, Unilever, Novo Nordisk.
Why it matters: Microsoft is betting $2.5B that enterprise AI only pays off when humans sit on-site and wire it into real processes — a quiet admission its earlier single-vendor Copilot approach fell short. The model-neutral, IP-protection framing directly echoes the enterprise data-ownership anxieties Karp is exploiting.
Microsoft Frontier Company is here. A new operating business built for Frontier Transformation, powered by deep industry knowledge, change management, and enterprise-grade AI engineering.
The pace of AI adoption is moving incredibly fast. Customers want measurable business outcomes and their enterprise IP protected. Today, Microsoft is launching Microsoft Frontier Company, a $2.5B investment with 6,000 industry and AI engineering experts.
Slow Drip
Blog reads worth savoring
With per-token costs varying up to 20x, intelligent routers (Factory Router, Not Diamond, LiteLLM, Kilo) now cut AI spend 20-30% by sending simple tasks to cheaper/open models — and hosted open models already handle ~60% of coding work.
Vercel's Chief of Software explains the primitives agents actually need — resumability, sandboxed execution, portable "skills" to patch stale training data, and serving Markdown instead of HTML so agents can read your site.
A measurable prompt-engineering loop: DSPy runs prompt variants against a gold dataset on live instances, revealing that table-names-only schemas plus a "don't call describe_table" line push models to guess columns and spiral into retry loops — the fix is to list column names or soften that advice.
A field guide to catching reward hacking before deployment: keep an external evaluation separate from the training reward, watch for climbing training reward with flat validation, and inspect trajectories in MLflow (with SOP-Bench as the worked example).
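The "climbing training reward with flat validation" signal from the reward-hacking guide can be sketched in a few lines of Python. This is a minimal illustration, not the guide's own code: the function name, the window size, and both thresholds are hypothetical choices, and it assumes you already log one training reward and one external validation score per checkpoint.

```python
from statistics import mean


def flag_reward_hacking(train_rewards, val_scores, window=3,
                        min_train_gain=0.05, max_val_gain=0.01):
    """Return True when training reward climbs while validation stays flat.

    Compares the average of the first `window` checkpoints with the average
    of the last `window` checkpoints. All thresholds are illustrative
    assumptions and should be tuned per task.
    """
    if len(train_rewards) != len(val_scores):
        raise ValueError("need one validation score per training reward")
    if len(train_rewards) < 2 * window:
        return False  # not enough checkpoints to compare two windows

    train_gain = mean(train_rewards[-window:]) - mean(train_rewards[:window])
    val_gain = mean(val_scores[-window:]) - mean(val_scores[:window])

    # Suspicious pattern: the optimized reward improves, the external one does not.
    return train_gain >= min_train_gain and val_gain <= max_val_gain


# Hypothetical run: training reward rises, held-out validation does not move.
suspicious = flag_reward_hacking(
    [0.20, 0.25, 0.30, 0.50, 0.60, 0.70],
    [0.40, 0.41, 0.40, 0.40, 0.41, 0.40],
)
print(suspicious)
```

A flag here is only a prompt to go read the trajectories. It does not prove the policy is gaming the reward, and a flat validation score can also mean the validation set is too easy or too small.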
The Grind
Research papers, decoded
LingBot-Map is a feed-forward 3D foundation model that rebuilds camera poses and dense point clouds directly from a video stream in a single pass, replacing slow iterative optimization. Its Geometric Context Transformer splits attention into an anchor context, a pose-reference window, and a compact trajectory memory that corrects long-range drift. It runs at ~20 FPS on 518x378 input and stays stable over 10,000+ frame sequences (7.11m trajectory error vs. 32.47m for CUT3R on Oxford Spires; F1 98.98 vs. 77.28 on ETH3D).
Instead of generating box coordinates one token at a time, LocateAnything's Parallel Box Decoding predicts a whole box as one atomic unit, preserving geometry and unlocking parallelism. Trained on a curated 138M-query dataset (785M annotated boxes), it reports >10x speedup over text-based VLMs and 2.5x over quantized methods while improving accuracy (+3.8% F1 on LVIS, 60.3 F1 on ScreenSpot-Pro GUI grounding). No code released yet.
Orca is an early general world foundation model built around next-state prediction. It learns one shared world latent space two ways — "unconscious" learning from 125K hours of unlabeled video and "conscious" learning from 160M event annotations plus 11.5M VQA pairs — then freezes the backbone and trains only tiny decoders. Orca-4B beats similar-sized specialists across text (51.8 avg vs. Qwen3.5-4B's 46.7) and image prediction (59.8 vs. FLUX.2's 56.1), and produces embodied actions competitively despite never seeing action labels in pre-training.
The Mill
Builder tools ground for action
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.
An agentic skills framework & software development methodology that works.
Open-source AI penetration testing tool to find and fix your app’s vulnerabilities.
Context.dev is the web context API for AI products and agents. Scrape any URL, crawl sites, turn pages into LLM-ready Markdown, extract structured data into your own schema, capture screenshots, and retrieve logos, colors, fonts, styleguides, company data, and transaction enrichment through one API. YC-backed, no card required, and built so developers or coding agents can integrate in minutes.
The Counter
Voices from the AI bar today
A step-by-step build of a high-performance local coding agent pairing a self-scaffolding RL model with DeepSeek's DSpark speculative decoding for up to 85% faster inference.
NVIDIA's VP of Applied Deep Learning Research on why a hardware company builds open models (Nemotron), with depth on 4-bit pretraining and Mamba-Transformer hybrids.
Unpacks new U.S. export controls on frontier models and a voluntary model-approval regime, and what they mean for autonomous-AI policy.
The day's dominant X conversation on AI agents and self-improving loops: a fully autonomous AI employee wired with email, phone, memory, tools, and sub-agents.
The AI energy and memory-price bill hits the mainstream, with GenAI's power draw framed as bloat and waste.
The most-discussed thread of the day: a debate on the vertical-integration land grab and whether custom silicon is a moat or a hedge against NVIDIA.
Argues closed labs may not disclose their real capabilities, so the open/closed gap is narrower than headlines suggest.
Last Sip
Parting thoughts
If there's one thread tying today together, it's that the whole industry is arguing about who pays and who owns the upside. Nvidia, Meta, and SoftBank all want to be your GPU landlord; Palantir's Karp wants to tear up the meter entirely; and OpenAI is floating a 5% stake to Washington. Meanwhile the practical answer is quietly showing up in the blogs — smart model routing and open models cutting real bills by 20-80%. Happy July 4th, and enjoy the fireworks.