OpenAI and Broadcom unveil Jalapeno LLM inference chip

Strategic Overview

01. OpenAI and Broadcom unveiled Jalapeno on June 24, 2026: OpenAI's first 'Intelligence Processor,' a custom accelerator architected around its vision for LLM inference and the first chip in a planned multi-generation compute platform.
02. Jalapeno went from initial design to manufacturing tape-out in just nine months, described as possibly the fastest ASIC development cycle ever in high-performance advanced semiconductors, with OpenAI's own models helping accelerate the design.
03. Engineering samples are already running ML workloads in the lab at production target frequency and power, including the GPT-5.3-Codex-Spark model, with initial deployment targeted for the end of 2026 at gigawatt scale.

Deep Analysis

Inference, not training, is where the money bleeds

Jalapeno is deliberately an inference chip, not a training chip — and that choice is the whole strategy. Training a frontier model is a periodic capital event; inference is the permanent, compounding cost of every ChatGPT reply, Codex completion, and API call OpenAI serves. By targeting inference, OpenAI is attacking the line item that scales with usage rather than with model releases. The economic stakes are stark: OpenAI is projected to burn through more than $200 billion in operating expenses through 2029 ^[5], and owning the silicon that runs its models is positioned as a primary lever to control that spend. Broadcom CEO Hock Tan put a number on the upside, citing roughly 50% cost savings versus typical AI GPUs ^[3], while OpenAI claims 'substantially better' performance per watt than current state-of-the-art in early testing ^[1].

The mechanism is specificity: a chip co-designed around OpenAI's own kernels, memory-movement patterns, and serving behavior can strip out the generality tax a merchant GPU pays to serve every customer's workload. Greg Brockman frames the broader logic as moving toward a 'compute-powered economy' where owning the full stack makes compute more abundant — a vertical-integration bet that only pays off if the chip actually runs OpenAI's specific models more cheaply than the Nvidia hardware it would otherwise rent.

The models helped design the chip that runs the models

The most striking technical claim is the cadence: nine months from initial design to manufacturing tape-out, which OpenAI describes as possibly the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors ^[1]. ASIC programs of this complexity normally run multiple years. OpenAI attributes part of that compression to using its own models to accelerate the design loop, a recursive-acceleration story that acceleration-focused communities online greeted as a milestone in self-improving infrastructure.

The substantive read is narrower than the hype. The chip builds on a partnership announced in October 2025 to co-develop 10 gigawatts of accelerators ^[6], so the nine-month clock starts well into an existing collaboration and almost certainly leans on Broadcom's mature networking and connectivity IP rather than a clean-sheet design. Local-AI skeptics raised that point directly, doubting a from-scratch tape-out could move that fast. What is concrete is that engineering samples are already running real workloads in the lab at production frequency and power, including the GPT-5.3-Codex-Spark model ^[4], which is a firmer signal of progress than the headline timeline.

A warning shot at Nvidia, not a divorce

Jalapeno is best read as leverage, not a break. OpenAI still depends on Nvidia for training, and commentators frame the chip as a 'warning shot' rather than a clean exit: the message is that the biggest AI buyers now want negotiating power over their hardware bill ^[2]. The strategic context is a broad hyperscaler custom-silicon wave. Google has TPUs, Amazon and Meta have their own accelerators, and Tan's prediction that every frontier-model developer will eventually build its own dedicated accelerator ^[3] reframes Jalapeno as table stakes rather than a one-off.

The pressure point is inference pricing: if the largest customers self-supply the highest-volume workload, Nvidia's pricing power on inference erodes even if training stays locked in. Deployment is meant to start by the end of 2026 at gigawatt scale, with Microsoft expected to take roughly 40% of initial production ^[4] and install the chips across its data centers, a structure that hands a single partner outsized influence over how fast the platform scales.

The skeptic's column: self-reported, unbaselined, and not yet shipped

Every headline performance number here is first-party and, so far, unverifiable. 'Substantially better' performance per watt ^[1] and the 50% cost figure ^[3] arrive without a named baseline, which is exactly the gap technical skeptics flagged: performance-per-watt claims mean little without specifying what they beat. The same readers questioned whether a nine-month tape-out is credible without leaning heavily on existing Broadcom IP, and noted the announcement is thin on architectural detail, with a full technical report promised only in the coming months ^[4].

Community sentiment split cleanly along those lines: acceleration-minded readers treated the unveiling as a euphoric milestone, while local-AI communities shrugged it off as datacenter-only news irrelevant to anyone running models on their own hardware. Execution risk is also recent and concrete: the roughly $18 billion first phase hinged on Microsoft committing to buy about 40% of production amid a data-center-design disagreement ^[5], a dependency that turned the deal into a financing sticking point earlier in 2026. Until silicon ships at scale and an independent benchmark lands, Jalapeno is a credible strategy backed by impressive but self-graded claims.

Historical Context

2023

Rumors about OpenAI's plans to build its own AI chips had circulated since 2023.

2025-10-13

OpenAI announced a collaboration with Broadcom to co-develop 10 gigawatts of custom AI accelerators, with deployments slated to begin in 2026.

2026-05-08

The roughly $18 billion first phase (codenamed 'Project Nexus,' ~1.3 GW) hit a financing snag: Broadcom would finance phase one only if Microsoft agreed to buy ~40% of the chips, and Microsoft hesitated over data-center design differences.

2026-06-24

OpenAI and Broadcom unveiled Jalapeno, OpenAI's first AI chip, built for inference.

Power Map

Key Players

OpenAI

Designed the chip from scratch around its LLM roadmap; will use Jalapeno for inference to serve ChatGPT and other products; advancing its goal of building the full AI stack and reducing Nvidia dependence.

Broadcom (AVGO)

Manufacturing and silicon partner; provides networking technology and connectivity expertise; CEO Hock Tan publicly cited ~50% cost savings vs typical AI GPUs and predicted every frontier model developer will eventually build its own dedicated accelerator.

Celestica

Industrialization partner handling board, rack, and system integration and scalable production systems.

Microsoft

Deployment partner expected to purchase ~40% of initial production and install the chips in its data centers; a financing condition and infrastructure-design disagreement made Microsoft's firm commitment a sticking point earlier in 2026.

Nvidia

Incumbent AI-chip supplier OpenAI has relied on almost exclusively; remains key for training, but Jalapeno targets inference economics and challenges Nvidia's pricing power.

THE SIGNAL.

Analysts

"Frames Jalapeno as part of OpenAI's long-term full-stack infrastructure strategy to make compute more abundant in a compute-powered economy."

Greg Brockman

President and Co-Founder, OpenAI

"Emphasizes the chip was designed from the ground up for LLM inference using insights from close collaboration with OpenAI researchers."

Richard Ho

Hardware Program Lead, OpenAI

"Frames the collaboration as a commitment to scaling AI's physical infrastructure; cited ~50% cost savings vs typical AI GPUs and predicted every frontier model developer will eventually build its own dedicated AI accelerator."

Hock Tan

President and CEO, Broadcom

"View Jalapeno less as a clean break from Nvidia than a 'warning shot' — once Google, Amazon, Microsoft, Meta and OpenAI all run serious custom-silicon programs, Nvidia's pricing power faces pressure, though Nvidia remains critical for training."

Industry commentators (aggregated)

Analyst and market commentary

The Crowd

"We've designed and built our first AI chip: Jalapeño. Designed from the ground up by OpenAI and brought to production with @Broadcom, Jalapeño is purpose-built for the LLM workloads powering ChatGPT, Codex, the API, and future agentic products. Chips are foundational to the AI"

@OpenAI

"NEW: OpenAI unveils its first in-house AI chip, Jalapeño."

@Polymarket

"OpenAI and Broadcom have developed a custom artificial intelligence chip called Jalapeno. OpenAI is now testing the samples. The companies say this chip can cut costs by 50%. Ed Ludlow reports https://bloom.bg/4xLq1af"

@BloombergTV

"Holy Peak OpenAI has just announced its first AI chip, SOTA in performance per watt, where internal OpenAI models were used to accelerate it further, automating more parts of AI development loops and further accelerating AI development in turn, in partnership with BroadCom"

u/GOD-SLAYER-69420Z

Broadcast

OpenAI x Broadcom — The OpenAI Podcast Ep. 8

OpenAI and Broadcom sign 10GW deal

OpenAI's new AI chip