OpenAI Releases MRC Networking Protocol
TECH


Strategic Overview

  • 01.
    OpenAI, with AMD, Broadcom, Intel, Microsoft, and NVIDIA, released Multipath Reliable Connection (MRC), a new open networking protocol that helps large AI training clusters run faster and more reliably with less wasted GPU time.
  • 02.
    MRC is built into 800Gb/s network interfaces and lets a single RDMA transfer be sprayed across hundreds of paths, while rerouting around link or switch failures on a microsecond timescale instead of the seconds-to-tens-of-seconds typical of conventional fabrics.
  • 03.
    MRC is already deployed across OpenAI's largest NVIDIA GB200 'Blackwell' supercomputers, including the Oracle Cloud Infrastructure site in Abilene, Texas and Microsoft's Fairwater systems, and the specification has been contributed to the Open Compute Project under an open license.
  • 04.
    MRC extends RDMA over Converged Ethernet, draws on Ultra Ethernet Consortium techniques, and adds SRv6-based source routing — and its multi-plane design lets a 100,000+ GPU cluster be wired with just two tiers of switches instead of the three or four required by a conventional 800Gb/s build.
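The SRv6-based source routing in point 04 can be sketched in miniature: the sender writes the hop list into each packet, which is what lets a NIC fan a single transfer out over many precomputed paths. Everything below (switch names, the round-robin policy, the `Packet` shape) is an illustrative assumption, not the MRC wire format:

```python
# Toy sketch of source routing in the SRv6 style (all names assumed):
# the sender encodes the full hop list in the packet itself, so the
# NIC can spread successive packets of one transfer over many paths.
from dataclasses import dataclass, field
from itertools import cycle

@dataclass
class Packet:
    payload: bytes
    segments: list[str] = field(default_factory=list)  # hops, chosen by sender

# One transfer fanned out round-robin over several precomputed paths.
paths = [
    ["leaf1", "spine3", "leaf7"],
    ["leaf1", "spine5", "leaf7"],
    ["leaf1", "spine9", "leaf7"],
]
chunks = [b"chunk-%d" % i for i in range(6)]
packets = [Packet(c, list(p)) for c, p in zip(chunks, cycle(paths))]
for pkt in packets:
    print(pkt.payload, "via", "->".join(pkt.segments))
```

Because the path lives in the packet rather than in switch state, steering a retransmitted chunk around a dead spine is a sender-side decision, not a fabric-wide reconvergence.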

Packet Spraying and the Tyranny of the Slowest Link

The core insight behind MRC is that frontier AI training violates the assumption underneath the entire modern internet. Conventional networking gets its statistical magic from the law of large numbers: millions of independent flows, averaged across paths, smooth out into something predictable. Synchronous training on tens of thousands of GPUs is the opposite shape — every GPU is waiting on every other GPU at every step, so a single late transfer stalls the whole training step and leaves the rest of the cluster idle. With networks now containing millions of optical links, something is always failing somewhere, and a conventional RDMA flow pinned to a single path stalls for seconds when its link or switch goes down.
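A back-of-envelope calculation shows why this shape is so unforgiving. Both numbers below are illustrative assumptions, not figures published by OpenAI:

```python
# Why a synchronous step almost always meets a misbehaving link.
# Both constants are illustrative assumptions, not published figures.
N_LINKS = 2_000_000   # optical links in the fabric (assumed)
P_FAIL = 1e-7         # chance a given link degrades during one step (assumed)

# A step is clean only if every link behaves; with millions of links,
# even a tiny per-link probability compounds into a frequent event.
p_step_hit = 1 - (1 - P_FAIL) ** N_LINKS
print(f"P(step touches a bad link) = {p_step_hit:.1%}")
```

At these assumed rates roughly one step in five encounters a fault somewhere in the fabric, which is why per-fault recovery time, not fault frequency, is the lever that matters.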

MRC's answer is to spray each transfer across hundreds of network paths simultaneously, baked into 800Gb/s NICs so the spraying happens at hardware speed. When a link or switch fails, MRC reroutes around it on a microsecond timescale rather than the seconds-to-tens-of-seconds typical of conventional fabrics — fast enough that the GPUs above never notice. OpenAI says the production proof was unceremonious: engineers rebooted four tier-1 switches during a frontier training run without coordinating with the training team, and MRC absorbed the disruption. That is a different posture toward failure than networking has historically taken, and as Ron Westfall puts it, it amounts to 'treating the entire AI fabric as a single fluid system instead of a series of isolated connections.'
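A toy Monte Carlo makes the contrast concrete. Every parameter here is an illustrative assumption (path counts, failure rates, and timeouts are not published MRC figures):

```python
import random

# Toy comparison: a transfer pinned to one path stalls for seconds when
# that path dies; a sprayed transfer loses only the slice in flight on
# the bad path and reroutes it on a microsecond timescale.
random.seed(0)
PATHS = 256                # paths a sprayed transfer fans out across (assumed)
P_PATH_DOWN = 0.01         # chance any given path fails mid-transfer (assumed)
SINGLE_PATH_STALL_S = 5.0  # conventional retransmission stall (assumed)
REROUTE_S = 5e-6           # microsecond-scale reroute (assumed)

def expected_delay(trials=100_000):
    pinned = sprayed = 0.0
    for _ in range(trials):
        if random.random() < P_PATH_DOWN:   # the pinned flow's single path
            pinned += SINGLE_PATH_STALL_S
        # Sprayed flow: a failure on any path only delays the slice on it.
        if any(random.random() < P_PATH_DOWN for _ in range(PATHS)):
            sprayed += REROUTE_S
    return pinned / trials, sprayed / trials

pinned_avg, sprayed_avg = expected_delay()
print(f"pinned: {pinned_avg:.4f}s  sprayed: {sprayed_avg:.8f}s")
```

The pinned flow pays the full stall whenever its one path dies; the sprayed flow converts the same failure into a microsecond reroute of whatever was in flight, which is the "GPUs never notice" regime described above.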

Six Rivals, One Spec: Why the Lineup Is the Story

It is not common to see AMD, Broadcom, Intel, Microsoft, NVIDIA, and OpenAI line up behind a single specification on the same day, contributed under an open license to the Open Compute Project. These are companies that compete on GPUs, on NICs, on switching silicon, and increasingly on cloud capacity itself. The fact that they all chose to co-launch — with simultaneous blog posts from each vendor and Sachin Katti openly crediting NVIDIA for co-engineering Blackwell-generation deployment — is a signal in itself.

The motivation, in OpenAI's own framing, is defensive. Greg Steinbrecher says several hyperscalers were already building closed, in-house RDMA transports, and that 'type of market fragmentation is bad for the networking industry.' The community reaction so far has tracked that strategic logic: the OpenAI Podcast episode with Mark Handley and Greg Steinbrecher has been the dominant discussion vector, framing MRC as removing 'one of the key barriers to continuing to scale,' while investor-leaning commentary has read the multi-vendor backing as a tide that floats AMD, Broadcom, Microsoft, and NVIDIA at once. The bet is that publishing the spec via OCP — rather than letting each hyperscaler ship its own private transport — keeps the long tail of NIC and switch vendors targeting the same protocol surface, and prevents AI back-end networking from balkanizing the way storage networking did a generation ago.

The Ethernet-vs-InfiniBand Inflection That MRC Locks In

MRC arrives at a specific moment in the AI fabric debate. Dell'Oro tracks Ethernet sales to AI back-end networks surpassing InfiniBand in 2025, with hyperscalers pushing cluster sizes toward 100,000 and 500,000-plus GPUs. The strategic question for the industry has been whether Ethernet can credibly host RDMA-class transport at that scale, or whether AI customers would have to keep paying the InfiniBand tax to get the latency and loss behavior they need.

MRC's design is an Ethernet-native answer. It extends RoCE — the IBTA's RDMA-over-Ethernet standard — layers on multipath transport drawing on the Ultra Ethernet Consortium playbook, and adds SRv6-based source routing. Just as importantly, its multi-plane topology lets a 100,000-plus GPU cluster be wired with two tiers of switches instead of the three or four tiers a conventional 800Gb/s network would need, which translates directly into fewer switches, fewer optical modules, and lower power per GPU. AMD's Krishna Doddapaneni names the underlying claim plainly: 'As GPUs and CPUs continue to drive compute, the real bottleneck in scaling AI is the network.' MRC is the bet that the answer to that bottleneck is open Ethernet plus a smarter transport — not another generation of proprietary fabric.
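The two-tier claim can be sanity-checked with simple leaf-spine arithmetic. The ASIC bandwidth, per-plane port speeds, and plane split below are assumptions chosen to make the numbers concrete, not figures from the announcement:

```python
# Sanity check of the two-tier claim with toy Clos arithmetic.
# All numbers are illustrative assumptions, not from the MRC spec.

def two_tier_endpoints(radix: int) -> int:
    # Leaf-spine Clos: each leaf uses radix/2 ports down to GPUs and
    # radix/2 up; a spine with `radix` ports can reach `radix` leaves.
    return radix * (radix // 2)

ASIC_BW_GBPS = 51_200   # e.g. a 51.2 Tb/s switch ASIC (assumed)

# Conventional build: every port runs at 800 Gb/s, so radix is 64.
radix_800g = ASIC_BW_GBPS // 800
print(two_tier_endpoints(radix_800g))   # 2048 GPUs: too few, needs 3-4 tiers

# Multi-plane build: split each GPU's 800 Gb/s NIC across 8 planes of
# 100 Gb/s, so each plane sees the same ASIC as a radix-512 switch.
radix_100g = ASIC_BW_GBPS // 100
print(two_tier_endpoints(radix_100g))   # 131072 GPUs in just two tiers
```

Under these assumptions, splitting each GPU's bandwidth across several lower-speed planes is what buys the radix for a two-tier build; an all-800G fabric runs out of radix and pays for the extra tiers in switches, optics, and power.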

Open Spec, Selective Silicon: Where the Tension Is

The OCP contribution is being framed as a portability story, but at launch the production silicon picture is narrower than the press release suggests. NVIDIA describes MRC as in production on Spectrum-X — specifically the Spectrum-4 and Spectrum-5 generations — and OpenAI says its frontier training has run on hardware from NVIDIA and Broadcom. AMD and Intel are co-authors of the spec and named partners, but the visible deployments at the moment lean heavily on the NVIDIA and Broadcom stacks.

That gap is the live tension to watch. An open specification is only as portable as the second and third interoperable implementations that ship against it. If AMD's Pensando-style NICs and Intel's networking silicon land MRC implementations that interoperate cleanly with Spectrum-X and Broadcom Tomahawk, the OCP contribution will look like the moment AI back-end networking became a multi-vendor commodity layer. If instead MRC ends up de facto bound to one or two silicon families, it will look more like a co-marketed standard — open in license, narrow in practice. The protocol's two-year development effort, the production deployments at OCI Abilene and Microsoft Fairwater, and the coordinated multi-vendor launch all argue for the optimistic read; the durability of that read depends on what ships next.

Historical Context

2010-04-01
IBTA released the original RoCE specification, enabling hardware-accelerated RDMA over Ethernet — the substrate that MRC now extends with multipath transport and SRv6 source routing.
2023-07-19
UEC was formed by AMD, Broadcom, Intel, Microsoft, and others to evolve Ethernet for AI/HPC workloads. OpenAI explicitly says MRC draws on UEC techniques and extends them with SRv6-based source routing for large-scale AI fabrics.
2024-01-01
OpenAI began the roughly two-year MRC development effort, eventually folding in NIC and switch partners across the GPU stack to co-engineer a protocol that could survive frontier-scale training runs.
2025-11-10
Katti left Intel — where he was CTO/AI chief and led networking — to join OpenAI to build out compute infrastructure for AGI. He has since become a public face of the MRC effort, bridging the multi-vendor partnership.
2026-05-06
OpenAI publicly released MRC and contributed the specification to the Open Compute Project, with simultaneous blog posts from AMD, NVIDIA, Microsoft, and Broadcom — a coordinated launch that signaled cross-vendor alignment on an open AI fabric standard.

Power Map

Key Players
Subject

OpenAI Releases MRC Networking Protocol

OP

OpenAI

Lead architect of MRC; the protocol is deployed across its frontier-model training clusters, and OpenAI led the contribution of the spec to the Open Compute Project to standardize it for the wider industry.

NV

NVIDIA

Implements MRC at hardware speed in its Spectrum-X Ethernet switches and SuperNICs paired with GB200 GPUs, with MRC now in production on Spectrum-4 and Spectrum-5 generations.

AM

AMD

Co-developer contributing networking and NIC expertise; positioning MRC as an open alternative to proprietary AI fabrics and the entity formally contributing the spec to OCP alongside OpenAI and Microsoft.

BR

Broadcom

Supplies the Ethernet switching silicon and NICs used to deploy MRC at scale; OpenAI states its frontier-model training has run on MRC across hardware from both NVIDIA and Broadcom.

MI

Microsoft

Runs MRC in production inside its Fairwater supercomputers used to train OpenAI frontier models, and co-authored the resilient-networks engineering blog detailing how MRC behaves under failure.

OC

Open Compute Project

Receiving body for the MRC specification, which is being published under an open license so any NIC, switch, or hyperscaler vendor can implement and build on the protocol.

Source Articles


THE SIGNAL.

Analysts

"Frames MRC's first deployment in NVIDIA's Blackwell generation as proof that the protocol works at gigascale, and credits a deep co-engineering relationship with NVIDIA: 'Deploying MRC in the Blackwell generation was very successful and was made possible by a strong collaboration with NVIDIA.'"

Sachin Katti
Head of Industrial Compute, OpenAI

"Argues the open release exists to head off industry fragmentation around private RDMA transports: 'Several players in the industry have their own in-house implementations of protocols … that type of market fragmentation is bad for the networking industry.' He frames MRC as a way to convert raw compute into faster research iteration — 'we want to use as much compute as we can get, but also we want to make sure that we're using it efficiently and effectively, and this is a critical component of that.'"

Greg Steinbrecher
Workload Lead, OpenAI

"Reads MRC as an architectural reframe rather than a tweak: 'OpenAI is treating the entire AI fabric as a single fluid system instead of a series of isolated connections.' He places it in a broader move toward specialized Ethernet-plus designs aimed squarely at tail-latency and congestion bottlenecks that bite frontier training."

Ron Westfall
Research Director, HyperFrame Research

"Sees MRC as evidence that hyperscalers are 'leaning harder into Ethernet for AI fabrics, especially as clusters push toward 100,000 to 500,000-plus GPUs.' Dell'Oro tracking shows Ethernet shipments to AI back-end networks surpassed InfiniBand in 2025, which makes a credible Ethernet-native RDMA transport like MRC strategically important to the entire ecosystem."

Sameh Boujelbene
Vice President, Dell'Oro Group

"Names the bottleneck explicitly: 'As GPUs and CPUs continue to drive compute, the real bottleneck in scaling AI is the network.' That framing is the strategic case for MRC — and for AMD's bet that an open Ethernet-RDMA transport, not a proprietary fabric, is where AI back-end networking is going."

Krishna Doddapaneni
Corporate Vice President, AMD

The Crowd

"We've partnered with @AMD, @Broadcom, @Intel, @Microsoft, and @NVIDIA, to release Multipath Reliable Connection (MRC), a new open networking protocol that helps large AI training clusters run faster and more reliably, with less wasted GPU time."

@OpenAI

"AI supercomputers need a new kind of network to stay in sync at massive scale. OpenAI's @markjhandley and @poyntingatgreg join @AndrewMayne to discuss what it takes to move data across record numbers of chips reliably and efficiently, the new Multipath Reliable Connection (MRC)"

@OpenAI

"JUST IN: OpenAI partners with AMD, Broadcom, Intel, Microsoft, and Nvidia to launch MRC - $AMD $AVGO $MSFT $NVDA. OpenAI Partnered With AMD, Broadcom, Intel, Microsoft, And NVIDIA To Launch Multipath Reliable Connection (MRC). MRC Is A New Open-Standard Protocol"

@AIStockSavvy

"OpenAI has partnered with AMD, Broadcom, Intel, Microsoft, and NVIDIA to develop MRC (Multipath Reliable Connection)"

u/Raigarak25

Broadcast
Why AI needs a new kind of supercomputer network — the OpenAI Podcast Ep. 18

OpenAI co-develops with NVIDIA, AMD, and Microsoft! "MRC", a new technology that dramatically speeds up AI training, is born

OpenAI launches MRC, a supercomputer networking protocol | Next in AI | Astha La Vista