TECH

Trump administration shifts to AI safety oversight


Strategic Overview

  • 01. On May 5, 2026, the Commerce Department's Center for AI Standards and Innovation (CAISI) announced expanded voluntary agreements with Google DeepMind, Microsoft, and xAI to conduct pre-deployment security evaluations of frontier AI models, joining existing arrangements with OpenAI and Anthropic.
  • 02. NEC Director Kevin Hassett confirmed on May 6, 2026 that the White House is studying an executive order to establish an FDA-style pre-deployment review process for high-risk frontier models, a striking reversal from Trump's January 2025 rescission of Biden's Executive Order 14110.
  • 03. The catalyst was Anthropic's April 2026 decision to withhold Claude Mythos Preview from general release, distributing it only to roughly 40-50 vetted partners through Project Glasswing after red-teaming showed it could write working exploits in hours that experts said would take weeks.
  • 04. Critics question whether CAISI - which has roughly 30 staff and around $30 million in cumulative funding since 2024 - has the capacity, standards, or technical expertise to certify frontier models, even as it has already completed more than 40 evaluations of unreleased systems.

Deep Analysis

Mythos Was the Forcing Function, Not the CAISI Strategy

[Chart: Mythos Preview vs. Opus 4.6 across two Anthropic red-team benchmarks - the capability jump that catalyzed the Trump administration's policy shift.]

The administration did not arrive at pre-release AI vetting through deliberation - it arrived there because Anthropic refused to ship a model. Claude Mythos Preview produced 595 OSS-Fuzz crashes at severity tiers 1-2 (versus 150-175 for Opus 4.6) and landed 181 successful Firefox JavaScript exploits where Opus 4.6 managed two. Anthropic's own red-team noted Mythos wrote exploits in hours that expert penetration testers said would have taken weeks. That asymmetry collapsed the gap between vulnerability discovery and exploitation, and Anthropic responded by restricting distribution to roughly 40-50 vetted partners under Project Glasswing rather than a public release.

Kevin Roose framed the move sharply: it is the first time a major AI lab has held back an announced model due to safety concerns since GPT-2. Andrew Curran's reporting captured the causality bluntly: 'Mythos has changed things.' Hassett's FDA-style proposal and the May 5 expanded MOUs with Google DeepMind, Microsoft, and xAI both arrive on the heels of that disclosure - which means the policy architecture is being reverse-engineered from a single dangerous capability demonstration rather than built from first principles.

A 180 the Administration Will Not Call a 180

In January 2025, Trump rescinded Biden's Executive Order 14110, eliminating the requirement that high-risk AI developers share safety-test results with the government. Sixteen months later, his NEC Director is on the record floating an executive order that would functionally restore that obligation - and arguably extend it - by routing frontier models through an FDA-style review before public release. Rumman Chowdhury's framing is the most honest read: 'This is a 180 for the Trump administration, that has very explicitly been anti-any sort of regulation.'

The administration is squaring this circle by relabeling the apparatus: AISI became CAISI in June 2025, with Commerce Secretary Lutnick repositioning it as 'pro-innovation, pro-science' and oriented toward U.S. competitiveness rather than broad safety. The agreements are still voluntary, the rebrand still emphasizes innovation - but the underlying mechanism (government pre-deployment evaluation of frontier capabilities) is the very thing the rescission was meant to prevent. The branding is the bargain that lets the policy reversal happen without anyone in the administration having to call it a reversal.

The Capacity Gap: Thirty Staff to Police the Frontier

CAISI is being asked to certify the safety of the most capable AI systems in the world on resources one analysis cited by The Register calls 'chronically underfunded': roughly 30 staff and approximately $30 million in cumulative funding since 2024, with up to $10 million more added in January 2026. The Federation of American Scientists has proposed a baseline closer to $155 million in annual operating budget, plus $155-275 million in one-time setup costs, to credibly run a national AI reliability program.

Cornell's Gregory Falco puts it directly: 'The federal government does not currently have the in-house technical expertise, infrastructure, or day-to-day insight needed to directly evaluate these systems on its own.' AEI's Daniel Lyons goes further, arguing that if developers with full access to their own weights cannot predict failure modes, a government program is unlikely to certify safety meaningfully either. CAISI has completed more than 40 evaluations of cutting-edge unreleased models to date, but the gulf between current capacity and a binding FDA-analog regime is wide enough that the regulatory promise risks outrunning the technical reality. That is the concern surfaced most often in Reddit AI-policy threads, where even readers sympathetic to NIST/CAISI as the 'right technical body' question whether the political layers above it are equipped to set frontier-AI standards.

Selective Access: Political Alignment as Gatekeeping

Anthropic's exclusion from both the May 5 expanded CAISI MOU and a separate Pentagon classified-systems deal with seven other tech companies is not an administrative footnote - it is the most politically loaded data point in this story. Anthropic catalyzed the entire policy shift by withholding Mythos and is the company whose disclosure made pre-release vetting suddenly politically acceptable. Yet when the formal evaluation architecture was expanded, its prior MOU was described as merely 'ongoing' while DeepMind, Microsoft, and xAI were brought in fresh.

The accelerationist corner of Reddit reads this as 'a continuation of the temper tantrum over Mythos' and frames the new posture as extortion leverage rather than safety policy. Whether that read is right or not, the structure is striking: a voluntary regime where inclusion appears to track political alignment with the administration creates a parallel, unacknowledged tier of access - and undermines the universal-standards logic that an FDA analog would require. Network television and late-night commentary picked up the same thread, with the Trump-Anthropic dispute over federal use of Claude becoming a recurring frame for how ad hoc executive decisions can shape who participates in supposedly technical evaluation programs.

Voluntary Today, Statutory Maybe - The Open Question for Builders

Everything announced on May 5 is voluntary. Reddit's AI community flagged this immediately: the agreements are 'voluntary, not mandated.' What Hassett floated on May 6 is an executive order, not legislation - and Lyons's AEI critique notes that even with an EO, the standards by which a model would be deemed 'safe' do not yet exist in any operationalized form. Microsoft's Natasha Crampton frames frontier-AI safety testing as 'necessarily must be a collaborative endeavor with governments,' which is a polite way of saying the labs are choosing this path because the alternative - a Mythos-style decision being made unilaterally inside Anthropic - is now politically untenable.

The practical question for AI builders is whether to plan for an FDA-style regime that might or might not arrive, with standards that have not been written, executed by an agency that may or may not be funded to do the job. The May 5 agreements are the floor, not the ceiling - but the ceiling is a moving target, shaped as much by the next Mythos-class disclosure as by any statute. CAISI Director Chris Fall's framing - that 'independent, rigorous measurement science is essential to understanding frontier AI and its national security implications' - reads less like a description of where U.S. AI policy stands than of where it is racing to go, with the gap between rhetoric and capacity now the most important variable in the story.

Historical Context

2023-11
Biden established the U.S. AI Safety Institute (AISI) within NIST under Executive Order 14110 to evaluate frontier-model risks.
2024-08
AISI signed initial AI testing and evaluation MOUs with Anthropic and OpenAI; these MOUs were later updated and remain in force under CAISI.
2025-01
Trump rescinded Biden's Executive Order 14110, removing the requirement that high-risk AI developers share safety-test results with the government.
2025-06-03
Lutnick announced the transformation of AISI into CAISI, refocusing the institute on standards, innovation, and national-security competitiveness.
2026-04
Anthropic published the Claude Mythos Preview red-team write-up and announced Project Glasswing, withholding Mythos from general release due to its cybersecurity capabilities.
2026-05-05
CAISI signed expanded pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI; the Pentagon separately announced a classified-systems AI deal with seven companies that notably excluded Anthropic.
2026-05-06
Hassett publicly floated an FDA-style executive order for pre-release AI vetting, framing it as a road map for releasing future AI 'after they've been proven safe.'

Power Map

Key Players

CAISI (Center for AI Standards and Innovation)

NIST/Commerce sub-agency leading pre- and post-deployment evaluations of frontier AI; central executor of any FDA-style review the White House ultimately adopts.

Google DeepMind, Microsoft, and xAI

Frontier developers newly committed to providing pre-release model access (with safeguards removed) for CAISI national-security evaluations focused on cyber, bio, and chemical risks.

Anthropic

Triggered the policy shift by withholding Claude Mythos over cybersecurity risks; notably excluded from the May 5 expanded MOU and from a separate Pentagon classified-AI deal due to ethical disputes with the administration.

Howard Lutnick (Commerce Secretary)

Renamed AISI to CAISI in June 2025, repositioning the institute around innovation and national security competitiveness rather than broad safety.

Kevin Hassett (NEC Director)

Public face of the FDA-style review proposal; signaled in public remarks that an executive order requiring pre-release safety vetting is under active study.

UK AI Security Institute (AISI)

Parallel evaluator partnered with Microsoft on frontier-safety research and co-assessed Mythos's cyber capabilities alongside CAISI.

Source Articles

Analysts

"Frontier AI with potential cyber-vulnerability impact should be government-tested before public release, analogous to FDA drug approval; an executive order is under study to formalize the road map."

Kevin Hassett
Director, National Economic Council, White House

"The federal government lacks the in-house technical expertise, infrastructure, and day-to-day insight needed to evaluate frontier systems alone, but voluntary self-governance by labs is also insufficient."

Gregory Falco
Assistant Professor of Mechanical and Aerospace Engineering, Cornell University

"The new posture is a sharp reversal from the administration's previously explicit anti-regulation stance on AI - effectively a 180-degree turn."

Rumman Chowdhury
CEO, Humane Intelligence

"An FDA-style mandatory vetting regime would harm innovation and competition without meaningfully improving security; if developers with full model access cannot predict their own failure modes, a government program is unlikely to certify safety either."

Daniel Lyons
Nonresident Senior Fellow, American Enterprise Institute

"Mythos collapses the gap between vulnerability discovery and exploitation - already finding thousands of high-severity vulnerabilities across major operating systems and browsers - exposing governance gaps that voluntary company safeguards cannot fill."

Teddy Nemeroff
Carnegie Endowment for International Peace / Princeton University

The Crowd

"The Trump administration has informed Anthropic, Google and OpenAI that they are discussing the creation of new AI oversight procedures that would potentially require new AI models to pass a safety review before being cleared for release. Mythos has changed things."

@AndrewCurran_

"NIST's Center for AI Standards and Innovation (CAISI) signs expanded collaborations with @GoogleDeepMind, @Microsoft, and @xai for pre-deployment evaluations and other research to support frontier AI national security testing."

@NIST

"Aside from the cybersecurity implications, the non-release of Claude Mythos is the first time a major AI lab has held back an announced model due to safety concerns since GPT-2. If Anthropic is right, there is now a significant gap between publicly available models and private..."

@kevinroose

"Spooked by Mythos, U.S President suddenly realized AI safety testing might be good | U.S President forced to admit Biden was right on AI safety testing."

u/ControlCAD

Broadcast
Trump ends federal use of Anthropic's AI technology, threatens further action against company

Chris Hayes Has a Warning About the Dangers of Trump's AI Whims

Trump orders government to stop using Anthropic's AI | DW News