The mechanics of an industrial-scale model heist

Anthropic's February disclosure is unusually specific for an AI safety document. The company says three Chinese labs - DeepSeek, MiniMax, and Moonshot AI - generated more than 16 million exchanges with Claude across roughly 24,000 fraudulent accounts, routing traffic through commercial proxies to evade Anthropic's China access restrictions [2]. The traffic was not evenly distributed: MiniMax alone accounted for about 13 million exchanges aimed at agentic coding and tool-orchestration tasks, Moonshot ran roughly 3.4 million focused on agentic reasoning and computer vision, and DeepSeek's footprint was smaller at around 150,000 exchanges but allegedly aligned with training censorship-tuned successor models [4].
The technical term Anthropic is operationalizing here is distillation - using one model's outputs as supervised training data for a smaller or differently-aligned model. What makes the disclosure striking is the operational sophistication: Anthropic claims MiniMax pivoted nearly half of its distillation traffic within 24 hours of an Anthropic model release, suggesting automated harvesting pipelines tied directly to upstream changes [2]. Anthropic frames the resulting models as a national security problem, not just an IP one, because illicitly distilled systems strip out the original safeguards - leaving capabilities like cyber operations, disinformation, and surveillance available without the refusal behaviors Claude ships with [4]. That move - from 'they copied us' to 'they copied us unsafely' - is what gives the company its bridge from a commercial grievance to a policy ask.



