No Breach, Just a Conversation: How Distillation Actually Works
The striking thing about this case is what did not happen. Nobody stole Claude's weights, broke into Anthropic's servers, or leaked source code. According to the allegations, the operators simply talked to Claude - millions of times. That is the mechanism behind a 'distillation attack.' Distillation is the practice of training a smaller, cheaper model on the outputs of a larger, more capable one, using the frontier model as an effectively unpaid teacher. You feed the teacher a flood of prompts, harvest its answers, and then fine-tune your own model to imitate those answers. Done at scale, the student inherits much of the teacher's behavior without anyone ever paying for the compute and research that produced it.
Anthropic says the campaign was deliberately aimed at Claude's most commercially valuable skills: agentic reasoning, software engineering, and long-horizon planning [2]. Those are exactly the capabilities that are hardest and most expensive to train into a model from scratch, which is why harvesting them via the API is so attractive. The alleged toolkit was mundane - nearly 25,000 fraudulent accounts standing in for real users, funneling more than 28.8 million exchanges through normal API access [2]. The community seized on this distinction immediately: this was API abuse, not a heist, and that nuance sits at the center of why the accusation has been so divisive.




