Anthropic fight with Trump turns into a cyber-defense fight

Security leaders say the Fable 5 and Mythos 5 restrictions risk punishing the same bug-finding work defenders need.

By ยท Published

Why it matters

The Fable 5 dispute is becoming a test case for whether U.S. AI policy treats vulnerability discovery as a defensive capability or a restricted weapon.

An abstract, allegorical struggle between a monolithic, imposing government structure and dynamic, interwoven digital defense systems. (scratchboard / woodcut)

Dario Amodei's Anthropic is now at the center of a wider fight over whether Washington is about to make defensive security research harder in the name of controlling AI-enabled hacking.

Nearly 150 security leaders have signed an open letter urging the Trump administration to reverse restrictions on Anthropic's Fable 5 and Mythos 5 models, Axios reported Tuesday. The letter was organized by Alex Stamos, the former Facebook security chief and current chief product officer at Corridor, and argues that the government has taken a model-access dispute and turned it into a precedent for what American AI systems are allowed to do for defenders.

The timing matters because Anthropic is the AI lab that has spent years asking governments to take frontier-model risk seriously. Amodei, a former OpenAI research executive who co-founded Anthropic with Daniela Amodei and other former OpenAI staffers in 2021, has argued for a formal process that would let the government block unsafe AI deployments. The company is a public benefit corporation and has built its Claude brand around safety, interpretability, and controlled release of the highest-risk capabilities. That posture is now colliding with a White House process that Anthropic says is not transparent, not technically grounded, and not applied consistently.

The fight began Friday, June 12, when the U.S. government issued an export-control directive blocking access to Fable 5 and Mythos 5 by any foreign national, including foreign-national Anthropic employees. Anthropic said the practical effect was that it had to disable both models for all customers to comply. The directive came three days after Anthropic had launched Claude Fable 5 and Claude Mythos 5, its first broad release of a Mythos-class model and a gated version aimed at cyber defenders and infrastructure providers.

Anthropic's own account is narrow: the government had only given it verbal evidence of a potential non-universal jailbreak, which Anthropic described as asking the model to read a specific codebase and fix software flaws. The company said it reviewed a report it believed formed the basis for the directive and found that the displayed capability was available in other public models and used by defenders every day. It also said no testers had found a universal jailbreak before launch and that perfect jailbreak resistance is not currently possible for any model provider, so its strategy was to make bypasses narrow or costly while monitoring abuse.

Amazon's report became the trigger

The immediate catalyst was not a public exploit. It was an Amazon escalation.

Axios reported Friday that Amazon called administration officials Thursday night with a report showing how its researchers could jailbreak and access portions of Anthropic's new Mythos-class capability. Axios said concerns about Chinese access to Mythos and a call from Amazon CEO Andy Jassy helped send the administration into a scramble after Anthropic released Fable 5 on June 9. Amazon is not a neutral bystander: its cloud arm had announced Fable 5 availability on Amazon Bedrock the same day Anthropic launched the model, and Amazon is a major Anthropic backer and distribution partner.

That makes the episode more complicated than a standard regulator-versus-startup fight. Amazon Web Services had marketed Fable 5 as making Mythos-level capabilities available to customers with safeguards, including fallbacks to Opus 4.8 for harmful prompts in cybersecurity, biology, chemistry, and health. It also said Anthropic required 30-day retention and human review for Fable 5 traffic on Bedrock because Anthropic wanted visibility into misuse patterns that a single exchange would not reveal.

Anthropic had built Fable 5 precisely as a compromise: a general-use version of Mythos with classifiers that route higher-risk requests away from the top model. Mythos 5, by contrast, was to remain available only to a smaller group of vetted customers. At launch, Anthropic said Fable 5 and Mythos 5 were priced at $10 per million input tokens and $50 per million output tokens, and that Mythos 5 would initially run through Project Glasswing in collaboration with the U.S. government.

That is the contradiction security leaders are now pointing at. If a model that helps identify software flaws is restricted because it can produce proofs of concept, the restriction hits both sides of the cyber equation. Attackers can use the same bug-finding ability. So can the teams patching the code.

The security community's complaint

The open letter is not a defense of unconstrained AI cyber capability. It states plainly that AI is reducing the difficulty of finding software flaws and writing exploits, and that Anthropic's Mythos-class models are good at both. Its central claim is narrower and more damaging to the government's rationale: the models are not uniquely good at those tasks, and many signers already use other foundation and open-source models for security audits and red-teaming.

The letter says the underlying research that triggered the action was focused on whether a human-prompted section of code was insecure. That is the core task of secure coding, not an edge-case offensive feature. The signers also argue that similar capability can be replicated on other models, including OpenAI, Anthropic's own lower-tier Claude models, and Chinese models such as Kimi 2.7.

Katie Moussouris, CEO of Luta Security, sharpened the point in a blog post after reviewing what she described as the relevant paper. She wrote that the prompt sequence was not mass autonomous exploitation, but a multistep defensive workflow that began with a blocked request to review code for security issues and moved to a request to fix code. Moussouris is not a casual critic of export controls: she helped create Microsoft's and the Pentagon's first bug bounty programs, co-authored vulnerability disclosure standards, and served on the Commerce Department's Information Systems Technical Advisory Committee.

Her analogy is Wassenaar, the international export-control arrangement whose 2013 intrusion-software language triggered a years-long fight because it swept defensive vulnerability disclosure and incident response into controls designed for offensive tooling. Her warning is that the same category mistake is now being repeated with AI models: a broad control meant to slow adversaries can end up delaying defenders.

A voluntary AI framework now looks less voluntary

The administration has a policy framework for this, at least on paper. On June 2, President Donald Trump signed an executive order on advanced AI innovation and security. The order directs agencies to establish a voluntary framework for early government engagement with developers of covered frontier models, create classified benchmarks for advanced cyber capabilities, and form an AI cybersecurity clearinghouse to coordinate vulnerability scanning, validation, remediation, and patch distribution with industry and critical infrastructure operators.

The Fable 5 directive is now testing the credibility of that framework before it is fully built. The White House fact sheet described the approach as voluntary and designed to protect innovation from unnecessary regulation. But the June 12 directive used export-control authority to force an immediate shutdown of Anthropic's highest-profile new models, without the kind of transparent technical process the open letter is demanding.

That is why this dispute has moved beyond Anthropic's product roadmap. A model recall based on a narrow jailbreak finding creates a rule that no regulator has written down: if a frontier model is meaningfully useful for vulnerability discovery, it may be too risky to release broadly, even if the same capability exists elsewhere and even if defenders rely on it.

As RuntimeWire reported June 13, Anthropic's position is that the jailbreak behind the shutdown amounted to code review, not a universal bypass that unlocked the full model. That followed the company's June 9 launch, when RuntimeWire noted that Anthropic was trying to split the difference between a broad Fable 5 release and a gated Mythos 5 cyber-access program. The split lasted less than a week.

The next decision is about incentives

The administration's incentive is straightforward: it does not want a U.S. lab to hand powerful cyber capabilities to foreign adversaries while government systems remain exposed. The security community's counter is also straightforward: adversaries are not waiting for U.S. permission, and Chinese open-weight models are advancing quickly enough that restricting American tools may asymmetrically hurt American defenders.

The sharper risk is not that Anthropic loses a few days of model availability. Fable 5 was new enough that most enterprises probably had not hard-wired it into production systems. The deeper risk is that AI labs respond rationally to the government's signal. If a model can find bugs, generate patches, and produce exploit proofs of concept, a lab may strip out or blunt those capabilities before release to avoid regulatory punishment. That would make model launches easier to defend in Washington and less useful inside security teams.

Amodei has asked for a world in which the government can stop unsafe AI deployments through a clear process. The Fable 5 fight is showing what happens when the stop button arrives before the process does.

Reader comments

Conversation for this story loads after sign-in.