Anthropic says the jailbreak behind Fable 5 shutdown was code review

The Amodeis' safety-first AI company is now fighting Washington over whether a narrow coding prompt justifies pulling frontier models.


Why it matters

The Fable 5 fight tests whether AI labs can invite government oversight without giving officials a fast path to pull commercial models on thin technical evidence.


Anthropic disabled Claude Fable 5 and Claude Mythos 5 for all customers after a U.S. export-control directive, then disputed the technical basis for the shutdown. The company said the government's evidence was, so far, only verbal and centered on a prompt that asked the model to read a codebase and fix flaws.

For Dario Amodei, Anthropic's co-founder and CEO, the confrontation lands on the fault line running through the company he and Daniela Amodei built: it sells the most capable AI systems it can, argues that they require state-level oversight, and still tries to preserve enough due process to keep the government from turning safety claims into ad hoc product recalls. Anthropic said in a June 12 statement that it received the directive at 5:21 p.m. ET on Friday and that the order required suspension of all Fable 5 and Mythos 5 access by any foreign national, including foreign nationals inside the United States and foreign national Anthropic employees.

The company said the order forced a global shutdown because selective compliance was not workable across its customer base. Access to other Anthropic models was not affected, according to the statement.

The fight is not only about one alleged jailbreak. It is about whether frontier AI systems should be regulated like export-controlled strategic assets once they cross a certain threshold in cyber capability, and who gets to define that threshold when the evidence is not public.

The prompt at the center of the dispute

Anthropic's account of the alleged bypass is narrower than the government's action suggests. The company said the government did not provide specific details of its national security concern, but that Anthropic understood officials believed they had become aware of a method for bypassing Fable 5's safeguards. Anthropic said it reviewed a demonstration of what it believed was the technique and found that it identified a small number of previously known, minor vulnerabilities.

"To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws," Anthropic wrote in its statement.

The passage was highlighted Saturday by Aligned News in a post on X, capturing the sharpest mismatch in the competing narratives: Washington treated the episode as a national security trigger, while Anthropic characterized it as defensive code review capability already common across frontier models.

Anthropic said it found that other publicly available models could discover the same issues without requiring a bypass, and specifically pointed to OpenAI's GPT-5.5. That is a self-serving comparison, but it goes to the core of the company's argument: if the standard for pulling a frontier model is that it can help identify software vulnerabilities in a particular codebase, the standard may cover much of the commercial AI coding market.

The government's move followed Anthropic's June 9 launch of Claude Fable 5 and Claude Mythos 5. Fable 5 was the generally available version, built on the same underlying model as Mythos 5 but with safeguards for cybersecurity and biology. Mythos 5 was reserved for a narrower group of vetted cyberdefenders and infrastructure providers through Project Glasswing, with some safeguards lifted.

Anthropic priced both models at $10 per million input tokens and $50 per million output tokens at launch, and said Fable 5 could be used through the Claude API. That commercial rollout lasted only days before the government order.

Washington turned a safety claim into leverage

Axios reported that Commerce Secretary Howard Lutnick sent the directive to Amodei, requiring licenses for export, re-export, or domestic transfer of Mythos 5 and Fable 5, and that an administration official said Commerce acted after another company claimed it had jailbroken Mythos. Axios also reported that the administration had tried to get Anthropic to pause the latest model releases before the directive.

That sequence matters because Anthropic has spent months arguing that Mythos-class models are both useful and dangerous. On its Mythos page, the company describes Mythos 5 as its most capable model for cybersecurity and biology research, says it is available only to a small set of customers through trusted access programs, and notes that Fable 5 is the same underlying model with safeguards that route risky cyber and biology queries away from the frontier system.

That architecture gave Anthropic a product story: make the strongest model useful for coding, knowledge work, agents, and research while cordoning off higher-risk cyber and biology work. It also gave regulators a target. Once Anthropic described the model family as powerful enough to require special handling, the policy question shifted from whether the model was capable to whether the company's own gatekeeping was sufficient.

RuntimeWire reported on June 9 that Anthropic's Fable 5 launch put a gated Mythos 5 behind trusted access for cyber use. Two days later, RuntimeWire reported on Amodei's policy proposal, which argued that frontier model releases should face testing, audits, and possible deployment holds. The June 12 directive is the harder version of that bargain: a government block, but not the transparent, technically grounded process Anthropic says it wants.

Anthropic said exactly that in its statement. The company said it supports the government's ability to block unsafe deployments as part of a statutory process that is "transparent, fair, clear, and grounded in technical facts," but said the Fable 5 and Mythos 5 action did not meet those principles.

The commercial stakes are immediate

The shutdown also undercuts a product push that had been aimed squarely at enterprise coding and long-running agent work. In Anthropic's launch materials, Fable 5 was positioned as a Mythos-class model for demanding reasoning and agentic tasks, while Claude Code brings an AI agent to the terminal that can understand a codebase, perform routine engineering tasks, build features, and handle Git workflows. Those are the same kinds of workflows that make codebase-level vulnerability analysis commercially valuable.

That is why the alleged jailbreak matters beyond safety policy. If asking a model to inspect code and fix flaws can be construed as evidence of a dangerous bypass, the boundary between ordinary software engineering assistance and restricted cyber capability becomes unstable. Anthropic's response tries to draw that line at intent, scope, and output: a narrow prompt that surfaces known, minor bugs is not the same thing as a universal jailbreak that broadly unlocks offensive cyber capability.

The Associated Press reported that the export controls mark the U.S. government's most significant step to date to restrict access to the most advanced AI models. Nextgov/FCW reported that the order escalates a broader fight over how Washington should control frontier AI tools with powerful cybersecurity capabilities.

Anthropic is now in the position of arguing that the government should be able to stop unsafe deployments, but not this way. That is a narrow legal and policy argument, and it is also a founder-level business risk. The Amodeis built Anthropic's brand on safety as a reason customers and governments should trust Claude. The Fable 5 order shows the other side of that strategy: once a company tells Washington its model class is a strategic risk, Washington may act before the company agrees the technical case has been made.
