PERMISSION/PROTOCOL

2026-06-12

High · Primary

US export control directive forces Anthropic to shut down Fable 5 and Mythos 5 for all users worldwide after government cites code-analysis jailbreak as national security threat

US Commerce Dept forced Anthropic to disable Fable 5 and Mythos 5 worldwide after citing a jailbreak that enables code-analysis — a capability used by defenders daily.

Anthropic Claude Fable 5 / Mythos 5
Governance bypass
Government-mandated model suspension / jailbreak-triggered shutdown
Claude.ai, Anthropic API: all customer integrations relying on Fable 5 or Mythos 5

What happened

A jailbreak technique (asking the model to read a specific codebase and fix software flaws) reportedly bypassed Fable 5's internal safety controls, enabling code-vulnerability analysis that the model should have declined.

Why it matters

All global customers lost access to Fable 5 and Mythos 5 with no advance notice. Anthropic employees who are foreign nationals also lost access. An estimated hundreds of millions of users were affected.

Missing authorization check

No external enforcement layer required explicit human-signed receipts before the model executed sensitive capability classes. Safety relied entirely on model-internal self-policing rather than on external authority gates.

Would PP block it?

PP enforces authorization at the tool-call layer: if Fable 5 is jailbroken into performing code vulnerability analysis, any subsequent tool calls (code execution, file writes, exfiltration) still require a human-signed receipt before executing. PP cannot block the model from generating text analysis in response to a jailbreak prompt — that happens inside the model before PP intercepts. The enforcement gap: most jailbreak-enabled capability is linguistic (text output), not tool-call output, so PP cannot gate the most direct harm pathway. PP's value here is a floor: even a jailbroken model cannot take destructive agentic actions without a receipt.
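The tool-call gating described above can be illustrated with a minimal sketch. It is not Permission Protocol's actual API: the names `SIGNER_KEY`, `sign_receipt`, and `gated_call` are hypothetical, and an HMAC over the tool name and arguments stands in for whatever signature scheme a real deployment would use (more likely an asymmetric signature held by the human signer). The point it shows is that the receipt is checked outside the model, so a jailbroken model still cannot execute a tool call it has no valid receipt for.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical secret held by the human signer (assumption for this sketch).
SIGNER_KEY = b"demo-signer-key"


def action_bytes(tool: str, args: dict) -> bytes:
    """Canonical byte encoding of one proposed tool call."""
    return json.dumps({"tool": tool, "args": args}, sort_keys=True).encode()


def sign_receipt(tool: str, args: dict, key: bytes = SIGNER_KEY) -> str:
    """What the signer produces after reviewing this exact call."""
    return hmac.new(key, action_bytes(tool, args), hashlib.sha256).hexdigest()


def gated_call(tool: str, args: dict, receipt: Optional[str],
               key: bytes = SIGNER_KEY) -> str:
    """Run the tool only if the receipt matches this tool and these arguments."""
    if receipt is None:
        raise PermissionError(f"no receipt for {tool}")
    expected = sign_receipt(tool, args, key)
    # Constant-time comparison avoids leaking how much of the receipt matched.
    if not hmac.compare_digest(expected, receipt):
        raise PermissionError(f"invalid receipt for {tool}")
    return f"executed {tool}"  # stand-in for the real tool implementation
```

Because the receipt covers both the tool name and its arguments, a receipt issued for one call cannot be reused for a different one in this sketch. Text the model generates never passes through `gated_call`, which mirrors the enforcement gap noted above.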

Incident analysis

Timeline and technical read

Timeline

  1. 2026-06-12: US government issues export control directive at 5:21 PM ET; Anthropic receives letter with no specific technical details

  2. 2026-06-12: Anthropic disables Fable 5 and Mythos 5 globally to ensure compliance; all customers lose access same evening

  3. 2026-06-13: Anthropic publishes official statement detailing the directive and contesting it as overly broad

  4. 2026-06-13: Fortune, VentureBeat, and BBC publish independent coverage; Anthropic X/Twitter post circulates widely

  5. 2026-06-16: Snyk publishes security-team analysis noting the jailbreak trigger was standard defender-use code-analysis capability

Technical breakdown

  • The reported jailbreak used a natural language prompt asking the model to read a specific codebase and identify/fix software flaws — a capability identical to standard AI-assisted code review used by security defenders daily.
  • Anthropic's safety architecture used defense-in-depth: narrow jailbreaks were expected and monitored; universal jailbreaks (bypassing all safeguards broadly) were the actual red line. The reported technique was non-universal and narrow.
  • Fable 5's safety relied solely on model-internal safeguards. No external enforcement layer existed to gate sensitive capability classes at runtime independent of the model's own self-policing.
  • Global shutdown was the only compliance option on same-day notice — no system existed to distinguish foreign nationals from US persons across hundreds of millions of active accounts in real time.
  • Anthropic's 30-day data retention policy for Fable/Mythos was designed for post-hoc jailbreak monitoring, demonstrating the intended security posture accepted that some jailbreaks would succeed and relied on detection rather than prevention.

Authorization boundary

Where the authorization boundary should have been

This incident is categorized as Governance bypass. The relevant Permission Protocol gate is Runtime Gate. The read is conditional: the block only applies where the real action boundary is routed through a gate.

If enforced at: Tool-call layer (before agentic action execution)
Still needs: Model-level output governance; internal safety bypass detection; linguistic output restriction independent of tool-call gates
Receipt required for: Any tool call triggered after model outputs that appear to result from jailbroken reasoning, including code execution, file modification, and external API calls

PP's Runtime Gate can require signed receipts before specific capability classes execute — even if the model's internal safeguards are bypassed by a jailbreak — but cannot prevent the jailbreak technique itself or the model from generating text analysis.
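One way to picture gating by capability class is a small policy table consulted before each action. The sketch below is illustrative only: the tool names, the `CapabilityClass` enum, and the `decide` function are hypothetical rather than part of Permission Protocol, and failing closed on unknown tools is an assumption made here.

```python
from enum import Enum
from typing import Optional


class CapabilityClass(Enum):
    CODE_EXECUTION = "code_execution"
    FILE_MODIFICATION = "file_modification"
    EXTERNAL_API = "external_api"


# Hypothetical mapping from tool names to capability classes.
TOOL_CLASSES = {
    "run_code": CapabilityClass.CODE_EXECUTION,
    "write_file": CapabilityClass.FILE_MODIFICATION,
    "http_request": CapabilityClass.EXTERNAL_API,
}

# Classes that need a signed receipt before execution.
RECEIPT_REQUIRED = {
    CapabilityClass.CODE_EXECUTION,
    CapabilityClass.FILE_MODIFICATION,
    CapabilityClass.EXTERNAL_API,
}


def decide(event_kind: str, tool: Optional[str] = None,
           has_valid_receipt: bool = False) -> str:
    """Return the gate's decision for one model-emitted event."""
    if event_kind == "text":
        # Plain text output never reaches the gate; it cannot be blocked here.
        return "not_gated"
    capability = TOOL_CLASSES.get(tool)
    if capability is None:
        return "deny"  # assumption: unknown tools fail closed
    if capability in RECEIPT_REQUIRED and not has_valid_receipt:
        return "needs_receipt"
    return "allow"
```

The `"not_gated"` branch makes the limitation explicit: analysis returned as text bypasses this policy entirely, while every listed tool call waits for a receipt.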

Start small

Put the relevant gate at this action boundary.

This incident maps to Runtime Gate. Start with the boundary that controls the actual action, then require a signed receipt before execution.
