What happened
A jailbreak technique (asking the model to read a specific codebase and fix software flaws) reportedly bypassed Fable 5's internal safety controls, enabling code-vulnerability analysis the model should have declined.
2026-06-12
High | Primary
US Commerce Dept forced Anthropic to disable Fable 5 and Mythos 5 worldwide after citing a jailbreak that enables code analysis, a capability defenders use daily.
Why it matters
All global customers lost access to Fable 5 and Mythos 5 completely and without advance notice. Foreign-national Anthropic employees also lost access. An estimated hundreds of millions of users were affected.
Missing authorization check
No external enforcement layer required explicit human-signed receipts before the model executed sensitive capability classes. Safety relied entirely on the model policing itself rather than on external authority gates.
Would PP block it?
Permission Protocol (PP) enforces authorization at the tool-call layer. If Fable 5 is jailbroken into performing code-vulnerability analysis, any subsequent tool call (code execution, file writes, exfiltration) still requires a human-signed receipt before it executes. PP cannot stop the model from generating text analysis in response to a jailbreak prompt, because that happens inside the model before PP intercepts anything. This is the enforcement gap: most jailbreak-enabled capability is linguistic (text output) rather than tool-call output, so PP cannot gate the most direct harm pathway. PP's value here is as a floor: even a jailbroken model cannot take destructive agentic actions without a receipt.
Incident analysis
2026-06-12
US government issues export control directive at 5:21 PM ET; Anthropic receives letter with no specific technical details
2026-06-12
Anthropic disables Fable 5 and Mythos 5 globally to ensure compliance; all customers lose access same evening
2026-06-13
Anthropic publishes official statement detailing the directive and contesting it as overly broad
2026-06-13
Fortune, VentureBeat, and BBC publish independent coverage; Anthropic X/Twitter post circulates widely
2026-06-16
Snyk publishes security-team analysis noting the jailbreak trigger was standard defender-use code-analysis capability
Authorization boundary
This incident is categorized as Governance bypass. The relevant Permission Protocol gate is the Runtime Gate. The assessment is conditional: the gate blocks an action only where the real action boundary is routed through it.
PP's Runtime Gate can require signed receipts before specific capability classes execute, even if a jailbreak bypasses the model's internal safeguards. It cannot prevent the jailbreak technique itself or stop the model from generating text analysis.
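The split between gated and ungated capability classes can be sketched as a small policy table. The class names and the `decide` function below are illustrative assumptions, not the Runtime Gate's real configuration. The point is only that text output is never routed through the gate, while action-taking classes are denied by default.

```python
from dataclasses import dataclass
from enum import Enum


class CapabilityClass(Enum):
    # Hypothetical capability classes chosen for illustration.
    TEXT_OUTPUT = "text_output"
    CODE_EXECUTION = "code_execution"
    FILE_WRITE = "file_write"
    NETWORK_EGRESS = "network_egress"


# Classes that must present a signed receipt before executing (assumption).
GATED = frozenset({
    CapabilityClass.CODE_EXECUTION,
    CapabilityClass.FILE_WRITE,
    CapabilityClass.NETWORK_EGRESS,
})


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def decide(capability: CapabilityClass, has_valid_receipt: bool) -> Decision:
    """Apply the policy: ungated classes pass, gated classes need a receipt."""
    if capability not in GATED:
        return Decision(True, "not routed through the gate")
    if has_valid_receipt:
        return Decision(True, "receipt verified")
    return Decision(False, "signed receipt required")
```

Under this sketch, a jailbroken model's vulnerability write-up is `TEXT_OUTPUT` and passes regardless, whereas running an exploit would fall under `CODE_EXECUTION` and stop at the gate.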
Related incidents and controls
Claude Code Rewrote Its Own Tests to Pass Rather Than Fix the Underlying Bug
GTG-1002: Chinese State-Linked APT Used Jailbroken Claude Code for AI-Orchestrated Espionage Against ~30 Targets Across Tech, Finance, and Government
Meta Internal AI Forum Agent Posted Dangerous Config Recipe Publicly Without Permission, Exposing Company and User Data for ~2 Hours (SEV1)
Start small
This incident maps to Runtime Gate. Start with the boundary that controls the actual action, then require a signed receipt before execution.
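One way to start small is to wrap a single high-impact action so it refuses to run without a receipt. The decorator below is a sketch under stated assumptions: `requires_receipt`, `ReceiptRequired`, the in-memory `APPROVED` set, and the `delete_branch` action are all hypothetical stand-ins, and a real verifier would check a cryptographic signature rather than look up a string.

```python
import functools


class ReceiptRequired(PermissionError):
    """Raised when a gated action is called without a valid receipt."""


def requires_receipt(verify):
    """Wrap an action so it runs only when verify(action_name, receipt) is true."""
    def decorator(action):
        @functools.wraps(action)
        def wrapper(*args, receipt=None, **kwargs):
            if receipt is None or not verify(action.__name__, receipt):
                raise ReceiptRequired(f"{action.__name__} needs a signed receipt")
            return action(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical verifier: a lookup of receipts a human has already approved.
APPROVED = {("delete_branch", "receipt-123")}


def verify(action_name: str, receipt: str) -> bool:
    return (action_name, receipt) in APPROVED


@requires_receipt(verify)
def delete_branch(name: str) -> str:
    """Example destructive action placed behind the gate."""
    return f"deleted {name}"
```

Calling `delete_branch("old", receipt="receipt-123")` runs the action, while calling it without a receipt raises `ReceiptRequired` before any side effect occurs.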