Anthropic Releases Claude Fable 5, Its Most Powerful AI Yet, With Cyber Safeguards

The Hacker News • June 10, 2026 • general

Key Points:

Anthropic has released Claude Fable 5, its most capable AI model, as a public product with embedded safety classifiers. Its twin model, Claude Mythos 5, whose cyber capabilities are unrestricted, remains limited to vetted cybersecurity professionals and critical infrastructure operators.
Fable 5 uses classifiers to detect potentially harmful cyber, biology, chemistry, and distillation requests and redirect them to a less capable model, Claude Opus 4.8, to prevent misuse, whereas Mythos 5 retains full cyber capabilities for trusted users.
The safety classifiers effectively block offensive cyber tasks such as exploit development and reconnaissance, with fallback occurring in fewer than 5% of sessions. Some false positives remain, and Anthropic plans to refine the system post-launch.
Anthropic's red team demonstrated that Mythos Preview, a predecessor to Mythos 5, autonomously identified and exploited zero-day vulnerabilities across major operating systems and browsers, highlighting the dual-use risk of advanced AI models in cybersecurity.
The rapid discovery of thousands of high-severity vulnerabilities by Anthropic and its partners underscores a shift in which patching and verification lag behind bug discovery, increasing the urgency of automated updates and robust defense measures. Anthropic is also introducing a 30-day data retention policy to monitor for novel attacks.
