Anthropic Releases Claude Fable 5, Its Most Powerful AI Yet, With Cyber Safeguards

Anthropic Releases Claude Fable 5, Its Most Powerful AI Yet, With Cyber Safeguards

The Hacker News general

Key Points:

  • Anthropic has released Claude Fable 5, its most capable AI model, as a public product with embedded safety classifiers, while its twin model, Claude Mythos 5, which has cyber capabilities unrestricted, remains limited to vetted cybersecurity professionals and critical infrastructure operators.
  • Fable 5 uses classifiers to detect and redirect potentially harmful cyber, biology, chemistry, and distillation requests to a less capable model, Claude Opus 4.8, to prevent misuse, whereas Mythos 5 retains full cyber capabilities for trusted users.
  • The safety classifiers effectively block offensive cyber tasks such as exploit development and reconnaissance, with fallback occurring in less than 5% of sessions, though some false positives remain as Anthropic plans to refine the system post-launch.
  • Anthropic's red team demonstrated that Mythos Preview, a predecessor to Mythos 5, autonomously identified and exploited zero-day vulnerabilities across major operating systems and browsers, highlighting the dual-use risk of advanced AI models in cybersecurity.
  • The rapid discovery of thousands of high-severity vulnerabilities by Anthropic and partners underscores a shift where patching and verification lag behind bug discovery, increasing the urgency for automated updates and robust defense measures, while Anthropic introduces a 30-day data retention policy to monitor for novel attacks.

Trending Business

Trending Technology

Trending Health