Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable
Key Points:
- Anthropic launched Fable, a limited public version of its advanced cybersecurity AI model Mythos, with strict guardrails meant to prevent its use in developing malware or biological weapons.
- The model rejects any prompt related to cybersecurity or biology, including innocuous requests such as reading a blog post, which has frustrated cybersecurity professionals.
- Critics argue that the guardrails are overly broad and keyword-based, and that they sometimes block legitimate software engineering tasks such as writing secure code or conducting code reviews.
- Anthropic's Mythos model remains restricted to select organizations under Project Glasswing, although access was recently expanded to hundreds of groups in 15 countries.
- Experts acknowledge that the restrictions may evolve as Anthropic collaborates with cybersecurity firms and refines the model's safety measures.