AI chatbots can be tricked with poetry to ignore their safety guardrails

Engadget | Technology

Key Points:

  • Researchers from Icaro Lab demonstrated that phrasing prompts as poetry can bypass safety guardrails in large language models (LLMs), enabling the generation of prohibited content.
  • The study, titled "Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models," reported a 62% success rate in eliciting restricted material, including content related to nuclear weapons, child sexual abuse, and self-harm.
  • The research tested multiple popular LLMs, including OpenAI's GPT models, Google Gemini, and Anthropic's Claude, revealing varying levels of vulnerability among them.
  • Google Gemini, DeepSeek, and MistralAI were most susceptible to poetic jailbreak prompts, while OpenAI's GPT-