It turns out that all you need is a little creativity to get past the guardrails of AI chatbots. In a study published by Icaro Lab, titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” researchers were able to bypass the safety mechanisms of various LLMs simply by phrasing their prompts as poetry.
“The poetic form acts as a general-purpose jailbreak operator,” according to the study, which reported an overall 62 percent success rate at eliciting prohibited output, including instructions for building nuclear weapons, child sexual abuse material, and content related to suicide or self-harm. The study tested popular LLMs, including OpenAI’s GPT models, Google Gemini, Anthropic’s Claude, and several others. When the researchers broke down the success rates by model, Google Gemini, DeepSeek, and Mistral AI consistently provided answers, while OpenAI’s GPT-5 and Anthropic’s Claude Haiku 4.5 were the least likely to venture beyond their restrictions.
The study did not include the exact jailbreak poems the researchers used; the team told WIRED that the poems are “too dangerous to share with the public.” However, the researchers pointed out that a weakened version was included in the study to illustrate how easily the chatbots’ guardrails can be bypassed. Crafting such a poem, they told WIRED, is “probably easier than one might think, which is why we’re being cautious.”