Popular LLMs dangerously vulnerable to iterative attacks, says Cisco

One of the most concerning findings from a recent research paper published by Cisco is the vulnerability of popular open-weight generative AI (GenAI) services to multi-turn prompt injection cyber attacks. These attacks can manipulate large language models (LLMs) into producing unintended responses.

The research tested several widely used models, including Alibaba Qwen3-32B, Mistral Large-2, Meta Llama 3.3-70B-Instruct, and Google Gemma-3-1B-1T. The success rates of these attacks varied, with Mistral showing the highest susceptibility at 92.78%.

The authors of the report, Amy Chang, Nicholas Conley, Harish Santhanalakshmi Ganesan, and Adam Swanda, highlighted the significant increase in vulnerability of these models to multi-turn attacks compared to single-turn scenarios.

They emphasized the importance of addressing these vulnerabilities to ensure the safe deployment of open-weight LLMs in enterprise and public settings. They also noted the influence of alignment strategies and design priorities on the resilience of these models.

Popular LLMs dangerously vulnerable to iterative attacks, says Cisco

Leave a Reply Cancel reply

Quick Links

Resources

Support

What is a multi-turn attack?

Whose responsibility?

Leave a Reply Cancel reply

Quick Links

Resources

Support