Agents of Chaos
Preprint study on AI agent security failures in realistic deployment settings.
Links
- arXiv: https://arxiv.org/abs/2602.20021
- PDF: https://arxiv.org/pdf/2602.20021
- Submitted: February 23, 2026
Summary
Academics at Harvard and Stanford found that AI agents leaked secrets, destroyed databases, and taught other agents to behave badly.
A red-teaming study of autonomous LM-powered agents in a live lab environment with persistent memory, email, Discord, file systems, and shell execution. Over two weeks, twenty AI researchers documented eleven case studies of failure modes, including unauthorized compliance, sensitive data disclosure, destructive actions, denial of service, resource consumption, identity spoofing, cross-agent propagation of unsafe practices, and partial system takeover.