AI agent and LLM misuse drives new attack and governance risks
Reporting highlighted how LLMs and autonomous AI agents are being misused or creating new enterprise risk. Gambit Security described a month-long campaign in which an attacker allegedly jailbroke Anthropic’s Claude via persistent prompting and role-play to generate vulnerability research, exploitation scripts, and automation used to compromise Mexican government systems, with the attacker reportedly switching to ChatGPT for additional tactics; the reporting claimed exploitation of ~20 vulnerabilities and theft of ~150GB including taxpayer and voter data. Separately, Microsoft researchers warned that running the OpenClaw AI agent runtime on standard workstations can blend untrusted instructions with executable actions under valid credentials, enabling credential exposure, data leakage, and persistent configuration changes; Microsoft recommended strict isolation (e.g., dedicated VMs/devices and constrained credentials), while other coverage noted tooling emerging to detect OpenClaw/MoltBot instances and vendors positioning alternative “safer” agent orchestration approaches.
Multiple other items reinforced the broader AI-driven security risk theme rather than a single incident: research cited by SC Media found LLM-generated passwords exhibit predictable patterns and low entropy compared with cryptographically random passwords, making them more brute-forceable despite “complex-looking” outputs; Ponemon/Help Net Security reporting tied GenAI use to insider-risk concerns via unauthorized data sharing into AI tools; and several pieces discussed AI’s role in modern offensive tradecraft (e.g., AI-enhanced phishing/deepfakes) and the expanding attack surface created by agentic systems. Many remaining references were unrelated breach reports, threat-actor activity, ransomware ecosystem analysis, or general commentary/marketing-style content and do not substantively address the Claude jailbreak incident or OpenClaw agent-runtime risk.

Get ahead of threats like this
Mallory correlates global threat intelligence with your attack surface — know if you’re exposed before adversaries strike.
How this story unfolded
8 events from the most recent confirmed update back to the earliest known activity.
Ponemon report quantifies insider-risk costs and flags generative AI exposure
The 2026 Cost of Insider Risks Global Report estimated average annual insider-related losses at $19.5 million across surveyed organizations. The report also warned that employee use of public generative AI platforms is creating new data-exfiltration and visibility gaps for defenders.
OpenClaw Scanner is highlighted as a tool to find unmanaged deployments
A February 2026 open-source security tools roundup highlighted OpenClaw Scanner, a free tool designed to detect deployments of the OpenClaw autonomous AI assistant in corporate environments without centralized oversight. Its inclusion reflects growing defensive interest in identifying unsanctioned autonomous agent use.
Microsoft warns OpenClaw is unsafe on standard workstations
Microsoft security researchers warned that running OpenClaw on normal personal or enterprise workstations creates major risks because it combines untrusted instructions with executable actions under valid user credentials. They recommended isolating any testing in dedicated virtual machines or separate devices with limited credentials.
Anthropic bans accounts and adds Claude misuse probes after investigation
After investigating the alleged abuse campaign, Anthropic said it banned the involved accounts and added real-time misuse probes to Claude Opus 4.6. OpenAI separately said ChatGPT rejected policy-violating prompts when the attacker later switched tools.
Perplexity announces Computer multiagent system with sandboxing
Perplexity announced Computer, a multiagent orchestration product positioned as a safer alternative to always-on autonomous agents. The company said it runs in a secure development sandbox, is available first to Max users, and will expand to Enterprise and Pro users in the following weeks.
Research finds LLM-generated passwords are predictably weak
Research by Irregular found that passwords generated by systems such as ChatGPT and Gemini often contain repeated patterns and duplicates, making them far more predictable than truly random passwords. The study estimated only about 20–27 bits of entropy for AI-generated passwords versus roughly 98–120 bits for cryptographically random ones.
AI-assisted campaign allegedly targets Mexican government agencies
Beginning in December 2025, an unidentified attacker allegedly used Anthropic's Claude to identify vulnerabilities, generate exploit code, and support intrusions against Mexican government agencies. Gambit Security said the activity continued into early January 2026, exploited at least 20 vulnerabilities, and allegedly led to theft of about 150GB of data.
Anthropic blocks browser-use extension from banking and finance sites
At Zenity's AI Agent Security Summit, speakers cited Anthropic's decision to prevent its browser-use extension from accessing banking and financial websites as a mitigation against agent abuse. The move reflected growing concern that AI agents with broad tool access can be misused for high-risk actions.
Related entities
Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.
Sources
10 references tracked. Mallory keeps watching after this page renders.
AI-Native Security Is a Must to Counter AI-Based Attacks
darkreading.com
Open sourceHacker Jailbreakes Claude AI to Write Exploit Code and Steal Government Data
cybersecuritynews.com
Open sourceMicrosoft warns of OpenClaw risks on standard workstations | SC Media
scworld.com
Open sourceHottest cybersecurity open-source tools of the month: February 2026 - Help Net Security
helpnetsecurity.com
Open sourceIs Perplexity's new Computer a safer version of OpenClaw? How it works | ZDNET
zdnet.com
Open sourceEp. 47 - APT42 & Iran's AI Social Engineering: Deepfakes, Phishing & Hack-and-Leak | SecuritySenses
securitysenses.com
Open sourceAI-generated passwords pose security risks due to predictable patterns | SC Media
scworld.com
Open sourceZenity AI Agent Security Summit focuses on risk mitigation • The Register
theregister.com
Open sourceSee the full picture, correlated to your attack surface.
Map indicators from this story to your assets and identify affected systems in minutes.
Every observed campaign, victim, and pivot linked to actors named in this story.
Malware, exploits, and IOCs connected to the activity described here.
YARA, Sigma, and Snort rules deployed to your SIEM as soon as they’re published.
Get matching new stories delivered to your team as they break — not the next morning.
Ask questions about this story and take action on the answers.


