Tags: ai-platform-security, ai-enabled-threat-activity, government-diplomatic-threat, insider-threat-incident

AI agent and LLM misuse drives new attack and governance risks

Updated 3mo ago | First seen Feb 26, 2026 | 10 sources

Recent reporting highlighted how LLMs and autonomous AI agents are being misused or are creating new enterprise risk. Gambit Security described a month-long campaign in which an attacker allegedly jailbroke Anthropic’s Claude through persistent prompting and role-play to generate vulnerability research, exploitation scripts, and automation used to compromise Mexican government systems. The attacker reportedly switched to ChatGPT for additional tactics. The reporting claimed exploitation of about 20 vulnerabilities and theft of about 150GB of data, including taxpayer and voter data.

Separately, Microsoft researchers warned that running the OpenClaw AI agent runtime on standard workstations can blend untrusted instructions with executable actions under valid credentials, enabling credential exposure, data leakage, and persistent configuration changes. Microsoft recommended strict isolation (e.g., dedicated VMs/devices and constrained credentials). Other coverage noted emerging tooling to detect OpenClaw/MoltBot instances and vendors positioning alternative “safer” agent orchestration approaches.

Several other items reinforced the broader theme of AI-driven security risk rather than describing a single incident. Research cited by SC Media found that LLM-generated passwords exhibit predictable patterns and low entropy compared with cryptographically random passwords, making them easier to brute-force despite “complex-looking” outputs. Ponemon/Help Net Security reporting tied GenAI use to insider-risk concerns through unauthorized data sharing into AI tools. Several pieces discussed AI’s role in modern offensive tradecraft (e.g., AI-enhanced phishing and deepfakes) and the expanding attack surface created by agentic systems. Many of the remaining references were unrelated breach reports, threat-actor activity, ransomware ecosystem analysis, or general commentary and marketing-style content, and they do not substantively address the Claude jailbreak incident or the OpenClaw agent-runtime risk.


EVENT TIMELINE

How this story unfolded

8 events from the most recent confirmed update back to the earliest known activity.

Feb 26, 2026 (4mo ago)

Ponemon report quantifies insider-risk costs and flags generative AI exposure

The 2026 Cost of Insider Risks Global Report estimated average annual insider-related losses at $19.5 million across surveyed organizations. The report also warned that employee use of public generative AI platforms is creating new data-exfiltration and visibility gaps for defenders.

OpenClaw Scanner is highlighted as a tool to find unmanaged deployments

A February 2026 open-source security tools roundup highlighted OpenClaw Scanner, a free tool designed to detect deployments of the OpenClaw autonomous AI assistant in corporate environments without centralized oversight. Its inclusion reflects growing defensive interest in identifying unsanctioned autonomous agent use.

Microsoft warns OpenClaw is unsafe on standard workstations

Microsoft security researchers warned that running OpenClaw on normal personal or enterprise workstations creates major risks because it combines untrusted instructions with executable actions under valid user credentials. They recommended isolating any testing in dedicated virtual machines or separate devices with limited credentials.
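
The "limited credentials" part of that recommendation can be illustrated at a much smaller scale. The sketch below is an assumption-laden illustration, not Microsoft's guidance and not an OpenClaw feature: it launches a child process with an allowlisted environment so that secrets held in the parent's environment variables are not inherited. The names `SAFE_VARS`, `scrubbed_env`, and `run_isolated` are hypothetical, and this does not replace the VM or separate-device isolation described above, since the child still runs as the same user with access to that user's files.

```python
import os
import subprocess

# Hypothetical allowlist: only these variable names are passed to the child.
# On Windows, additional variables (for example SystemRoot) may be required.
SAFE_VARS = ("PATH", "LANG", "TERM")


def scrubbed_env(env, allowed=SAFE_VARS):
    """Return a copy of env that contains only allowlisted variable names."""
    return {name: value for name, value in env.items() if name in allowed}


def run_isolated(command):
    """Run command with a minimal environment instead of the parent's full one.

    Tokens and keys stored in environment variables are not inherited.
    Files, keychains, and network access are NOT restricted by this.
    """
    return subprocess.run(
        command,
        env=scrubbed_env(os.environ),
        capture_output=True,
        text=True,
        check=False,
    )
```

An allowlist is used rather than a blocklist of secret-looking names because unknown variables are then dropped by default.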

Anthropic bans accounts and adds Claude misuse probes after investigation

After investigating the alleged abuse campaign, Anthropic said it banned the involved accounts and added real-time misuse probes to Claude Opus 4.6. OpenAI separately said ChatGPT rejected policy-violating prompts when the attacker later switched tools.

Feb 25, 2026 (4mo ago)

Perplexity announces Computer multiagent system with sandboxing

Perplexity announced Computer, a multiagent orchestration product positioned as a safer alternative to always-on autonomous agents. The company said it runs in a secure development sandbox, is available first to Max users, and will expand to Enterprise and Pro users in the following weeks.

Feb 24, 2026 (4mo ago)

Research finds LLM-generated passwords are predictably weak

Research by Irregular found that passwords generated by systems such as ChatGPT and Gemini often contain repeated patterns and duplicates, making them far more predictable than truly random passwords. The study estimated only about 20–27 bits of entropy for AI-generated passwords versus roughly 98–120 bits for cryptographically random ones.
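
To put those entropy figures in context, the following minimal sketch generates a password with Python's `secrets` module and computes the entropy of a uniformly random password. The 20–27 and 98–120 bit figures come from the study as reported above; the 94-character alphabet, the 16-character length, and the function names are assumptions made for illustration. The formula applies only when every character is chosen uniformly and independently, which is the property the research says LLM output lacks.

```python
import math
import secrets
import string

# Assumed alphabet: ASCII letters, digits, and punctuation (94 symbols).
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length=16):
    """Build a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def uniform_entropy_bits(length, alphabet_size=len(ALPHABET)):
    """Entropy in bits when each character is uniform and independent."""
    return length * math.log2(alphabet_size)


def expected_guesses(bits):
    """Average number of guesses needed to find a secret of the given entropy."""
    return 2 ** (bits - 1)


if __name__ == "__main__":
    print(random_password())
    print(round(uniform_entropy_bits(16), 1))  # about 104.9 bits
    # Work-factor gap between the study's upper LLM estimate (27 bits)
    # and its lower random estimate (98 bits): a factor of 2**71.
    print(expected_guesses(98) / expected_guesses(27))
```

Under these assumptions a 16-character random password falls inside the 98–120 bit range quoted for cryptographically random passwords, while a 27-bit password needs on average about 67 million guesses.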

Dec 1, 2025 (7mo ago)

AI-assisted campaign allegedly targets Mexican government agencies

Beginning in December 2025, an unidentified attacker allegedly used Anthropic's Claude to identify vulnerabilities, generate exploit code, and support intrusions against Mexican government agencies. Gambit Security said the activity continued into early January 2026, exploited at least 20 vulnerabilities, and allegedly led to theft of about 150GB of data.

Oct 9, 2025 (9mo ago)

Anthropic blocks browser-use extension from banking and finance sites

At Zenity's AI Agent Security Summit, speakers cited Anthropic's decision to prevent its browser-use extension from accessing banking and financial websites as a mitigation against agent abuse. The move reflected growing concern that AI agents with broad tool access can be misused for high-risk actions.

Zenity AI Agent Security Summit focuses on risk mitigation • The Register
LINKED ENTITIES

Related entities

Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.

49 LINKED
Threat actors
2 linked
Malware
1 linked
Affected products
13 linked
Openclaw, Sharepoint, Chatgpt, Chatgpt, Telegram, Windows, Whatsapp, Claude Code, Ipad, Visual Studio Code, Macos, Mac, Macos
Organizations
33 linked
Google, Microsoft Corporation, Palantir Technologies, Nvidia, CrowdStrike, Servicenow, Anthropic, Openai, xAI, Perplexity, Cisco Systems, The Register, Electronic Arts, Ziff Davis, Amazon Web Services, DTEX, International Business Machines, Zenity, Picus Security, Meta Platforms, Gartner, Ponemon Institute, Apple, GitHub, Bloomberg, TechRadar, Help Net Security, Oura Health, Gambit Security, Cloudsec.ai, Trustmind, Corridor, Slalom