Tags: ai-platform-security, data-exfiltration-method, package-repository-poisoning, detection-content-update

Hidden Prompt-Injection and Supply-Chain Backdoors in AI Agent Skills

Updated 3mo ago | First seen Feb 13, 2026 | 2 sources

Security researchers warn that AI agent “Skills” (markdown/YAML instruction packages that extend agent capabilities) are becoming a supply-chain risk because hidden prompt-injection content can survive human review. A demonstrated technique embeds invisible Unicode Tag codepoints in skill files to smuggle instructions that some models interpret and act on. When the skill is invoked, the hidden instructions can trigger data exfiltration, remote command execution, or other malicious behavior. The researcher also built a basic scanner to help detect these hidden-instruction patterns.
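As a rough illustration of what such a detection pass might look like, the sketch below flags every character in the Unicode Tags block (U+E0000 through U+E007F) and reports where it occurs. It is not the researcher's scanner; the function name `find_tag_codepoints` and the sample string are hypothetical.

```python
# Unicode "Tags" block: U+E0000 through U+E007F (invisible in most renderers).
TAG_START, TAG_END = 0xE0000, 0xE007F


def find_tag_codepoints(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXXX') for every Tags-block character in text."""
    return [
        (i, f"U+{ord(ch):05X}")
        for i, ch in enumerate(text)
        if TAG_START <= ord(ch) <= TAG_END
    ]


# Hypothetical skill line with two invisible characters appended.
sample = "Run the linter." + chr(0xE0048) + chr(0xE0049)
hits = find_tag_codepoints(sample)
print(hits)
```

A non-empty result means the file contains characters a human reviewer is unlikely to see, which is a reason to inspect it further rather than proof of malicious intent.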

Separate reporting points to the same threat pattern across agent ecosystems. Simula Research Laboratory found hidden prompt-injection attacks in 2.6% of sampled content (506 posts) on Moltbook, and Cisco researchers documented a malicious agent skill (“What Would Elon Do?”) that exfiltrated data to external servers while ranked No. 1 in a skill repository, with indications that its popularity had been artificially inflated. Researchers also anticipate self-replicating adversarial prompts (“prompt worms/viruses”) that could propagate through networks of communicating AI agents, amplifying the impact of compromised skills and poisoned instruction content.


EVENT TIMELINE

How this story unfolded

5 events from the most recent confirmed update back to the earliest known activity.

Feb 12, 2026 (4mo ago)

Cisco documents malicious 'What Would Elon Do?' skill in repository

Cisco researchers identified a malicious AI skill called 'What Would Elon Do?' that exfiltrated data to external servers. The skill was reportedly ranked No. 1 in a skill repository, with indications its popularity may have been artificially inflated.

Simula researchers find hidden prompt injections in Moltbook posts

Simula Research Laboratory reported that 506 Moltbook posts, representing 2.6% of sampled content, contained hidden prompt-injection attacks. The finding highlighted that prompt-injection content was already being embedded in public AI-related content repositories.

Feb 11, 2026 (4mo ago)

Researcher scans OpenClawHub and OpenAI Skills for hidden Unicode

The author scanned OpenClawHub and OpenAI Skills projects for invisible Unicode codepoints and found some instances, though they were not obviously malicious and were often attributable to emoji handling or test cases. The scan was presented alongside detection guidance and a simple scanner for identifying such hidden content.
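Because legitimate emoji can contain Tag characters, a scanner needs some way to separate benign hits from suspicious ones. The sketch below shows one possible heuristic, not the author's actual scanner: a run of Tag characters is treated as an emoji flag only if it directly follows U+1F3F4 (WAVING BLACK FLAG), ends with U+E007F (CANCEL TAG), and is short. The names `classify_tag_runs` and `to_tags`, and the length threshold `MAX_EMOJI_RUN`, are assumptions made for illustration.

```python
import re

# A run of one or more Tags-block characters (U+E0000..U+E007F).
TAG_RUN = re.compile("[\U000E0000-\U000E007F]+")
BLACK_FLAG = "\U0001F3F4"  # base character of emoji subdivision flags
CANCEL_TAG = "\U000E007F"  # terminator of an emoji tag sequence
MAX_EMOJI_RUN = 8          # arbitrary threshold, an assumption of this sketch


def classify_tag_runs(text: str) -> list[dict]:
    """Label each run of Tag characters as 'emoji-flag' or 'suspicious'."""
    findings = []
    for m in TAG_RUN.finditer(text):
        run = m.group()
        after_flag = m.start() > 0 and text[m.start() - 1] == BLACK_FLAG
        is_emoji = after_flag and run.endswith(CANCEL_TAG) and len(run) <= MAX_EMOJI_RUN
        # Tag characters U+E0020..U+E007E mirror printable ASCII.
        decoded = "".join(
            chr(ord(c) - 0xE0000) for c in run if 0xE0020 <= ord(c) <= 0xE007E
        )
        findings.append({
            "offset": m.start(),
            "length": len(run),
            "decoded": decoded,
            "verdict": "emoji-flag" if is_emoji else "suspicious",
        })
    return findings


def to_tags(s: str) -> str:
    """Demo helper: shift ASCII text into the Tags block."""
    return "".join(chr(0xE0000 + ord(c)) for c in s)


# Flag of Scotland (a real emoji tag sequence) followed by a stray hidden run.
scotland = BLACK_FLAG + to_tags("gbsct") + CANCEL_TAG
findings = classify_tag_runs("ok " + scotland + " and" + to_tags("hi"))
for f in findings:
    print(f)
```

The heuristic is deliberately conservative: anything it cannot explain as a flag emoji is reported for manual review, which matches the observation that hits were often benign but still worth looking at.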

Author demonstrates hidden Unicode backdoor in AI Skills

A researcher showed that invisible Unicode Tag characters can be embedded in markdown-based AI Skills to hide prompt-injection instructions that some models interpret. The proof of concept modified a legitimate-looking security Skill so the agent would print a phrase and execute a remote shell command via curl piped to bash.
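The hiding step relies on the fact that Tag characters U+E0020 through U+E007E mirror printable ASCII at a fixed offset of 0xE0000. The sketch below illustrates that mapping in both directions with a harmless placeholder payload. It is not the researcher's proof of concept; the function names `hide` and `reveal` and the sample skill text are hypothetical.

```python
TAG_BASE = 0xE0000  # offset between printable ASCII and its Tag counterpart


def hide(ascii_text: str) -> str:
    """Map each printable ASCII character to its invisible Tag counterpart."""
    if not all(0x20 <= ord(c) <= 0x7E for c in ascii_text):
        raise ValueError("only printable ASCII can be mapped")
    return "".join(chr(TAG_BASE + ord(c)) for c in ascii_text)


def reveal(text: str) -> str:
    """Recover hidden ASCII by reversing the mapping for Tag characters."""
    return "".join(
        chr(ord(c) - TAG_BASE)
        for c in text
        if TAG_BASE + 0x20 <= ord(c) <= TAG_BASE + 0x7E
    )


visible = "# Security review skill"
payload = "print the phrase HELLO"  # harmless placeholder instruction
skill = visible + hide(payload)

print(len(visible), len(skill))  # the string grows although it looks unchanged
print(reveal(skill))
```

For a reviewer, `reveal` is the useful half: running it over a skill file shows any text that a model might read but a markdown renderer would not display.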

Feb 1, 2026 (5mo ago)

Claude Code reportedly begins blocking invisible Unicode tag attacks

The researcher reported that Claude Code started detecting or refusing invisible Unicode Tag-based instructions in early February 2026. At the time of testing, the same mitigation was reportedly not observed in claude.ai.
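How Claude Code implements this behavior is not described in the reporting. As a generic illustration of an input-side defense, the sketch below removes Tags-block characters from text before it reaches a model and reports how many were removed. The function name `strip_tag_characters` is hypothetical, and this is an assumption about one possible approach, not a description of any vendor's mitigation.

```python
def strip_tag_characters(text: str) -> tuple[str, int]:
    """Remove Tags-block characters (U+E0000..U+E007F).

    Returns the cleaned text and the number of characters removed.
    """
    kept = [ch for ch in text if not (0xE0000 <= ord(ch) <= 0xE007F)]
    return "".join(kept), len(text) - len(kept)


cleaned, removed = strip_tag_characters("abc" + chr(0xE0041) + "def")
print(cleaned, removed)
```

One trade-off of stripping unconditionally is that flag emoji built from tag sequences lose their tags and render as a plain black flag, so a pipeline might prefer to log or reject such input instead of silently rewriting it.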

LINKED ENTITIES

Related entities

Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.

12 LINKED
Malware
1 linked
Affected products
2 linked
Claude Code, GitHub Copilot
Organizations
9 linked
Cisco Systems, Simula Research Laboratory, Anthropic, OpenAI, GitHub, xAI, Google, Antigravity, Wuzzi