Indirect Prompt Injection and Prompt Manipulation Risks in AI Agents

EVENT TIMELINE

How this story unfolded

5 events from the most recent confirmed update back to the earliest known activity.

Mar 6, 2026

Researchers identify first known real-world IDPI abuse of AI ad review

Unit 42 cited the first known real-world case of indirect prompt injection being used to bypass an AI-based advertisement review system. The broader research also linked the technique to SEO poisoning, attempted unauthorized financial actions, sensitive data exposure, and destructive server-side commands.

Unit 42 documents indirect prompt injection used in the wild at scale

Palo Alto Networks' Unit 42 reported that indirect prompt injection attacks were being observed in real-world environments at scale. The researchers documented 22 payload-construction techniques and described attacker methods for hiding malicious instructions in ordinary-looking web content processed by AI tools.
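As a rough illustration of the hiding methods described above, the following Python sketch flags text that a web page marks as invisible to human readers before that page is handed to an AI tool. It is a minimal sketch, not Unit 42's detection logic: the names `HiddenTextFinder` and `find_hidden_text` and the list of hiding styles are assumptions made for this example, and the check covers only inline attributes, not external stylesheets or script-driven hiding.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns that commonly hide text from human readers.
# Illustrative and not exhaustive (assumption for this sketch).
HIDING_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden"
    r"|font-size\s*:\s*0(?![.\d])|opacity\s*:\s*0(?![.\d])",
    re.IGNORECASE,
)
# Void elements never get a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collects text nodes that sit inside an element hidden via inline attributes."""

    def __init__(self):
        super().__init__()
        self._stack = []        # one bool per open element: hidden (itself or an ancestor)?
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr = dict(attrs)
        hidden = (
            "hidden" in attr
            or attr.get("aria-hidden") == "true"
            or bool(HIDING_STYLE.search(attr.get("style") or ""))
        )
        parent_hidden = self._stack[-1] if self._stack else False
        self._stack.append(hidden or parent_hidden)

    def handle_endtag(self, tag):
        # Simplification: assumes reasonably well-nested markup.
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> list[str]:
    """Return text fragments that the page hides from human readers."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

A pipeline could log or quarantine pages for which `find_hidden_text` returns anything, on the reasoning that instructions a human cannot see have little legitimate reason to reach the model.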

Mar 5, 2026

Doctronic and Utah pilot program respond with safeguard claims

Doctronic and the Utah pilot program said controlled substances cannot be refilled in the current trial and stated that additional safeguards are in place. Their response addressed the reported prompt-injection findings affecting the healthcare AI system.

Researchers show Doctronic AI can be induced to alter prescription output

In a safety-impacting demonstration, researchers showed the AI could be tricked into changing a prescription recommendation, including tripling an OxyContin dosage in output intended for later human review. The finding highlighted the potential for prompt injection to influence clinical decision support workflows.

Mindgard demonstrates prompt injection risks in Doctronic healthcare AI

Security researchers reported that Doctronic's prescription-management healthcare AI could be manipulated to reveal system prompts and accept unauthorized instruction changes. They showed both session-limited prompt injection and a persistence method using SOAP notes in clinical records.

LINKED ENTITIES

Related entities

Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.

Organizations

4 linked

Palo Alto Networks, The Register, Mindgard, Doctronic

SOURCE COVERAGE

Sources

2 references tracked.

Cyber Security News (News)

Mar 6, 2026

Hackers Can Use Indirect Prompt Injection Allows Adversaries to Manipulate AI Agents with Content - Cyber Security News

cybersecuritynews.com

SC World (News)

Mar 5, 2026

Healthcare AI vulnerable to prompt injection, security experts warn | brief | SC Media

scworld.com

ON THE SAME THREAD

Security researchers and industry reporting describe **prompt injection, especially web-based indirect prompt injection (IDPI)**, as an increasingly practical technique for compromising or manipulating **LLM-powered agents** embedded in browsers and automated content pipelines. Palo Alto Networks Unit 42 reported in-the-wild IDPI activity in which malicious instructions are hidden in web content that an agent later ingests, with observed objectives including **AI-based ad review evasion** and **SEO manipulation** that promotes phishing infrastructure. Separately, Zenity Labs detailed a now-patched issue in Perplexity’s *Comet* AI browser in which attackers could embed instructions in a **calendar invite** to coerce the agent into accessing `file://` resources and potentially pivoting into sensitive data such as an unlocked **1Password** extension vault, illustrating how agentic tooling can bypass traditional browser-origin assumptions.

Threat reporting also shows adversaries operationalizing AI to scale exploitation. Team Cymru linked an AI-assisted Fortinet FortiGate targeting campaign (previously reported by Amazon Threat Intelligence as compromising **600+ devices across 55 countries** using services like **Claude** and **DeepSeek**) to the use of **CyberStrikeAI**, an open-source Go-based platform that integrates 100+ security tools and was observed from multiple IPs (primarily hosted in China/Singapore/Hong Kong, with additional infrastructure elsewhere).

Multiple commentaries and briefings emphasize that conventional “filter the prompt” defenses are insufficient because LLMs lack a native separation between instructions and data; they call for **defense-in-depth** around AI pipelines, including least-privilege agent permissions, auditable tool use, and stronger identity/workload controls as agent deployments multiply.

Several items in the set are unrelated (geopolitical cyber activity, workforce/culture pieces, jobs, and product/market commentary) and do not materially inform the prompt-injection/agent-abuse story.
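The least-privilege and auditable-tool-use recommendations above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the fix Perplexity shipped or a design from any of the cited reports: the `ToolGate` class, the `fetch_page` and `run_shell` tool names, and the HTTPS-only allowlist are all hypothetical. The idea is that every tool call an agent proposes passes through a gate that rejects non-allowlisted tools and URL schemes such as `file://`, and records each decision.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Assumption for this sketch: the agent only ever needs public HTTPS pages.
ALLOWED_SCHEMES = {"https"}


@dataclass
class ToolGate:
    """Authorizes agent tool calls and keeps an audit trail of every decision."""

    allowed_tools: set[str]
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, tool: str, url: str) -> bool:
        scheme = urlsplit(url).scheme.lower()
        allowed = tool in self.allowed_tools and scheme in ALLOWED_SCHEMES
        # Record denied calls too: they are often the first sign of an injection attempt.
        self.audit_log.append({"tool": tool, "url": url, "allowed": allowed})
        return allowed


gate = ToolGate(allowed_tools={"fetch_page"})        # hypothetical tool name
gate.authorize("fetch_page", "https://example.com/article")      # permitted
gate.authorize("fetch_page", "file:///Users/me/secrets.txt")     # denied: scheme
gate.authorize("run_shell", "https://example.com")               # denied: tool
```

Because the gate sits outside the model, its decision does not depend on whether the model recognized the injected text as an instruction, which is the property the "filter the prompt" approach lacks.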

Mar 21, 2026

Prompt Injection Attacks and Security Challenges in AI Systems

Prompt injection has emerged as a critical security concern in the deployment of large language models (LLMs) and AI agents, with attackers exploiting the way these systems interpret and execute instructions. Security researchers have drawn parallels between prompt injection and earlier vulnerabilities like SQL injection, highlighting its potential to undermine the intended behavior of AI models. Prompt injection involves manipulating input prompts to override or bypass the system-level instructions set by developers, leading to unauthorized actions or data leakage. The attack surface is broad, as LLMs are increasingly integrated into applications and workflows, making them attractive targets for adversaries.

Multiple organizations, including OpenAI, Microsoft, and Anthropic, have initiated efforts to address prompt injection, but the problem remains unsolved due to the complexity and adaptability of AI models. Real-world demonstrations have shown that prompt injection can be used to break out of agentic applications, bypass browser security rules, and even persistently compromise AI systems through mechanisms like memory manipulation. Security conferences such as Black Hat USA 2024 have featured research on exploiting AI-powered tools like Microsoft 365 Copilot, where attackers can escalate privileges or exfiltrate data by crafting malicious prompts or leveraging markdown image vectors. Researchers have also identified that AI agents can be tricked into ignoring browser security policies, such as CORS, leading to potential cross-origin data leaks.

Defensive measures, such as intentionally limiting AI capabilities or implementing stricter input filtering, have been adopted by some vendors, but these often come at the cost of reduced functionality. The security community is actively developing standards, such as the OWASP Agent Observability Standard, to improve monitoring and detection of prompt injection attempts. Despite these efforts, adversaries continue to find novel ways to exploit prompt injection, including dynamic manipulation of tool descriptions and bypassing image filtering mechanisms.

The rapid evolution of AI technologies and the proliferation of agentic applications have made it challenging to keep pace with emerging threats. Security researchers emphasize the need for ongoing vigilance, robust testing, and collaboration across the industry to mitigate the risks associated with prompt injection. The use of AI in sensitive environments, such as enterprise productivity suites and web browsers, amplifies the potential impact of successful attacks. As AI adoption accelerates, organizations must prioritize understanding and defending against prompt injection to safeguard their systems and data. The ongoing research and public disclosures serve as a call to action for both developers and defenders to address this evolving threat landscape.
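The markdown image vector mentioned above works because a chat interface that renders model output will fetch any image URL it contains, so data placed in that URL reaches whoever hosts it. One possible output-side mitigation, sketched here in Python under stated assumptions, is to drop images whose host is not on an allowlist before rendering. The function name `strip_untrusted_images` and the host `static.example.com` are hypothetical, and the pattern handles only inline `![alt](url)` images, not reference-style images or raw HTML.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the application itself serves images from.
TRUSTED_IMAGE_HOSTS = {"static.example.com"}

# Inline markdown image: ![alt](url) or ![alt](url "title").
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def strip_untrusted_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with a visible placeholder."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlsplit(url).hostname or ""
        if host in TRUSTED_IMAGE_HOSTS:
            return match.group(0)          # keep trusted images unchanged
        return f"[image removed: {alt}]"   # never let the renderer fetch the URL

    return MD_IMAGE.sub(replace, markdown)
```

Replacing the image with a placeholder rather than silently deleting it keeps the removal visible to the user, which may help with later investigation.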

Mar 21, 2026

Indirect Prompt Injection and Data Exfiltration Risks in Enterprise AI Agents

Security researchers warned that **AI agents and retrieval-augmented generation (RAG) systems** can be turned into data-exfiltration channels when attackers poison inputs or embed malicious instructions in content the model is expected to process. One report described a **0-click indirect prompt injection** against *OpenClaw* agents in which hidden instructions cause the agent to generate an attacker-controlled URL containing sensitive data such as API keys or private conversations in query parameters; messaging platforms like *Telegram* or *Discord* can then automatically request that URL for link previews, silently delivering the data to the attacker. The same reporting noted concerns about insecure defaults that allow agents to browse, execute tasks, and access local files, expanding the blast radius of prompt-injection abuse.

Related analysis highlighted that the same core weakness extends beyond standalone agents to **enterprise RAG deployments**, where the integrity of the knowledge base becomes part of the security boundary. If attackers can poison indexed documents in systems such as SharePoint or Confluence, they can manipulate retrieval results and influence model outputs, including security workflows and analyst guidance. Broader commentary on **agentic AI threat convergence** reinforced that prompt engineering is no longer just a productivity technique but an emerging exploit class, with adversaries using prompt injection and context manipulation against AI-enabled security operations. Together, the reporting shows that enterprise AI risk increasingly depends on controlling untrusted content, hardening agent permissions, and treating prompts, retrieved documents, and downstream integrations as attack surfaces.
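The link-preview exfiltration path described above depends on the agent emitting a URL whose query parameters carry sensitive data. A minimal Python sketch of an outbound check follows. It is an illustration under stated assumptions, not a control described in the reporting: the function name `suspicious_urls`, the two secret patterns, and the length threshold are hypothetical choices that a real deployment would replace with its own detectors.

```python
import re
from urllib.parse import parse_qsl, urlsplit

URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")

# Illustrative secret shapes only (assumptions for this sketch).
SECRET_PATTERNS = [
    re.compile(r"^sk-[A-Za-z0-9_-]{16,}$"),   # API-key-like token with an "sk-" prefix
    re.compile(r"^[A-Fa-f0-9]{32,}$"),        # long hexadecimal string
]
# Assumption: unusually long values may be carrying conversation text.
MAX_PARAM_LENGTH = 200


def suspicious_urls(message: str) -> list[str]:
    """Return URLs in an outgoing message whose query values look like smuggled data."""
    flagged = []
    for url in URL_PATTERN.findall(message):
        for _, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            too_long = len(value) > MAX_PARAM_LENGTH
            looks_secret = any(p.match(value) for p in SECRET_PATTERNS)
            if too_long or looks_secret:
                flagged.append(url)
                break   # one hit is enough to flag this URL
    return flagged
```

An agent runtime could hold back any message for which `suspicious_urls` returns a non-empty list, or send it with link previews disabled, so that the messaging platform never requests the URL automatically.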

May 7, 2026

Related stories

Indirect Prompt Injection and AI Agent Abuse Expands Real-World Attack Surface

Prompt Injection Attacks and Security Challenges in AI Systems

Indirect Prompt Injection and Data Exfiltration Risks in Enterprise AI Agents