AI Agent Security Risks Around MCP and Over-Privileged Tool Access
Security commentary from SC Media warned that the Model Context Protocol (MCP) has introduced a major context-layer attack surface by letting AI agents trust external tools and content without adequate authorization, validation, or isolation. The piece argues that organizations built zero-trust controls for users and devices, then undermined them by granting AI agents broad implicit trust. It cites rapid MCP adoption, thousands of internet-exposed MCP servers, and prior demonstrations in which malicious MCP infrastructure or poisoned content drove agents to exfiltrate sensitive data and perform unauthorized actions.
A related PyPI package, ciaf-agents, addresses the same problem space by proposing zero-trust execution boundaries for AI agents, including identity, authorization, mediation, elevation control, and auditability, to prevent unauthorized data access and destructive operations. A separate post on building an agentic malware-analysis pipeline is not about MCP exposure or agent trust-boundary failures; it covers using LLM agents to improve reverse-engineering and malware-analysis workflows and does not document the same security issue or incident.
How this story unfolded
26 events from the most recent confirmed update back to the earliest known activity.
Zenity discloses AgentFlayer attack using Jira tickets to steal secrets
Zenity Labs published research titled 'AgentFlayer: When a Jira Ticket Can Steal Your Secrets,' describing an attack in which a malicious Jira ticket can act as untrusted context to induce an AI agent or MCP-connected workflow to expose secrets. The disclosure adds a new real-world prompt-injection and data-exfiltration scenario centered on enterprise ticketing systems.
GitHub adds dependency and secret scanning to its MCP Server
GitHub announced new security features for its GitHub MCP Server, adding dependency scanning in public preview and making secret scanning generally available. The capabilities are designed to surface vulnerable dependencies and exposed credentials directly inside MCP-connected AI coding workflows before code is committed.
Censys finds 12,520 Internet-exposed MCP services worldwide
Censys reported that as of 2026-04-28 it had identified 12,520 accessible MCP services across 8,758 unique IP addresses in 56 countries. The study warned that many publicly reachable servers exposed sensitive capabilities including data access, system control, communications, payment processing, and some command-execution-style tools, highlighting broad Internet exposure of MCP deployments.
Microsoft discloses Copilot Studio prompt-injection flaw CVE-2026-21520
Microsoft assigned CVE-2026-21520 to an indirect prompt-injection vulnerability in Copilot Studio, disclosed in mid-April 2026. The issue was described as enabling a data-exfiltration path, and Microsoft patched the specific exploit path while broader confused-deputy risks remained.
AWS publishes MCP security guidance for AI agent access to AWS resources
AWS published official guidance on securing AI agents and coding assistants that access AWS resources through MCP. The blog recommended scoped temporary credentials, governance over role usage, and mechanisms to distinguish AI-driven actions from human activity, including AWS-specific IAM context keys for AWS-managed MCP servers.
UK AI Safety Institute study maps rapid rise of action-capable MCP tools
A UK AI Safety Institute study of 177,436 MCP tools created between November 2024 and February 2026 found action-tool usage grew from 27% to 65%, with strong growth in computer-use, browser automation, and financial tooling such as payment-execution servers. The study also found increasing AI-assisted development of MCP servers and warned that the shift toward agents that can execute code, modify files, send emails, and interact with financial systems raises major enterprise security and governance risks.
ArXiv paper presents stealthy MCP injection payload generation method
An arXiv paper titled "Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search" was published, describing a method for crafting stealthy prompt-injection payloads targeting MCP-based systems. The work adds a new technical research development focused on optimizing evasive MCP injection attacks rather than documenting a specific victim incident.
Researcher shows poisoned Context Hub docs can steer agents to malicious packages
Researcher Mickey Shmueli published a proof-of-concept showing that if malicious documentation were merged into Andrew Ng’s Context Hub repository, AI coding agents consuming it through MCP could be induced to add fake or malicious dependencies to generated projects. The report highlighted weak documentation sanitization in the pipeline and framed the issue as an indirect prompt-injection supply chain risk for community-authored agent documentation.
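One way to blunt this kind of dependency injection is to check agent-generated dependency lists against a reviewed allowlist before anything is installed. The sketch below is illustrative only: it assumes Python-style requirement lines, and the function names and the example package names are hypothetical, not taken from the proof-of-concept.

```python
import re


def normalize(name: str) -> str:
    """Normalize a package name (lowercase, runs of -, _, . collapsed to -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(lines):
    """Extract package names from requirement-style lines, skipping comments."""
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def unapproved(lines, approved):
    """Return dependencies that are not on the reviewed allowlist."""
    allowed = {normalize(name) for name in approved}
    return [name for name in requirement_names(lines) if name not in allowed]


if __name__ == "__main__":
    generated = ["requests>=2.31", "reqeusts-toolbelt-pro==0.1", "NumPy"]
    print(unapproved(generated, {"requests", "numpy"}))
```

A check like this does not detect a malicious package that was already approved; it only stops an agent from silently introducing names nobody reviewed.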
ArXiv paper models MCP threats and analyzes prompt injection via tool poisoning
An arXiv paper titled "Model Context Protocol Threat Modeling and Analyzing Vulnerabilities to Prompt Injection with Tool Poisoning" was published, presenting a threat-modeling treatment of MCP security and examining how prompt injection can be combined with poisoned tools. The work added formal analysis of MCP-specific attack paths beyond the previously documented incident examples and vendor disclosures.
SC Media warns MCP creates a new zero-trust blind spot
SC Media published a perspective arguing that MCP has opened a new context-layer attack surface that traditional zero-trust architectures do not adequately address. The article warned that a major enterprise breach mediated through MCP is likely unless organizations validate context inputs and treat MCP connectivity as privileged access.
CIAF-Agents package published with zero-trust controls for AI agents
The ciaf-agents package was published on PyPI, presenting a zero-trust framework for agent execution boundaries with IAM, PAM, mediated execution, and cryptographic audit receipts. The release framed excessive agent privilege and weak verification as key security risks in autonomous AI systems.
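The idea of mediated execution with tamper-evident audit receipts can be sketched in a few lines. This is not the ciaf-agents API, which the source does not describe; the policy table, the key, and the Mediator class below are hypothetical, and an HMAC hash chain stands in for whatever receipt format the package actually uses.

```python
import hashlib
import hmac
import json

# Hypothetical policy: which agent identity may call which tool.
ALLOWED = {"report-agent": {"read_ticket"}}
AUDIT_KEY = b"demo-key-replace-in-practice"  # placeholder secret for the sketch


def sign_receipt(entry: dict, prev_sig: str) -> str:
    """HMAC over the entry plus the previous signature, forming a chain."""
    payload = json.dumps(entry, sort_keys=True).encode() + prev_sig.encode()
    return hmac.new(AUDIT_KEY, payload, hashlib.sha256).hexdigest()


class Mediator:
    """Every tool call passes through here: decide, then record a receipt."""

    def __init__(self):
        self.log = []  # list of (entry, signature) pairs

    def call(self, agent: str, tool: str, args: dict) -> bool:
        allowed = tool in ALLOWED.get(agent, set())
        entry = {"agent": agent, "tool": tool, "args": args, "allowed": allowed}
        prev_sig = self.log[-1][1] if self.log else ""
        self.log.append((entry, sign_receipt(entry, prev_sig)))
        return allowed

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_sig = ""
        for entry, sig in self.log:
            if not hmac.compare_digest(sig, sign_receipt(entry, prev_sig)):
                return False
            prev_sig = sig
        return True
```

Denied calls are logged as well as allowed ones, so the record shows what an agent attempted, not only what it was permitted to do.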
AgentSeal scan finds security issues on 66% of 1,808 MCP servers
AgentSeal published findings from a scan of 1,808 MCP servers, reporting that 66% had security findings. The research added ecosystem-wide empirical evidence that many exposed MCP deployments lacked adequate security controls.
Microsoft patches Azure MCP Server SSRF flaw CVE-2026-26118
Microsoft patched CVE-2026-26118 on 2026-03-10, fixing a server-side request forgery vulnerability in Azure MCP Server. The flaw could allow an authorized attacker to steal the server’s managed identity token and elevate network privileges.
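The source does not describe the exploit mechanics, so the following is a generic SSRF guard rather than a reconstruction of Microsoft's fix. It assumes a server that fetches caller-supplied URLs and refuses any destination that resolves to a loopback, private, or link-local address (the range where cloud metadata services such as 169.254.169.254 live). The function names are hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_safe_address(host_ip: str) -> bool:
    """Reject addresses that point back into internal or metadata ranges."""
    ip = ipaddress.ip_address(host_ip)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def check_outbound_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only https URLs whose every resolved address is public."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        return False
    try:
        infos = resolve(parsed.hostname, parsed.port or 443)
    except OSError:
        return False
    # Each getaddrinfo result is (family, type, proto, canonname, sockaddr).
    return bool(infos) and all(is_safe_address(info[4][0]) for info in infos)
```

A check of this kind is only effective if the later request connects to the address that was vetted; resolving the name a second time at fetch time leaves room for DNS rebinding.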
OWASP launches MCP Top 10 beta security project
OWASP published its MCP Top 10 project in beta, outlining ten major security risk categories for Model Context Protocol deployments such as token theft, excessive permissions, tool poisoning, command execution, and shadow MCP servers. The framework was presented as a living document for pilot testing and mitigation guidance around MCP-enabled AI systems.
ArXiv paper explores abusing MCP for LLM-powered agentic red teaming
An arXiv paper titled "Hiding in the AI Traffic: Abusing MCP for LLM-Powered Agentic Red Teaming" was published, examining how MCP can be abused to support LLM-driven red-team operations. The work added a new research angle on offensive use of MCP and how malicious activity could blend into normal AI-agent traffic.
ArXiv study finds ambiguous MCP tool identity enables wrong-provider execution
An arXiv paper comparing MCP, A2A, Agora, and ANP identified immature protocol-level security across emerging AI-agent protocols and highlighted MCP’s lack of mandatory cryptographic binding between tools and providers. In MCP v1.25.0 experiments, colliding tool names led to wrong-provider tool execution under realistic resolver policies, which the authors described as a systemic design issue and proposed mitigating with provider-bound identities validated by certificates or signatures.
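The mitigation the authors propose, binding tool identity to a provider, can be illustrated with a small registry. This is a simplified sketch under stated assumptions: the ToolRegistry class is hypothetical, and comparing a pinned key fingerprint stands in for real certificate or signature validation, which the sketch does not perform.

```python
import hashlib


class ToolResolutionError(Exception):
    """Raised when a tool name cannot be tied to exactly one provider."""


def fingerprint(public_key: bytes) -> str:
    return hashlib.sha256(public_key).hexdigest()


class ToolRegistry:
    def __init__(self):
        self._pinned = {}  # provider id -> pinned key fingerprint
        self._tools = {}   # (provider id, tool name) -> callable

    def pin_provider(self, provider: str, public_key: bytes) -> None:
        """Record, out of band, which key a provider is expected to present."""
        self._pinned[provider] = fingerprint(public_key)

    def register(self, provider: str, public_key: bytes, name: str, fn) -> None:
        if self._pinned.get(provider) != fingerprint(public_key):
            raise PermissionError(f"key mismatch for provider {provider!r}")
        self._tools[(provider, name)] = fn

    def resolve(self, name: str, provider=None):
        if provider is not None:
            key = (provider, name)
            matches = [self._tools[key]] if key in self._tools else []
        else:
            matches = [fn for (_, n), fn in self._tools.items() if n == name]
        if len(matches) != 1:
            # Refuse to guess: a bare name shared by two providers is an error.
            raise ToolResolutionError(f"{name!r} matched {len(matches)} tools")
        return matches[0]
```

The design choice is that a colliding bare name fails closed instead of falling back to "first registered" or "most recent" ordering, the kind of resolver policy under which the paper reports wrong-provider execution.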
Compromised postmark-mcp update adds hidden email exfiltration behavior
In September 2025, the postmark-mcp package was reportedly compromised in version 1.0.16 to silently add a BCC to attacker-controlled email addresses while preserving normal functionality. The incident illustrated how a malicious MCP server update could conceal supply-chain exfiltration under a developer’s existing user context.
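A hidden BCC is detectable if something outside the MCP server compares the recipients the agent asked for with the recipients on the message that is actually sent. The sketch below assumes an egress check that sees the outgoing message as a standard email object; the real package sent mail through an API, so this is an analogy, and the function names are hypothetical.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def recipients(msg: EmailMessage) -> set:
    """Collect every address in To, Cc, and Bcc, lowercased."""
    fields = []
    for header in ("To", "Cc", "Bcc"):
        fields.extend(str(value) for value in msg.get_all(header, []))
    return {addr.lower() for _, addr in getaddresses(fields) if addr}


def unexpected_recipients(msg: EmailMessage, requested) -> set:
    """Return addresses on the message that the caller never asked for."""
    return recipients(msg) - {addr.lower() for addr in requested}


if __name__ == "__main__":
    msg = EmailMessage()
    msg["To"] = "alice@example.com"
    msg["Bcc"] = "drop@attacker.example"  # what a tampered server might add
    print(unexpected_recipients(msg, ["alice@example.com"]))
```

Because the compromised update preserved normal functionality, a check on observable behavior like this is more likely to catch it than functional testing of the tool.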
Lasso publishes IdentityMesh research on lateral movement in agentic AI
Lasso Security published research titled 'IdentityMesh: Exploiting Lateral Movement in Agentic Systems,' describing how agentic AI environments can enable lateral movement through interconnected identities and tool access. The work added a new attack-path framing beyond prompt injection and exfiltration by focusing on identity and privilege propagation across agent workflows.
Invariant Labs publishes Toxic Flows research on MCP prompt-injection risks
Invariant Labs published research describing 'Toxic Flows,' a novel prompt-injection attack framework affecting agentic systems and MCP servers. The work highlighted new ways untrusted context can propagate through agent workflows to trigger unsafe actions or data exposure.
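The general pattern, untrusted context flowing into a session and later reaching a sensitive tool, can be modeled with simple taint tracking. This is an illustration of the concept and not Invariant Labs' method; the source categories, tool names, and Session class are all hypothetical.

```python
# Hypothetical labels for where context came from and which tools are risky.
UNTRUSTED_SOURCES = {"jira_ticket", "github_issue", "support_ticket", "web_page"}
SENSITIVE_TOOLS = {"send_email", "http_post", "read_secret"}


class Session:
    """Tracks whether untrusted content has entered the agent's context."""

    def __init__(self):
        self.context = []
        self.tainted_by = set()

    def add_context(self, source: str, text: str) -> None:
        self.context.append((source, text))
        if source in UNTRUSTED_SOURCES:
            self.tainted_by.add(source)

    def authorize(self, tool: str, human_approved: bool = False) -> bool:
        """Block sensitive tools once the session is tainted, unless approved."""
        if tool in SENSITIVE_TOOLS and self.tainted_by and not human_approved:
            return False
        return True
```

The check is deliberately coarse: it does not try to decide whether a ticket contains an injection, only whether attacker-controllable text and a dangerous capability coexist in one session.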
MCP adds OAuth support to reduce credential-sharing risks
In March 2025, the Model Context Protocol added OAuth support to reduce the need for locally stored API keys and other shared credentials in MCP deployments. The change was intended to mitigate credential exposure risks, although many implementations reportedly continued relying on local secrets.
JFrog discloses CVE-2025-6514 in MCP server implementations
In 2025, JFrog disclosed CVE-2025-6514, an OS command-injection vulnerability with a CVSS score of 9.6 affecting MCP server scenarios. The flaw could allow remote code execution when clients interacted with untrusted MCP servers.
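The source gives no exploit details, so the sketch below shows only the general defense against this bug class: text supplied by a remote server is passed to a helper program as a single argument and never interpolated into a shell command line. The helper here is a stand-in that echoes its argument, and the function name is hypothetical.

```python
import subprocess
import sys

# Stand-in for an external helper program; it writes back its first argument.
HELPER = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])"]


def run_helper(untrusted_value: str) -> str:
    """Pass server-supplied text as one argv element, never through a shell."""
    result = subprocess.run(
        HELPER + [untrusted_value],
        shell=False,          # no shell, so metacharacters are not interpreted
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    payload = "https://auth.example/?x=$(echo pwned); echo injected"
    print(run_helper(payload))
```

With an argument list and no shell, the payload reaches the helper as literal text. Validating the value as well, for example by accepting only http and https URLs, would add a second layer.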
Supabase Cursor agent abused through malicious support tickets
In 2025, attackers demonstrated that malicious support tickets could be used to abuse Supabase's Cursor agent. The example showed how hostile context entering an agent workflow could trigger unauthorized actions or data exposure.
Private repository data leakage shown through GitHub issues
A 2025 incident showed that GitHub issue content could be abused to cause leakage of private repository data through AI agent workflows. The case illustrated the risk of treating untrusted issue text as safe context for agent actions.
Invariant Labs demonstrates WhatsApp data exfiltration via MCP
Invariant Labs demonstrated an MCP-related attack in 2025 in which malicious context could drive an AI agent to exfiltrate WhatsApp data. The incident highlighted how untrusted inputs could manipulate agent behavior without compromising the underlying model.
Major platforms adopt MCP for agentic AI integrations
Platforms including Microsoft Copilot Studio and Azure AI Foundry adopted MCP as part of their agentic AI offerings, accelerating enterprise use of the protocol despite limited built-in security controls.
Anthropic introduces the Model Context Protocol
Anthropic introduced the Model Context Protocol (MCP) in late 2024, creating a standardized way for AI agents to connect to external tools and data sources. The protocol was then rapidly adopted across agentic AI ecosystems.
Sources
36 references tracked.
AI Security: Lateral Movement & MCP Protocol Risks #shorts | SecuritySenses
securitysenses.com
MCP Servers on the Internet - Censys
censys.com
AgentFlayer: When a Jira Ticket Can Steal Your Secrets
labs.zenity.io
How to Build Granular Policy Enforcement for Secure Model Context Protocol Deployments | Read the Gopher Security's Quantum Safety Blog
gopher.security
5 MCP security risks and mitigation strategies | TechTarget
techtarget.com
IdentityMesh: Exploiting Lateral Movement in Agentic Systems
lasso.security
Invariant Labs Exposes Novel Prompt Injection Attack Vulnerabilities, “Toxic Flows,” in Agentic Systems & MCP Servers
invariantlabs.ai
Poison everywhere: No output from your MCP server is safe
cyberark.com