ai-platform-securitydata-exfiltration-methodfinancial-sector-threat

Prompt Injection and Jailbreak Techniques Targeting LLM-Powered Applications

Updated 11d agoFirst seen Jan 26, 202660 sources

Security researchers and vendors are warning that prompt injection and jailbreak techniques remain a leading risk for enterprise deployments of large language models (LLMs), enabling attackers to override system instructions, bypass safety controls, and potentially drive data exposure outcomes. Resecurity reports assisting a Fortune 100 organization where AI-powered banking and HR applications were targeted with prompt-injection attempts, emphasizing that these attacks exploit model behavior rather than traditional software flaws and can be used in scenarios such as extracting sensitive configuration data (for example, attempts to elicit content resembling /etc/passwd). Resecurity also cites OWASP’s 2025 Top 10 for LLM Applications, where prompt injection is ranked as the top issue, and frames continuous security testing (e.g., VAPT) as a key control for enterprise AI systems.

Separate research highlighted by Kaspersky describes a “poetry” jailbreak technique in which prompts framed as rhyming verse increased the likelihood that chatbots would produce disallowed or unsafe responses; the study tested this approach across 25 models from multiple vendors (including Anthropic, OpenAI, Google, Meta, DeepSeek, and xAI). In contrast, OpenAI’s planned upgrade to ChatGPT Temporary Chat is primarily a product/privacy change—adding optional personalization while keeping temporary chats out of history and model training (with possible retention for up to 30 days)—and does not describe a specific security incident or vulnerability disclosure tied to prompt injection or jailbreak research.

Prompt Injection and Jailbreak Techniques Targeting LLM-Powered Applications

Stay ahead

Get ahead of threats like this

Mallory correlates global threat intelligence with your attack surface — know if you’re exposed before adversaries strike.

Start free trial

EVENT TIMELINE

How this story unfolded

48 events from the most recent confirmed update back to the earliest known activity.

48 EVENTS

May 25, 202628d ago

Zenity reports theft of Microsoft Copilot system prompt

Zenity Labs published research describing how Microsoft Copilot's hidden system prompt could be extracted, highlighting a concrete prompt-injection-style disclosure affecting a deployed AI assistant. The report added a specific public example of system-prompt exposure in a real-world Copilot product.

Stealing Copilot's System Prompt

May 23, 202630d ago

Meta proposes 'Agents Rule of Two' for AI agent security

Meta published guidance describing prompt injection as a fundamental unresolved weakness in LLM agents and introduced the 'Agents Rule of Two' security model. The framework says an agent session should satisfy no more than two of three properties—handling untrusted input, accessing sensitive data or systems, and taking external actions—and recommends human supervision when all three are needed.

Agents Rule of Two: A Practical Approach to AI Agent Security

May 21, 20261mo ago

A3S-Bench paper exposes evasion weaknesses in autonomous agents

A research paper introduced a three-part evasion framework—temporal, spatial, and semantic evasion—to test stateful LLM-based autonomous agents with deep system privileges. Using the new A3S-Bench dataset of 2,254 real-world agent execution trajectories across 20 threat scenarios and 10 LLM backbones, the study reported average risk trigger rates rising from 28.3% to 52.6%, highlighting systemic architectural weaknesses.

[2605.22321] Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

May 17, 20261mo ago

LinkedIn profile prompt injection manipulates recruiter AI outreach

A software developer using the name tmuxvim embedded prompt-injection text in a LinkedIn profile instructing any AI reader to address them as 'My Lord' and write in Old English. At least one recruiter message reportedly followed those instructions, demonstrating prompt-injection risk in AI-assisted recruiting workflows that ingest untrusted profile content.

LinkedIn recruitment spam becomes Olde English prose after user hides AI prompt injection in bio - bots also also manipulated to address user as ‘My Lord’ | Tom's Hardware

May 13, 20261mo ago

PortSwigger lab shows indirect prompt injection leading to stored XSS

A walkthrough of a PortSwigger Web Security Academy lab demonstrated that indirect prompt injection in an LLM-powered live chat application can cause the model to emit malicious content that is rendered unsafely, resulting in stored cross-site scripting. The attack chain was shown to bypass AI safety filters and could be used to automatically delete another user’s account, highlighting insecure handling of AI-generated output.

🚨 Exploiting Insecure Output Handling in LLMs via Indirect Prompt Injection (XSS) | by Mukilan Baskaran | May, 2026 | InfoSec Write-ups

May 2, 20262mo ago

OSINT Team blog tests five prompt-injection defenses and finds four fail

An OSINT Team blog post described testing five common prompt-injection defenses across enterprise-style LLM environments including document assistants, coding agents, browser agents, customer support systems, and workflow automations. The evaluation found stronger system prompts, keyword filtering, pattern matching, and classifier layers were unreliable, while context segmentation and permission boundaries with capability isolation materially improved resilience.

OWASP LLM01 in 2026: I Tested the Top 5 Defenses, 4 Failed | by Aeon Flex, Elriel Assoc. 2133 [NEON MAXIMA] | May, 2026 | OSINT Team

Apr 30, 20262mo ago

Capital One proposes adaptive automated LLM red-teaming framework

Researchers from Capital One’s AI Foundations group introduced Adaptive Instruction Composition, an automated jailbreak testing framework that uses a contextual bandit to learn effective query-and-tactic combinations instead of relying on random combinations. In simulations against Mistral-7B and Llama models, the method reportedly more than doubled WildTeaming’s attack success rate and showed cross-model transferability of learned jailbreak strategies.

Automated LLM red teaming gets a learning layer - Help Net Security

Apr 23, 20262mo ago

Google studies prompt injections on the public web using Common Crawl

Google Security disclosed a threat-intelligence study examining whether indirect prompt injection is being operationalized on the public web by scanning Common Crawl data for known patterns. The company said large-scale detection is difficult because many apparent prompt injections are false positives appearing in benign contexts such as research and educational content.

AI threats in the wild: The current state of prompt injections on the web

Apr 10, 20262mo ago

Trend Micro discloses 'sockpuppeting' jailbreak affecting 11 AI models

Trend Micro detailed a new black-box jailbreak technique called 'sockpuppeting' that abuses assistant-prefill support to inject a fake compliant response and bypass safety guardrails in 11 major LLMs. The researchers reported impacts including generation of malicious exploit code and disclosure of system prompts, and said API-level blocking of assistant prefills is the strongest defense.

Single Line of Code Can jailbreak 11 AI models Including ChatGPT, Claude, and Gemini

Apr 9, 20262mo ago

OWASP ASI01 guide ranks Agent Goal Hijack as top agentic AI risk

An Adversa technical guide for the 2026 OWASP Agentic Security Initiative described ASI01, Agent Goal Hijack, as the highest-ranked risk for agentic AI systems because attackers can redirect agent objectives through untrusted inputs, tool outputs, inter-agent messages, or configuration tampering. The guide outlined five trust-boundary attack vectors, cited examples including EchoLeak and a GitHub Copilot CVE, and recommended layered mitigations such as strict trust boundaries, least privilege, human approval for sensitive actions, and intent-preserving architectures.

OWASP ASI01: Agent Goal Hijack - Full technical guide | Adversa AI

Mar 30, 20263mo ago

NDSS paper proposes attention-based defense for indirect prompt injection

An NDSS Symposium paper titled 'Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs' was published, presenting a defense approach aimed specifically at mitigating indirect prompt injection attacks. Based on the title, the work focuses on using attention-related mechanisms as a technical mitigation distinct from previously tracked prompt-injection research.

Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs - NDSS Symposium

Mar 16, 20263mo ago

ArXiv paper analyzes AI-agent indirect prompt injection via public competition

An arXiv paper titled 'How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition' was published, presenting a distinct research study on AI agents' susceptibility to indirect prompt injection. Based on the title, the work derives findings from a large-scale public competition, making it separate from previously tracked defense papers and observational studies.

[2603.15714] How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Mar 11, 20263mo ago

ArXiv paper proposes causal-attribution defense for indirect prompt injection

An arXiv paper titled 'AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations' was published, presenting a defense approach for LLM agents against indirect prompt injection. Based on the title, the work focuses on using causal attribution of tool invocations to identify or block malicious influence, making it distinct from previously tracked parsing-based and attention-based defenses.

[2603.10749] AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

Mar 1, 20264mo ago

Cloudflare identifies indirect prompt code injection in malicious Workers

Cloudforce One reported that in March 2026 it found malicious or abusive Cloudflare Workers containing multilingual commented 'Notice to AI' lures designed to manipulate AI-based security auditing systems into classifying harmful code as benign. In a later study across 100 confirmed malicious Workers and seven LLMs, Cloudflare found evasion depended more on file structure, comment density, and size than on the deceptive wording alone, and recommended mitigations such as comment stripping and prioritizing functional code.

Prompt Injection Attacks on AI Security Auditors | Cloudflare | Cloudflare

Feb 16, 20264mo ago

GitHub report discloses Chain-of-Logic Injection jailbreak as CVE-2026-3098

A GitHub repository/report titled 'LLM Jailbreak via Chain-of-Logic Injection' was published and associated the technique with CVE-2026-3098. The reference indicates a distinct jailbreak disclosure centered on a named attack method and CVE tracking, separate from previously noted prompt-injection and jailbreak research.

GitHub - George0Papasotiriou/LLM-Jailbreak-via-Chain-of-Logic-Injection-CVE-2026-3098: Report · GitHub

Jan 26, 20265mo ago

OpenReview paper proposes nullspace steering for controlled model subversion

An OpenReview paper titled 'Jailbreaking the Matrix: Nullspace Steering for Controlled Model Subversion' was published, presenting a distinct jailbreak technique based on nullspace steering. Based on the title, the work appears to focus on controlled manipulation of model behavior as a new technical approach to model subversion.

Jailbreaking the Matrix: Nullspace Steering for Controlled Model Subversion | OpenReview

Jan 25, 20265mo ago

Resecurity details prompt-injection risks and simulated data disclosure

Resecurity published an analysis describing prompt injection as a leading security risk for enterprise AI applications, outlining direct and indirect injection techniques and a scenario in which an AI HR assistant is manipulated into disclosing a simulated /etc/passwd file. The article also recommended mitigations such as least-privilege tool access, input and output validation, segregation of untrusted content, and continuous adversarial testing.

Jan 24, 20265mo ago

ArXiv paper analyzes prompt injection in agentic coding assistants

An arXiv paper titled 'Prompt Injection Attacks on Agentic Coding Assistants: A Systematic Analysis of Vulnerabilities in Skills, Tools, and Protocol Ecosystems' was published, presenting a distinct research effort focused on how prompt injection affects agentic coding assistants. Based on the title, the work examines vulnerabilities spanning assistant skills, integrated tools, and protocol ecosystems rather than general-purpose prompt injection alone.

[2601.17548] Prompt Injection Attacks on Agentic Coding Assistants: A Systematic Analysis of Vulnerabilities in Skills, Tools, and Protocol Ecosystems

Jan 23, 20265mo ago

Study finds poetic prompts can jailbreak major LLMs

Researchers tested rhyming versions of malicious prompts from the MLCommons AILuminate Benchmark against 25 popular models and found that poetry significantly increased the likelihood of unsafe responses. Using a hand-picked set of 20 effective poetic prompts, they reported an average attack success rate of about 62%, with some models such as Gemini 1.5 Pro reportedly bypassed consistently under that metric.

Jan 8, 20266mo ago

ArXiv paper proposes tool-result parsing defense for indirect prompt injection

An arXiv paper titled 'Defense Against Indirect Prompt Injection via Tool Result Parsing' was published, presenting a mitigation approach focused on parsing tool outputs to reduce indirect prompt injection risk. Based on the title, the work targets attacks delivered through external tool results rather than broader prompt-level or attention-based defenses.

[2601.04795] Defense Against Indirect Prompt Injection via Tool Result Parsing

Paper proposes prompt-injection defense via data synthesis and CoT learning

An arXiv paper titled 'Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning' was published, presenting a research approach to harden LLMs against prompt injection. The work appears to focus on defensive training and evaluation using synthesized diverse data and instruction-level chain-of-thought learning.

[2601.04666] Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning

Dec 9, 20257mo ago

UK NCSC issues warning on growing AI prompt injection risks

The UK National Cyber Security Centre issued a public warning highlighting prompt injection as a growing security risk in AI systems. The advisory appears to represent an official government cybersecurity response, emphasizing the threat posed by untrusted inputs manipulating LLM behavior.

NCSC issues urgent warning over growing AI prompt injection risks - here’s what you need to know | IT Pro

Nov 25, 20257mo ago

ArXiv paper studies and defends against prompt injection in AI browser agents

An arXiv paper titled 'BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents' was published, presenting research focused specifically on how prompt injection affects AI browser agents and how to mitigate it. Based on the title, the work combines threat analysis with defensive techniques tailored to browser-based agent workflows.

[2511.20597] BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents

Nov 6, 20258mo ago

Zenity describes Data-Structure Injection in AI agents

Zenity Labs published research on a prompt-injection-related attack class called Data-Structure Injection (DSI) affecting AI agents. The work appears to define or characterize a distinct technique involving malicious manipulation of structured data consumed by agents, expanding the taxonomy of agent injection risks beyond previously tracked general prompt-injection studies.

Data-Structure Injection (DSI) in AI Agents

Nov 2, 20258mo ago

Paper publishes 'The Attacker Moves Second' on prompt injection

A paper titled 'The Attacker Moves Second' was published as part of a set of newly noted prompt-injection research. Based on the reference title, it represents a distinct research contribution separate from Meta's already-tracked 'Agents Rule of Two' guidance.

New prompt injection papers: Agents Rule of Two and The Attacker Moves Second

Oct 30, 20258mo ago

Paper proposes task-centric access control against instruction injection

An arXiv paper titled 'Who Grants the Agent Power? Defending Against Instruction Injection via Task-Centric Access Control' was published, presenting a defense approach for LLM agents against instruction injection. Based on the title, the work focuses on limiting agent power through task-centric access-control mechanisms rather than relying only on prompt-level safeguards.

[2510.26212] Who Grants the Agent Power? Defending Against Instruction Injection via Task-Centric Access Control

Oct 29, 20258mo ago

OpenPromptInjection benchmark toolkit published on GitHub

A GitHub repository called OpenPromptInjection was published as an open-source toolkit for implementing, evaluating, and extending prompt injection attacks, defenses, and LLM-integrated applications. The project included example attack-evaluation workflows and documented two defensive components, DataSentinel for detection and PromptLocate for localization and recovery of injected prompts.

GitHub - liu00222/Open-Prompt-Injection: This repository provides a benchmark for prompt injection attacks and defenses in LLMs · GitHub

Oct 10, 20259mo ago

Study shows adaptive attacks bypass 12 LLM jailbreak defenses

A 2025 research paper argued that common evaluations of LLM jailbreak and prompt-injection defenses are inadequate because they rely on static or weak attacks rather than adaptive adversaries. Using tuned optimization methods including gradient descent, reinforcement learning, random search, and human-guided exploration, the researchers reported bypassing 12 recent defenses, with attack success rates above 90% for most of them despite prior near-zero claims.

[2510.09023] The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections

Aug 4, 202511mo ago

ArXiv paper studies large reasoning models as autonomous jailbreak agents

An arXiv paper titled 'Large Reasoning Models Are Autonomous Jailbreak Agents' was published, presenting a distinct research contribution on how large reasoning models may autonomously develop or execute jailbreak behavior. Based on the title, the work focuses on reasoning-capable models as active jailbreak agents rather than on a specific defense, benchmark, or previously tracked attack technique.

[2508.04039] Large Reasoning Models Are Autonomous Jailbreak Agents

Jun 10, 20251y ago

Research paper proposes secure design patterns for LLM agents

A June 2025 research paper examined prompt injection as a major risk for LLM-powered agents, especially those with tool access or sensitive data exposure. The authors proposed principled design patterns intended to provide provable resistance to prompt injection and analyzed their security-versus-utility trade-offs through case studies.

[2506.08837] Design Patterns for Securing LLM Agents against Prompt Injections

Jun 1, 20251y ago

ArXiv paper presents practical 'Promptware' attacks on production assistants

An arXiv paper titled 'Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous' was published, presenting research on promptware attacks against production LLM-powered assistants. Based on the title, the work argues these attacks are practically exploitable in real deployed assistant environments and represents a distinct contribution from previously tracked prompt-injection and agent-tool abuse studies.

Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous

Apr 29, 20251y ago

CyberArk describes jailbreak discovery using AI explainability

CyberArk published research on using AI explainability techniques to uncover new jailbreak methods against large language models. The work appears to present a distinct technical approach for identifying or crafting jailbreaks, separate from later studies on adaptive attacks and automated red-teaming.

Unlocking New Jailbreaks with AI Explainability

Apr 11, 20251y ago

Simon Willison highlights CaMeL as a new prompt-injection mitigation approach

Simon Willison published a post discussing CaMeL as a promising new direction for mitigating prompt injection attacks. The reference indicates a distinct public discussion of a specific defense approach not yet represented in the timeline.

CaMeL offers a promising new direction for mitigating prompt injection attacks

Dec 11, 20242y ago

Black Hat Europe talk presents 'SpAIware' prompt injection exploits

At Black Hat Europe 2024, Johann Rehberger presented 'SpAIware: Advanced Prompt Injection Exploits in AI Assistants,' publicly detailing advanced prompt-injection attack techniques against AI assistants and agent-like workflows. The talk helped formalize and publicize exploit chains showing how untrusted content can manipulate LLM-connected tools and actions.

I Blackhat Archive

Oct 28, 20242y ago

ArXiv paper analyzes prompt injection across diverse LLM architectures

An arXiv paper titled 'Systematically Analyzing Prompt Injection Vulnerabilities in Diverse LLM Architectures' was published, presenting a distinct research effort focused on evaluating prompt injection weaknesses across different LLM architectures. Based on the title, the work appears to contribute comparative technical analysis rather than general commentary or mitigation guidance alone.

[2410.23308] Systematically Analyzing Prompt Injection Vulnerabilities in Diverse LLM Architectures

Oct 19, 20242y ago

ArXiv paper studies improper tool use attacks on LLM agents

An arXiv paper titled 'Imprompter: Tricking LLM Agents into Improper Tool Use' was published, presenting research on how LLM agents can be manipulated into misusing their tools. Based on the title, the work focuses on agent-specific prompt-injection or jailbreak behavior involving improper tool invocation rather than general LLM prompt injection alone.

[2410.14923] Imprompter: Tricking LLM Agents into Improper Tool Use

Oct 17, 20242y ago

WIRED reports 'Imprompter' attack extracting personal details via LLM agents

WIRED reported research showing that a prompt-based attack dubbed 'Imprompter' could manipulate AI chatbots and agents into identifying and extracting personal details from user chats. The coverage highlighted prompt-injection risks tied to tool use and sensitive-data access in LLM-powered assistants.

This Prompt Can Make an AI Chatbot Identify and Extract Personal Details From Your Chats | WIRED

Aug 20, 20242y ago

Slack AI data exfiltration via indirect prompt injection discussed

A Hacker News reference highlighted data exfiltration from Slack AI through an indirect prompt injection technique. This appears to be a distinct public disclosure/example of prompt injection impacting an enterprise AI assistant workflow.

Data Exfiltration from Slack AI via indirect prompt injection | Hacker News

Jul 2, 20242y ago

InjecAgent GitHub project published for agent prompt-injection research

The UIUC Kang Lab published the InjecAgent GitHub repository, indicating a distinct public release related to prompt-injection attacks or evaluation in LLM agents. Based on the project name and timing, it represents a separate research artifact focused on agent-specific injection risks not already reflected in the timeline.

GitHub - uiuc-kang-lab/InjecAgent · GitHub

May 13, 20242y ago

Schneier frames prompt injection as LLM data/control-path insecurity

Bruce Schneier published an article arguing that prompt injection in LLMs is a modern instance of the classic security failure of mixing data and control on the same channel. The piece warned that general-purpose LLMs operating on untrusted content are inherently hard to secure against this class of attack and suggested narrower AI systems may be safer in adversarial settings.

LLMs’ Data-Control Path Insecurity - Schneier on Security

Apr 19, 20242y ago

ArXiv paper proposes instruction hierarchy training against prompt injection

An arXiv paper titled 'The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions' was published, presenting a training-based approach to make language models follow higher-priority trusted instructions over lower-priority untrusted ones. Based on the title, the work represents an early defensive research contribution aimed at reducing prompt-injection susceptibility through instruction prioritization.

[2404.13208] The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

Mar 20, 20242y ago

ArXiv paper proposes spotlighting defense for indirect prompt injection

An arXiv paper titled 'Defending Against Indirect Prompt Injection Attacks With Spotlighting' was published, presenting a defense approach aimed specifically at mitigating indirect prompt injection attacks. Based on the title, the work focuses on a distinct 'spotlighting' mitigation technique separate from later parsing-based and attention-based defenses.

[2403.14720] Defending Against Indirect Prompt Injection Attacks With Spotlighting

Feb 9, 20242y ago

ArXiv paper proposes StruQ structured-query defense against prompt injection

An arXiv paper titled 'StruQ: Defending Against Prompt Injection with Structured Queries' was published, presenting a defense that separates trusted prompts from untrusted user data into distinct channels. The approach combines a secure front end with a specially trained model fine-tuned to ignore instructions embedded in the data channel while preserving utility and output quality.

[2402.06363] StruQ: Defending Against Prompt Injection with Structured Queries

Apr 14, 20233y ago

Simon Willison reports markdown-image prompt injection against ChatGPT web

Simon Willison documented a prompt injection attack against the ChatGPT web interface in which markdown images could be used to exfiltrate chat data. The disclosure highlighted an early practical example of prompt injection leading to data theft in a deployed consumer LLM product.

New prompt injection attack on ChatGPT web version. Markdown images can steal your chat data

Feb 10, 20233y ago

Prompt injection attack causes Bing Chat to reveal hidden instructions

Ars Technica reported that Microsoft’s AI-powered Bing Chat could be manipulated via prompt injection to disclose parts of its hidden system prompt and internal operating rules. The incident became an early high-profile public example of prompt injection affecting a deployed consumer LLM product.

AI-powered Bing Chat spills its secrets via prompt injection attack [Updated] - Ars Technica

Nov 17, 20224y ago

ArXiv paper introduces PromptInject attack framework for language models

An arXiv paper titled 'Ignore Previous Prompt: Attack Techniques For Language Models' was published, examining how malicious inputs can misalign transformer-based language models in production-like settings. The authors introduced the PromptInject framework and studied goal hijacking and prompt leaking as distinct attack classes against customer-facing LLM applications.

[2211.09527] Ignore Previous Prompt: Attack Techniques For Language Models

Sep 12, 20224y ago

Simon Willison launches prompt injection series on LLM security

Simon Willison published a series of posts focused on prompt injection as a security vulnerability in software built on top of large language models. The series argued the problem was largely unsolved, distinguished it from jailbreaking, and emphasized reducing blast radius and constraining tool access as the safest practical mitigation approach.

Simon Willison: Prompt injection

Crescendo multi-turn jailbreak technique publicly released

A public project page for 'Crescendo' described a distinct multi-turn jailbreak technique for large language models. The reference indicates a named attack method focused on iterative or conversational jailbreak behavior, separate from previously tracked single-technique jailbreak and prompt-injection studies.

Crescendo

LINKED ENTITIES

Related entities

Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.

38 LINKEDOpen in app

Affected products

11 linked

Claude Code1passwordCursorDockerOpenclawGemini-CliTrufflehogDeepseek-R1OllamaVllmOpentelemetry

Organizations

27 linked

GoogleOpenaiMicrosoft CorporationTom's HardwareLinkedinTopTech VenturesAnthropicReplitAmazon Web ServicesInternational Business Machines1passwordGitHubSophosOpenclawTrend MicroNvidiaPortswiggerDeepseekMistral AICapital OneMeta PlatformsResecurityxAIQwenMLCommonsOllamaMoonshot AI

SOURCE COVERAGE

Sources

50 references tracked. Mallory keeps watching after this page renders.

50 SOURCESView all

Osint Team BlogNews

Jun 2, 2026

OWASP LLM01 in 2026: I Tested the Top 5 Defenses, 4 Failed | by Aeon Flex, Elriel Assoc. 2133 [NEON MAXIMA] | May, 2026 | OSINT Team

osintteam.blog

Open source

ArxivNews

May 21, 2026

[2605.22321] Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

arxiv.org

Open source

Toms HardwareNews

May 17, 2026

LinkedIn recruitment spam becomes Olde English prose after user hides AI prompt injection in bio - bots also also manipulated to address user as ‘My Lord’ | Tom's Hardware

tomshardware.com

Open source

Infosec WriteupsNews

May 13, 2026

🚨 Exploiting Insecure Output Handling in LLMs via Indirect Prompt Injection (XSS) | by Mukilan Baskaran | May, 2026 | InfoSec Write-ups

infosecwriteups.com

Open source

42 additional sources from 19-04-2024 to 12-05-2026

MetaNews

Agents Rule of Two: A Practical Approach to AI Agent Security

ai.meta.com

Open source

Labs ZenityNews

Stealing Copilot's System Prompt

labs.zenity.io

Open source

I Blackhat ArchiveNews

I Blackhat Archive

i.blackhat.com

Open source

UnclassifiedNews

Crescendo

crescendo-the-multiturn-jailbreak.github.io

Open source

The operational view lives in Mallory

See the full picture, correlated to your attack surface.

This page covers what’s public. Mallory adds the parts that aren’t — which of your assets are affected, which threat actors are using it right now, which detections to deploy, and what to do next.

Start free trial

Exposure mapping

Map indicators from this story to your assets and identify affected systems in minutes.

Threat actor evidence

Every observed campaign, victim, and pivot linked to actors named in this story.

Associated malware

Malware, exploits, and IOCs connected to the activity described here.

Detection signatures

YARA, Sigma, and Snort rules deployed to your SIEM as soon as they’re published.

Scheduled alerts

Get matching new stories delivered to your team as they break — not the next morning.

AI threads

Ask questions about this story and take action on the answers.

Prompt Injection and Jailbreak Techniques Targeting LLM-Powered Applications

Get ahead of threats like this

How this story unfolded

Zenity reports theft of Microsoft Copilot system prompt

Meta proposes 'Agents Rule of Two' for AI agent security

A3S-Bench paper exposes evasion weaknesses in autonomous agents

LinkedIn profile prompt injection manipulates recruiter AI outreach

PortSwigger lab shows indirect prompt injection leading to stored XSS

OSINT Team blog tests five prompt-injection defenses and finds four fail

Capital One proposes adaptive automated LLM red-teaming framework

Google studies prompt injections on the public web using Common Crawl

Trend Micro discloses 'sockpuppeting' jailbreak affecting 11 AI models

OWASP ASI01 guide ranks Agent Goal Hijack as top agentic AI risk

NDSS paper proposes attention-based defense for indirect prompt injection

ArXiv paper analyzes AI-agent indirect prompt injection via public competition

ArXiv paper proposes causal-attribution defense for indirect prompt injection

Cloudflare identifies indirect prompt code injection in malicious Workers

GitHub report discloses Chain-of-Logic Injection jailbreak as CVE-2026-3098

OpenReview paper proposes nullspace steering for controlled model subversion

Resecurity details prompt-injection risks and simulated data disclosure

ArXiv paper analyzes prompt injection in agentic coding assistants

Study finds poetic prompts can jailbreak major LLMs

ArXiv paper proposes tool-result parsing defense for indirect prompt injection

Paper proposes prompt-injection defense via data synthesis and CoT learning

UK NCSC issues warning on growing AI prompt injection risks

ArXiv paper studies and defends against prompt injection in AI browser agents

Zenity describes Data-Structure Injection in AI agents

Paper publishes 'The Attacker Moves Second' on prompt injection

Paper proposes task-centric access control against instruction injection

OpenPromptInjection benchmark toolkit published on GitHub

Study shows adaptive attacks bypass 12 LLM jailbreak defenses

ArXiv paper studies large reasoning models as autonomous jailbreak agents

Research paper proposes secure design patterns for LLM agents

ArXiv paper presents practical 'Promptware' attacks on production assistants

CyberArk describes jailbreak discovery using AI explainability

Simon Willison highlights CaMeL as a new prompt-injection mitigation approach

Black Hat Europe talk presents 'SpAIware' prompt injection exploits

ArXiv paper analyzes prompt injection across diverse LLM architectures

ArXiv paper studies improper tool use attacks on LLM agents

WIRED reports 'Imprompter' attack extracting personal details via LLM agents

Slack AI data exfiltration via indirect prompt injection discussed

InjecAgent GitHub project published for agent prompt-injection research

Schneier frames prompt injection as LLM data/control-path insecurity

ArXiv paper proposes instruction hierarchy training against prompt injection

ArXiv paper proposes spotlighting defense for indirect prompt injection

ArXiv paper proposes StruQ structured-query defense against prompt injection

Simon Willison reports markdown-image prompt injection against ChatGPT web

Prompt injection attack causes Bing Chat to reveal hidden instructions

ArXiv paper introduces PromptInject attack framework for language models

Simon Willison launches prompt injection series on LLM security

Crescendo multi-turn jailbreak technique publicly released

Related entities

Sources

OWASP LLM01 in 2026: I Tested the Top 5 Defenses, 4 Failed | by Aeon Flex, Elriel Assoc. 2133 [NEON MAXIMA] | May, 2026 | OSINT Team

[2605.22321] Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

LinkedIn recruitment spam becomes Olde English prose after user hides AI prompt injection in bio - bots also also manipulated to address user as ‘My Lord’ | Tom's Hardware

🚨 Exploiting Insecure Output Handling in LLMs via Indirect Prompt Injection (XSS) | by Mukilan Baskaran | May, 2026 | InfoSec Write-ups

Agents Rule of Two: A Practical Approach to AI Agent Security

Stealing Copilot's System Prompt

I Blackhat Archive

Crescendo

See the full picture, correlated to your attack surface.

Related stories

Prompt Injection and Jailbreak Attacks on Large Language Models

Cisco Testing Finds Open-Weight LLMs Highly Susceptible to Multi-Turn Jailbreaks

LLM Guardrail Bypass and Prompt Injection Weaknesses