GPT-5.5 Matches or Surpasses Mythos in Offensive Cybersecurity Tests

EVENT TIMELINE

How this story unfolded

3 events from the most recent confirmed update back to the earliest known activity.

May 13, 2026

GPT-5.5 reported as generally available

A later report stated that GPT-5.5 is generally available. This was presented alongside discussion of its vulnerability-finding performance relative to Mythos.

OpenAI's GPT-5.5 is as Good as Mythos at Finding Security Vulnerabilities - Schneier on Security

May 4, 2026

AISI warns rising AI cyber capability may drive misuse concerns

AISI interpreted the benchmark results as evidence that AI cyber capabilities are increasing rapidly due to greater autonomy, stronger reasoning, and improved coding ability. The reported findings raised concerns about near-term misuse and led to restrictions on GPT-5.5's full public availability.

Cybersécurité et IA : GPT-5.5 surclasse déjà Mythos et change l'é ...

AISI evaluates GPT-5.5 on offensive cybersecurity benchmarks

The UK AI Security Institute evaluated GPT-5.5 using an offensive cybersecurity benchmark built around 95 Capture The Flag challenges and longer multi-step intrusion simulations. The testing found GPT-5.5 leading on the benchmark overall, while Mythos Preview retained an edge in some extended attack simulations.

Cybersécurité et IA : GPT-5.5 surclasse déjà Mythos et change l'é ...

LINKED ENTITIES

Related entities

Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.

Organizations

Anthropic, OpenAI, Google

SOURCE COVERAGE

Sources

3 references tracked. Mallory keeps watching after this page renders.

Schneier on Security (News)

May 13, 2026

OpenAI's GPT-5.5 is as Good as Mythos at Finding Security Vulnerabilities - Schneier on Security

schneier.com

ZDNet (News)

May 4, 2026

Cybersécurité et IA : GPT-5.5 surclasse déjà Mythos et change l'é ...

zdnet.fr

Unclassified (News)

May 1, 2026

Amid Mythos' Hyped Cybersecurity Prowess, Researchers Find GPT-5.5 Is Just as Good

ground.news

ON THE SAME THREAD

Anthropic Mythos Raises Alarm With Rapid Gains in AI Cyber Capability

Anthropic’s **Claude Mythos Preview** has been rolled out to a limited set of major technology and infrastructure firms under **Project Glasswing**, where partners including Amazon, Apple, Microsoft, Cisco, CrowdStrike, Palo Alto Networks, Broadcom, and the Linux Foundation are using it for defensive security work such as vulnerability discovery in first-party and open-source software. Anthropic said the model found **thousands of high-severity and zero-day vulnerabilities**, while Mozilla reported Mythos identified **271 Firefox flaws**, far above the 22 bugs previously found with an earlier Anthropic model. Mozilla said the findings suggest AI-assisted code reasoning is approaching the breadth of elite human vulnerability research, even if the bugs themselves were still understandable to human experts. Separate findings from the UK’s **AI Security Institute (AISI)** indicate that frontier models’ autonomous cyber capabilities are advancing faster than earlier forecasts and that common evaluation methods may be understating their performance. AISI reported the `80%`-reliability cyber time horizon has been doubling every **4.7 months** since late 2024, with Claude Mythos Preview and `GPT-5.5` outperforming that trend, and said Mythos became the first model to complete both evaluated enterprise cyber ranges, including the previously unsolved industrial-control scenario **Cooling Tower**. AISI also found that larger inference budgets—up to **50 million tokens** or **1,000 turns**—unlock materially higher success rates on difficult cyber tasks, reinforcing concerns that increasingly capable AI systems could strengthen defenders while also lowering the barrier to advanced offensive cyber operations if access is not tightly controlled.

Jun 29, 2026
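
To give a sense of scale for the **4.7-month** doubling time that AISI reported for the `80%`-reliability cyber time horizon, the arithmetic can be sketched as follows. This is a minimal illustration that assumes clean exponential growth at a constant doubling time; it does not reproduce AISI's methodology, and the function names `growth_factor` and `months_to_reach` are hypothetical helpers introduced here.

```python
import math

# Illustrative arithmetic only. Assumes constant exponential growth at the
# doubling time quoted in the summary above; AISI's own method is not shown.
DOUBLING_MONTHS = 4.7


def growth_factor(months: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Multiplicative growth of the time horizon after `months` months."""
    return 2.0 ** (months / doubling_months)


def months_to_reach(factor: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed for the time horizon to grow by `factor`."""
    return doubling_months * math.log2(factor)


if __name__ == "__main__":
    for months in (4.7, 12.0, 24.0):
        print(f"{months:>5.1f} months -> x{growth_factor(months):.1f}")
    print(f"10x growth takes about {months_to_reach(10):.1f} months")
```

Under that assumption, one year of growth corresponds to a factor of roughly 5.9. Because AISI said Mythos Preview and `GPT-5.5` outperformed the trend, a fixed doubling time would, if anything, understate the most recent results.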

OpenAI Daybreak and Anthropic Glasswing debut amid rapid gains in AI cyber capability

OpenAI launched **Daybreak**, a cybersecurity initiative built around `GPT-5.5`, a tiered **Trusted Access for Cyber** framework, and **Codex Security** as its agent harness, positioning it alongside Anthropic’s earlier **Project Glasswing**. Both programs advertise similar defender-focused uses, including vulnerability discovery, exploit validation in authorized environments, patch validation, and detection engineering, while differing mainly in governance and access controls. Major security vendors including **Cisco**, **CrowdStrike**, and **Palo Alto Networks** joined both efforts, signaling that large defenders are pursuing model-agnostic strategies rather than committing to a single AI provider. The launches come as the UK AI Security Institute reported that autonomous AI cyber capability is advancing quickly, with the 80%-reliability cyber task time horizon previously doubling every **4.7 months** and new results from **Claude Mythos Preview** and `GPT-5.5` exceeding that trend. A newer Claude Mythos Preview checkpoint became the first model to complete both AISI cyber ranges, including the previously unsolved **Cooling Tower**, while `GPT-5.5` also posted strong results on **The Last Ones**; AISI said benchmark differences between the leading models are now narrow enough that practical differentiation is shifting toward agent harnesses, access restrictions, auditability, and partner ecosystems. The institute warned that current tests likely understate real-world capability and urged organizations to strengthen security baselines as both the defensive value and the cyber risk of frontier models increase.

Jun 29, 2026

Anthropic Restricts Mythos AI After It Finds and Exploits Thousands of Software Flaws

Anthropic unveiled **Claude Mythos Preview**, an unreleased AI model it says can autonomously discover and exploit severe software vulnerabilities across major operating systems, browsers, open-source projects, and some closed-source targets. The company said the model uncovered thousands of high-severity flaws, including long-lived bugs in **OpenBSD**, **FFmpeg**, Linux kernel privilege-escalation chains, and **`CVE-2026-4747`**, a FreeBSD NFS remote code execution flaw that could enable unauthenticated root access. Anthropic withheld broad release, citing offensive cyber risk, and instead launched **Project Glasswing**, a gated program for roughly 40 to 50 partners such as AWS, Apple, Cisco, Cloudflare, Google, JPMorgan Chase, Microsoft, Mozilla, NVIDIA, and Palo Alto Networks to validate findings, patch affected software, and study defensive uses. Independent and industry assessments broadly agreed Mythos marks a significant advance in AI-enabled cyber capability, though several researchers questioned how much of Anthropic’s headline claims can yet be verified through public CVEs and warned that similar results may be reproducible with cheaper or open models plus strong tooling. The UK AI Security Institute found Mythos achieved a **73%** success rate on expert capture-the-flag tasks and completed a full 32-step simulated enterprise attack in 3 of 10 runs, while Anthropic later reported coordinated disclosure activity spanning **1,596 vulnerabilities across 281 open-source projects** and partners identifying more than **10,000** high- or critical-severity candidates. Governments, financial regulators, and CISO groups in the US, UK, Europe, Canada, and Japan responded with briefings and warnings that AI is compressing the gap between vulnerability discovery and weaponization, leaving remediation, patch governance, and defensive automation as the main bottlenecks.

Jun 29, 2026

Related stories

Anthropic Mythos Raises Alarm With Rapid Gains in AI Cyber Capability

OpenAI Daybreak and Anthropic Glasswing debut amid rapid gains in AI cyber capability

Anthropic Restricts Mythos AI After It Finds and Exploits Thousands of Software Flaws