Mallory
ai-enabled-threat-activity · adversary-emulation-tradecraft · offensive-tooling-release · autonomous-system-security

GPT-5.5 Matches or Surpasses Mythos in Offensive Cybersecurity Tests

Updated 2d ago · First seen May 1, 2026 · 3 sources

The UK’s AI Security Institute found that OpenAI’s GPT-5.5 performed at or above the level of Anthropic’s Mythos Preview in offensive cybersecurity evaluations, underscoring a sharp rise in frontier-model cyber capability. In an AISI benchmark built around 95 Capture The Flag challenges, GPT-5.5 reportedly led overall on advanced tasks with a 71.4% score, ahead of Mythos Preview at 68.6% and GPT-5.4 at 52.4%. The tested scenarios included realistic security work such as firmware analysis without source code, exploitation of memory-corruption bugs including use-after-free, cryptographic bypass, attacks on weak random-number generators, and analysis of obfuscated malware.

The results also showed limits: Mythos Preview reportedly remained stronger in longer, multi-step intrusion simulations, including a 32-step exercise known as “The Last Ones,” and neither model completed a simulated industrial-system attack. Researchers and commentators said the findings point to rapid gains driven by greater autonomy, stronger reasoning, and improved coding ability, increasing concern over near-term misuse. Reports differed on deployment details, with some noting restrictions on GPT-5.5’s full public availability while others described the model as generally available, but the evaluations broadly agreed that leading AI systems are now highly capable at vulnerability discovery and offensive security tasks.


EVENT TIMELINE

How this story unfolded

3 events from the most recent confirmed update back to the earliest known activity.

May 13, 2026 · 2mo ago

GPT-5.5 reported as generally available

A later report stated that GPT-5.5 is generally available. This was presented alongside discussion of its vulnerability-finding performance relative to Mythos.

OpenAI's GPT-5.5 is as Good as Mythos at Finding Security Vulnerabilities - Schneier on Security

May 4, 2026 · 2mo ago

AISI warns rising AI cyber capability may drive misuse concerns

AISI interpreted the benchmark results as evidence that AI cyber capabilities are increasing rapidly due to greater autonomy, stronger reasoning, and improved coding ability. The reported findings raised concerns about near-term misuse and led to restrictions on GPT-5.5's full public availability.

Cybersécurité et IA : GPT-5.5 surclasse déjà Mythos et change l'é ...

AISI evaluates GPT-5.5 on offensive cybersecurity benchmarks

The UK AI Security Institute evaluated GPT-5.5 using an offensive cybersecurity benchmark built around 95 Capture The Flag challenges and longer multi-step intrusion simulations. The testing found GPT-5.5 leading on the benchmark overall, while Mythos Preview retained an edge in some extended attack simulations.

Cybersécurité et IA : GPT-5.5 surclasse déjà Mythos et change l'é ...
LINKED ENTITIES

Related entities

Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.

Organizations
3 linked
Anthropic, OpenAI, Google