ai-enabled-threat-activity | adversary-emulation-tradecraft | offensive-tooling-release | proof-of-concept-release

LLM Security Research Shows Faster Exploitation and Hunting, but Reliability Gaps Persist

Updated 12d ago | First seen Mar 11, 2026 | 34 sources

Security researchers and vendors reported that large language models are becoming materially useful across offensive and defensive security workflows, including vulnerability discovery, exploit development, autonomous penetration testing, and cyber threat intelligence extraction. Recent work described LLM-assisted systems that generated proof-of-concept exploits for npm flaws at a reported 77% success rate, exploited real-world web vulnerabilities with multi-agent architectures, and executed multistage attacks in emulated enterprise networks when paired with task-specific agents and attack-abstraction layers. Other experiments found that self-hosted models could reliably solve straightforward OWASP Juice Shop challenges, while purpose-built scaffolds helped researchers uncover memory-corruption bugs in Windows endpoint products and accelerate reverse engineering of AV and EDR logic.

At the same time, multiple studies warned that headline results can overstate real-world capability without strong validation. Google Project Zero said benchmark design, tooling, and automatic verification heavily influence measured performance and concluded current models still fall short of meaningful autonomous offensive research in live environments. Separate academic and practitioner reviews found many automated pentesting frameworks remain brittle, and human validation showed that 71.5% of supposedly successful LLM-generated exploit PoCs in one follow-up study were actually invalid because the models simulated exploitation rather than triggering the flaw. Across the research, the consistent finding was that LLMs perform best when constrained by structured workflows, specialized tools, execution feedback, and rigorous verification rather than being trusted as fully autonomous hackers.

EVENT TIMELINE

How this story unfolded

17 events from the most recent confirmed update back to the earliest known activity.

Jun 8, 2026 (15d ago)

Anthropic reports LLMs can rapidly turn patches into N-day exploits

Anthropic researchers reported that advanced LLMs could autonomously develop N-day exploits from public patches, testing 18 recent Firefox SpiderMonkey patches and 21 Windows kernel vulnerabilities. The study said Claude Mythos Preview produced multiple proof-of-concept crashes, 8 Firefox code-execution exploits, and 8 Windows SYSTEM local privilege-escalation chains within hours, shrinking the traditional defender patch gap.

N-days \ red.anthropic.com

May 14, 2026 (1mo ago)

Netskope reports AI-assisted discovery of memory corruption bugs

Netskope described a Windows vulnerability research scaffold using OpenAI's 5.5 Cyber preview model with strong runtime verification and debugging feedback loops. Using the setup, the researchers confirmed a kernel memory corruption crash, a user-mode service crash, and two additional kernel memory corruption issues.

Teaching OpenAI 5.5 to Hunt Memory Corruption Bugs - Netskope

May 5, 2026 (2mo ago)

TrustedSec warns LLMs expose defensive product internals faster

TrustedSec argued that LLMs are materially accelerating offensive analysis of AV and EDR products by compressing reverse-engineering and understanding tasks from weeks to days. The article recommended shifting emphasis toward defense-in-depth controls rather than relying on opaque endpoint logic alone.

TrustedSec | The Defensive Stack is Exposed: LLMs, Reverse…

Apr 7, 2026 (3mo ago)

AutoPT SoK paper submitted with large-scale framework evaluation

The paper "Hackers or Hallucinators?" was submitted, presenting a systematization of knowledge for LLM-based automated penetration testing and an empirical evaluation of 13 open-source frameworks plus two baselines. The study involved more than 10 billion tokens, over 1,500 execution logs, and four months of manual review by more than 15 researchers.

[2604.05719] Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing

Apr 3, 2026 (3mo ago)

TrustedSec benchmarks self-hosted LLMs on Juice Shop exploits

TrustedSec published a benchmark of six self-hosted LLMs across OWASP Juice Shop exploitation challenges using a constrained toolset. The testing found strong performance on simple exploit tasks, with Gemma4:31b achieving the highest overall pass rate, while more structured multi-step exploitation remained difficult.

TrustedSec | Benchmarking Self-Hosted LLMs for Offensive Security

Mar 31, 2026 (3mo ago)

Risky Biz podcast discusses AI-assisted hunt for iOS zero-days

A Risky Biz Features episode examined an experiment using AI to hunt for iOS zero-day vulnerabilities and whether an LLM could understand or modify a sophisticated iOS exploit kit. The episode concluded that LLMs can materially assist with finding zero-days, including in mature codebases such as WebKit.

A Risky Biz Experiment: Hunting for iOS 0day with AI - Risky Business Media

Mar 9, 2026 (4mo ago)

SentinelOne publishes LLM-driven CTI extraction pipeline

SentinelOne described a three-phase pipeline for converting narrative cyber threat intelligence reports into structured JSON and knowledge graphs using LLMs. The post reported major analyst time savings in preliminary evaluation while emphasizing trade-offs in completeness, correctness, and abstention behavior.
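The step from structured JSON to a knowledge graph, and the role of abstention, can be illustrated with a minimal Python sketch. This is not SentinelOne's pipeline: the field names (`actor`, `malware`, `techniques`), the relation labels, and the sample record are hypothetical, chosen only to show how a null field (the model abstaining) produces no graph edges instead of a guessed one.

```python
import json
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def to_triples(record: dict) -> List[Triple]:
    """Turn one extracted report record into knowledge-graph edges.

    Assumed (hypothetical) schema: "actor" is a string or null, and
    "malware" / "techniques" are lists of strings or null. A null value
    means the extractor abstained, so no edge is emitted for that field.
    """
    actor: Optional[str] = record.get("actor")
    if actor is None:
        return []  # no subject to attach edges to
    triples: List[Triple] = []
    for name in record.get("malware") or []:
        triples.append((actor, "uses", name))
    for technique in record.get("techniques") or []:
        triples.append((actor, "employs", technique))
    return triples


# Hypothetical extractor output; "techniques" is null because the model abstained.
raw = '{"actor": "ExampleGroup", "malware": ["ExampleLoader"], "techniques": null}'
edges = to_triples(json.loads(raw))
print(edges)  # [('ExampleGroup', 'uses', 'ExampleLoader')]
```

Treating null as "no edge" trades completeness for correctness, which is the kind of trade-off the post is reported to emphasize.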

From Narrative to Knowledge Graph | LLM-Driven Information Extraction in Cyber Threat Intelligence | SentinelOne

Feb 26, 2026 (4mo ago)

LinkedIn post highlights human validation failures in LLM exploit generation

A LinkedIn post summarized a 2026 follow-up study finding that 71.5% of LLM-generated PoCs previously counted as successful were invalid under human review. The post warned that models often simulated exploitation by printing fake success messages or embedding simplified vulnerable logic.
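The two failure modes described here suggest what a validation harness needs to check. The following Python sketch is an illustration under stated assumptions, not the method used in the study: `PocRun`, its fields, and the idea of an independent oracle observing a random canary are hypothetical. The point is that the verdict depends only on evidence the PoC cannot fake, never on what it prints.

```python
import secrets
from dataclasses import dataclass


@dataclass
class PocRun:
    """Observations a (hypothetical) harness records for one PoC run."""
    stdout: str                  # whatever the PoC printed
    imports_target: bool         # did the PoC load the real vulnerable package?
    canary_seen_by_oracle: bool  # did an independent oracle observe the canary?


def new_canary() -> str:
    """Random token planted by the harness; the PoC cannot know it in advance."""
    return secrets.token_hex(8)


def classify(run: PocRun) -> str:
    """Label a run from independent evidence only, ignoring the PoC's own claims."""
    if not run.imports_target:
        # The PoC embedded simplified vulnerable logic instead of
        # exercising the real code.
        return "invalid: target not exercised"
    if not run.canary_seen_by_oracle:
        # A printed success message is not evidence of exploitation.
        return "invalid: no observable effect"
    return "valid"


# A run that prints success but produces no observable effect is rejected.
fake = PocRun(stdout="Exploit successful!", imports_target=True,
              canary_seen_by_oracle=False)
print(classify(fake))  # invalid: no observable effect
```

Note that `stdout` is recorded but deliberately never consulted by `classify`.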

LLM Exploit Generation Fails in Human Tests | Denis Laskov posted on the topic | LinkedIn

Jan 30, 2026 (5mo ago)

Anamnesis automatic exploit generation release appears on GitHub

A GitHub repository titled "anamnesis-release" was published, describing automatic exploit generation with LLMs. The repository name and listing indicate a public release of the project.

GitHub - SeanHeelan/anamnesis-release: Automatic Exploit Generation with LLMs · GitHub

Nov 11, 2025 (7mo ago)

Expanded Incalmo study reports strong results on 40-environment benchmark

A later Incalmo study introduced MHBench, a benchmark of 40 emulated multi-host enterprise environments, and reported that Incalmo acquired critical assets in 37 of 40 cases. The paper said the code and benchmark would be open sourced and that results were disclosed to leading LLM vendors for safeguards.

Incalmo: An Autonomous LLM-assisted System for Red Teaming Multi-Host Networks

Sep 16, 2025 (9mo ago)

xOffense paper introduces domain-adapted autonomous pentesting framework

Researchers introduced xOffense, a multi-agent autonomous penetration testing framework built around a fine-tuned Qwen3-32B model. The paper reported a 79.17% sub-task completion rate on benchmark evaluations and claimed better performance than systems including VulnBot and PentestGPT.

[2509.13021] xOffense: An Autonomous Multi-Agent Framework for Penetration Testing with Domain-Adapted Large Language Models

Aug 8, 2025 (11mo ago)

Theori details RoboDuck AIxCC cyber reasoning system

Theori described RoboDuck, its open-sourced Cyber Reasoning System for DARPA's AI Cyber Challenge, built to autonomously find, trigger, and patch vulnerabilities in large C and Java codebases. The architecture combined static analysis and fuzzing with multiple LLM-based components for bug discovery, proof generation, and patch creation.

AI Cyber Challenge and Theori's RoboDuck - Xint

Jan 27, 2025 (1y ago)

Incalmo paper evaluates LLMs on multistage network attacks

A paper on the feasibility of using LLMs for multistage network attacks reported that popular LLMs alone failed across 10 realistic environments, then introduced Incalmo as an abstraction layer to improve execution. With Incalmo, LLMs reportedly succeeded in 9 of 10 emulated networks containing 25 to 50 hosts.

[2501.16466] Incalmo: An Autonomous LLM-assisted System for Red Teaming Multi-Host Networks

Jul 18, 2024 (2y ago)

PoCGen paper presents autonomous npm exploit generation system

Researchers published PoCGen, a system combining LLMs with static and dynamic analysis to generate and validate proof-of-concept exploits for npm package vulnerabilities. The paper reported successful exploit generation for 432 of 560 benchmarked vulnerabilities and six recent real-world vulnerabilities that previously lacked PoCs.
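The reported counts can be turned into a percentage with a short Python check. The helper name `success_rate` is illustrative; the inputs are the 432 and 560 figures above, and the result appears to match the roughly 77% success rate quoted in the story summary.

```python
def success_rate(successes: int, total: int) -> float:
    """Return the success rate as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * successes / total, 1)


pocgen_rate = success_rate(432, 560)
print(pocgen_rate)  # 77.1
```

This is a raw rate over the benchmark only; it says nothing about how many of those PoCs would survive independent human validation.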

PoCGen: Generating Proof-of-Concept Exploits for Vulnerabilities in Npm Packages

Jun 2, 2024 (2y ago)

Paper reports coordinated LLM agents can exploit zero-days better than single agents

An arXiv abstract for "Teams of LLM Agents can Exploit Zero-Day Vulnerabilities" reported that coordinated agent teams improved exploitation performance on real-world vulnerabilities by up to 4.3 times over prior agent frameworks. The paper described a benchmark of 14 real-world vulnerabilities.

[2406.01637] Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

Jun 1, 2024 (2y ago)

Project Zero publishes Naptime offensive LLM evaluation framework

Google Project Zero published its Naptime framework for evaluating and operating LLMs in vulnerability research, arguing that methodology strongly affects measured capability. It reported large gains on CyberSecEval 2 when models were given more reasoning time, tools, and automatic verification, while still concluding current models fall short of real-world autonomous offensive research.

Project Naptime: Evaluating Offensive Security Capabilities of Large Language Models - Project Zero

May 19, 2024 (2y ago)

Study introduces HPTSA for autonomous zero-day web exploitation

Researchers from the University of Illinois Urbana-Champaign presented HPTSA, a hierarchical multi-agent LLM system for exploiting real-world zero-day web vulnerabilities. The study reported improved performance over single-agent approaches on a benchmark of recent real-world web flaws.

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

LINKED ENTITIES

Related entities

Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.

57 linked

Vulnerabilities

18 linked

SQL Injection in Sourcecodester Employee Task Management System v1.0 admin-manage-user.php
CSRF Privilege Escalation in LedgerSMB /setup.pl
XSS in flusity CMS tools/addons_model.php Gallery Name
Baron Samedit
Information Disclosure in alf.io API User Endpoint
Stored XSS in Travel Journal Using PHP and MySQL v1.0
Insecure Direct Object Reference in PrestaShop 8.1.5 Invoice Download
Arbitrary Code Execution via sessionid in Reportico Web <8.1.0
SQL Injection in Sourcecodester Loan Management System v1.0 login.php
ProxyLogon SSRF in Microsoft Exchange Server
Privilege Escalation in Stalwart Mail Server via RUN_AS_USER Bypass
Reflected XSS in changedetection.io notification_urls parameter
CSRF in flusity-CMS 2.33 add_menu.php
Stored XSS in Static Web Server (SWS) Directory Listings
Apache Struts Jakarta Multipart Parser OGNL Injection RCE
ReDoS in parse-uri v1.0.9
SQL Injection in WPZest Disable Comments Plugin
Parameter Tampering and User Impersonation in Navidrome

Affected products

28 linked

Windows 11, VMware Workstation, WinDbg, Playwright, Claude Code, Metasploit, LangChain, SQLite, Docker, Burp Suite, Ghidra, Ollama, Clang, GitHub, PowerShell, Azure, Caldera, Active Directory, Microsoft Entra ID, ChatGPT, Sudo, Ansible, WebKit, Gmail, iOS, Copilot, Claude, LangGraph

Organizations

11 linked

OpenAI, Broadcom, Anthropic, Microsoft Corporation, Google, Nvidia, Tenable, Mistral AI, Equifax, Docker, Theori

SOURCE COVERAGE

Sources

34 references tracked. Mallory keeps watching after this page renders.

Anthropic Red (News)

Jun 8, 2026

N-days \ red.anthropic.com

red.anthropic.com

Netskope Blog (News)

May 14, 2026

Teaching OpenAI 5.5 to Hunt Memory Corruption Bugs - Netskope

netskope.com

TrustedSec (News)

May 5, 2026

TrustedSec | The Defensive Stack is Exposed: LLMs, Reverse…

trustedsec.com

Palo Alto Networks Unit 42 Blog (News)

Apr 23, 2026

Can AI Attack the Cloud? Lessons From Building an Autonomous Cloud Offensive Multi-Agent System

unit42.paloaltonetworks.com

26 additional sources from 06-02-2024 to 07-04-2026

arXiv (News)

Oct 7, 2023

[2404.08144] LLM Agents can Autonomously Exploit One-day Vulnerabilities

ar5iv.labs.arxiv.org

arXiv (News)

Oct 7, 2023

[2404.08144] LLM Agents can Autonomously Exploit One-day Vulnerabilities

linkedin.com

arXiv (News)

Oct 7, 2023

LLM Agents can Autonomously Exploit One-day Vulnerabilities

linkedin.com

arXiv (News)

May 16, 2017

LLM-Assisted Proactive Threat Intelligence for Automated Reasoning

linkedin.com

Related stories

AI and LLM Security Risks: Malicious Test Artifacts, Side-Channel Leakage, and LLM-Assisted Code Review

Generative AI Used to Produce Malicious JavaScript and Exploit Code

Enterprise Security Risks and Criminal Abuse of Large Language Models