AI Security Risks and Emerging Tooling for Testing LLMs and Agentic Systems
Security reporting and vendor research highlighted accelerating AI/LLM security exposure as enterprises deploy generative AI and autonomous agents faster than defensive controls mature. Commonly cited weaknesses included prompt injection (reported as succeeding against a majority of tested LLMs), training-data poisoning, malicious packages in model repositories, and real-world deepfake-enabled fraud; one example referenced prior disclosure that a China-linked actor weaponized an autonomous coding/agent tool by breaking malicious objectives into benign-looking subtasks. Separately, commentary on AppSec programs argued that AI-assisted development is amplifying alert volumes and making traditional SAST triage increasingly impractical, pushing organizations toward more runtime and workflow-embedded testing approaches.
New and emerging tooling and practices are being positioned to address these risks, including an open-source scanner (Augustus, by Praetorian) that automates 210+ adversarial test techniques across 28 LLM providers as a portable Go binary intended for CI/CD and red-team workflows, and discussion of autonomous AI pentesting tools (e.g., Shannon) that require sensitive inputs such as source code, repo context, and API keys—raising governance and data-handling concerns even when used defensively. Several other items in the set (phishing/XWorm activity, healthcare extortion group “Insomnia,” Singapore telco intrusions attributed to UNC3886, and help-desk payroll fraud) describe unrelated threat activity and do not materially change the AI-security-focused picture.

Get ahead of threats like this
Mallory correlates global threat intelligence with your attack surface — know if you’re exposed before adversaries strike.
How this story unfolded
3 events from the most recent confirmed update back to the earliest known activity.
Anthropic releases Claude Opus 4.6 with cyber-misuse detection layer
Anthropic released Claude Opus 4.6 and added a new detection layer intended to identify and respond to cyber misuse of Claude. The release was noted in reporting about growing risks from autonomous and agentic AI security tools.
Cisco Talos discloses UAT-9921 and its VoidLink framework
Cisco Talos reported discovering a new threat actor, UAT-9921, using the advanced VoidLink framework to target primarily Linux systems. Talos said the framework supports modular plugins and evasion features, and published Snort and ClamAV detections to help defenders identify related activity.
Praetorian releases Augustus open-source LLM vulnerability scanner
Praetorian released Augustus, an open-source LLM security scanner designed for production testing. The Go-based tool supports 210+ adversarial probes across 47 attack categories and 28 LLM providers, with result export and probe-transformation features for evaluating guardrail bypasses.
Related entities
Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.
Sources
4 references tracked. Mallory keeps watching after this page renders.
The Future of DAST: Why Runtime Security Testing Matters in the AI Era | Cybersecurity Dive
stackhawk.com
Open sourceThese 4 critical AI vulnerabilities are being exploited faster than defenders can respond | ZDNET
zdnet.com
Open sourceHand over the keys for Shannon’s shenanigans
blog.talosintelligence.com
Open sourceAugustus - Open-source LLM Vulnerability Scanner With 210+ Attacks Across 28 LLM Providers
cybersecuritynews.com
Open sourceSee the full picture, correlated to your attack surface.
Map indicators from this story to your assets and identify affected systems in minutes.
Every observed campaign, victim, and pivot linked to actors named in this story.
Malware, exploits, and IOCs connected to the activity described here.
YARA, Sigma, and Snort rules deployed to your SIEM as soon as they’re published.
Get matching new stories delivered to your team as they break — not the next morning.
Ask questions about this story and take action on the answers.


