Security Risks and Privacy Challenges of Large Language Models in AI Systems
Large language models (LLMs) present a dual-use dilemma in cybersecurity: the same capabilities serve both defensive and offensive purposes. Researchers at Palo Alto Networks Unit 42 have identified purpose-built malicious LLMs, such as WormGPT and KawaiiGPT, that are designed to facilitate cybercrime by generating convincing phishing content and rapidly producing or modifying malicious code. What separates beneficial from harmful use is largely developer intent and the presence or absence of ethical safeguards, which raises concerns about the proliferation of offensive AI tools in the threat landscape.
Beyond malicious use, LLMs face significant privacy and security challenges stemming from contextual integrity failures and regulation-driven censorship. Contextual integrity is the principle that information should flow only in ways appropriate to the context in which it was shared. Research from Microsoft highlights the need for AI agents to respect these contextual privacy norms, as current models may inadvertently leak sensitive information. Meanwhile, the DeepSeek-R1 model demonstrates how geopolitical censorship mechanisms can introduce security flaws, such as insecure code generation and broken authentication, especially when handling politically sensitive prompts. These issues underscore the need for robust privacy controls and security-aware development practices when deploying LLM-powered systems.
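To make the contextual integrity idea concrete, the following is a minimal sketch of a disclosure filter an AI agent might apply before sharing user data. It is not the approach described in the Microsoft research; the `Flow` type, the `ALLOWED_FLOWS` table, and the `filter_disclosures` function are hypothetical names invented for illustration, and a real system would need far richer norms than a static allow-list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One proposed information flow: what kind of data goes to whom, in which context."""
    data_type: str
    recipient: str
    context: str


# Hypothetical norms: the only flows this toy agent treats as appropriate.
ALLOWED_FLOWS = {
    Flow("medical_history", "physician", "healthcare"),
    Flow("home_address", "courier", "delivery"),
}


def is_appropriate(flow: Flow) -> bool:
    """Return True only if the flow matches a known contextual norm."""
    return flow in ALLOWED_FLOWS


def filter_disclosures(candidates: dict[str, str], recipient: str, context: str) -> dict[str, str]:
    """Keep only the fields that may be shared with this recipient in this context."""
    return {
        data_type: value
        for data_type, value in candidates.items()
        if is_appropriate(Flow(data_type, recipient, context))
    }


if __name__ == "__main__":
    # Illustrative user profile; values are made up.
    profile = {"medical_history": "asthma", "home_address": "1 Example St"}
    print(filter_disclosures(profile, "courier", "delivery"))
```

The sketch denies by default: any combination of data type, recipient, and context not explicitly listed is withheld, which is the conservative choice when the cost of a leak outweighs the cost of an unanswered request.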

How this story unfolded
Four events, listed from the most recent confirmed update back to the earliest known activity.
Unit 42 publishes analysis of malicious LLMs' dual-use risks
Palo Alto Networks Unit 42 published research detailing how malicious LLMs such as WormGPT 4 and KawaiiGPT can support phishing, malware scaffolding, reconnaissance, and ransomware workflows. The report argues that commercialization and democratization of these tools are making AI-enabled cybercrime more scalable.
Commercialized 'WormGPT 4' appears on Telegram and underground forums
After the original WormGPT's emergence, a successor branded 'WormGPT 4' was promoted as a subscription service through Telegram and underground forums. Unit 42 describes this as a sign of commercialization of malicious LLMs and broader access for less-skilled threat actors.
KawaiiGPT 2.5 is identified as a free GitHub-available malicious tool
Unit 42 reports that KawaiiGPT version 2.5 was identified in July 2025 as a freely available tool on GitHub. The model was described as capable of generating spear-phishing lures, lateral-movement scripts, and data-exfiltration code.
WormGPT emerges as a malicious LLM offering offensive capabilities
Unit 42 says WormGPT first emerged in July 2023 as a purpose-built malicious large language model, reportedly based on GPT-J 6B and trained on malicious datasets. It was marketed for criminal use cases such as phishing, malware development, and other offensive tasks with safety guardrails removed.
Sources
The Dual-Use Dilemma of AI: Malicious LLMs
unit42.paloaltonetworks.com
Reducing Privacy leaks in AI: Two approaches to contextual integrity
microsoft.com
DeepSeek LLM’s geopolitical censorship triggers security flaws
scworld.com


