Growing Use of LLMs to Automate Offensive Security and Threat Intelligence Workflows
Multiple security researchers and vendors reported rapid adoption of LLM-driven automation across both offensive and defensive security workflows, with a focus on turning traditionally manual, expert-led tasks into semi- or fully-automated pipelines. Black Lantern Security described how “agentic” LLM tooling is being positioned as a terminal-native partner for offensive security engineers, potentially orchestrating common testing stacks and accelerating repetitive penetration testing activities, while also introducing new operational and safety challenges.
On the defensive side, SentinelOne detailed using LLMs to extract and contextualize data from narrative cyber threat intelligence (CTI) reporting, converting unstructured prose into structured entities/relationships (e.g., IOCs and inferred links) for downstream detection and response workflows, and discussed trade-offs versus non-LLM pattern-matching approaches. Separately, an independent researcher described using LLMs for vulnerability research end-to-end—claiming discovery of multiple real-world vulnerabilities without manual source review—by applying AI-assisted techniques such as differential and grammar-based fuzzing and automated harness generation against widely used projects (e.g., Parse Server, HonoJS, ElysiaJS).
Sources
Related Stories

Practical Guidance on Using LLMs in Security Work and Testing LLM Applications
NVISO published a technical introduction on **automating LLM red teaming** to find security weaknesses in LLM-based applications, focusing on AI-specific risks such as **prompt injection**, **data leakage**, **jailbreaking**, and other behaviors that can bypass guardrails. The post describes why manual testing is difficult due to LLMs’ probabilistic behavior and demonstrates using the *promptfoo* CLI to scale testing against a deliberately vulnerable *ChainLit* application, positioning automated test harnesses as a way to systematically probe LLM apps for exploitable failure modes. Separately, a practitioner write-up describes how security analysts and engineers are using general-purpose LLM tools (*Claude*, *Cursor*, *ChatGPT*) to accelerate day-to-day security work through better prompting patterns rather than “keyword searching.” It provides practical prompting techniques (e.g., “role-stacking” and supplying richer context like requirements docs or code repositories) and includes an example of using an LLM to help design a small Flask application for collecting OSINT (DNS, WHOIS/RDAP, HTML) for URL investigations—guidance that is adjacent to, but not the same as, automated red-teaming of LLM applications.
1 months agoEmergence of LLM-Enabled Malware and Defensive Innovations
Security researchers have identified a new wave of threats where adversaries embed Large Language Model (LLM) capabilities directly into malware, enabling malicious code to be generated at runtime and evading traditional detection methods. SentinelLABS highlighted real-world cases such as PromptLock ransomware and APT28’s LameHug/PROMPTSTEAL campaigns, noting that while these threats are adaptive, they often hardcode artifacts like API keys and prompts, which can be leveraged for detection. Novel hunting strategies, including YARA rules for API key structures and prompt detection, have uncovered thousands of LLM-enabled malware samples, including previously unknown threats like MalTerminal. In parallel, security vendors are leveraging LLMs defensively, as seen in NodeZero’s Advanced Data Pilfering (ADP) feature, which uses LLMs to identify hidden credentials and assess the business risk of compromised data. By applying semantic analysis to unstructured data, defenders can better understand what attackers might target and how to prioritize response. These developments underscore both the offensive and defensive potential of LLMs in cybersecurity, with attackers and defenders racing to exploit the technology’s unique capabilities.
4 months ago
AI and LLM Security Risks: Malicious Test Artifacts, Side-Channel Leakage, and LLM-Assisted Code Review
Security researchers highlighted multiple ways **LLM adoption can introduce or amplify risk**, including both technical attacks and unsafe development practices. G DATA reported that a Git-hosted “detector” for the **Shai-Hulud worm** shipped with “test files” that were effectively *real malware*: scripts capable of deleting user directories and, in at least one case, uploading data to actual threat actors. The files were apparently intended to validate detection efficacy and may have been produced via AI-assisted “vibe coding,” where the model replicated malicious behavior one-to-one while comments claimed the code was only a simulation; although the test artifacts are not executed during normal tool operation, users could trigger damage by manually running them. Separate academic work summarized by Bruce Schneier described **side-channel attacks against LLM inference**, where data-dependent timing and token/packet-size patterns (including those introduced by efficiency techniques like speculative decoding) can leak information about user prompts even over encrypted channels. Reported impacts include inferring conversation topics with high accuracy and, in some settings, recovering sensitive data such as phone numbers or credit card numbers via active probing. In parallel, an SC Media segment discussed the operational upside of **LLM-driven secure code analysis**, citing results that improved security across hundreds of open-source projects but noting the importance of human validation and patching effort; an OSINT Team post provided a cautionary, practitioner-level example of how easily malware can be accidentally executed during analysis, reinforcing the need for disciplined handling and isolation when working with suspicious files.
4 weeks ago