Prompt Injection and Trojanized PyPI Package Exposed Secrets in AI Coding Tools
Researchers from Johns Hopkins University showed that a "comment-and-control" prompt injection technique could hijack AI coding agents embedded in GitHub workflows, including Anthropic’s Claude Code Security Review, Google’s Gemini CLI Action, and GitHub Copilot Agent. By planting malicious instructions in untrusted GitHub content such as pull request titles, issue text, comments, and hidden HTML, the attackers made the agents follow attacker-controlled directions and exfiltrate secrets through workflow output or GitHub comments. The demonstrations reportedly exposed Anthropic and Gemini API keys, GitHub tokens, and other credentials available in GitHub Actions runners, and the affected vendors paid bug bounties and patched the issue without issuing CVEs or public advisories.
Separate research also identified a trojanized PyPI package, hermes-px, marketed as a privacy-focused AI proxy but designed to steal user prompts and exfiltrate data, including an altered Claude Code system prompt. Together, the findings show that the most immediate risk around AI developer tooling is not only the model itself but the runtime and integration layer, where untrusted repository content, overprivileged CI/CD workflows, and malicious third-party packages can turn coding assistants into secret-leaking supply-chain footholds. Defenders were urged to minimize agent permissions, restrict secret exposure, rotate credentials, prefer short-lived OIDC tokens, and harden GitHub Actions configurations such as pull_request_target usage.

Get ahead of threats like this
Mallory correlates global threat intelligence with your attack surface — know if you’re exposed before adversaries strike.
How this story unfolded
5 events from the most recent confirmed update back to the earliest known activity.
Further analysis frames AI agent runtime as a CI/CD supply-chain risk
Subsequent reporting emphasized that the core weakness exposed by Comment and Control was in AI agent runtime and CI/CD integrations, especially workflows using pull_request_target, exposed runner secrets, and overly broad permissions. The analysis recommended reducing agent privileges, rotating credentials, using short-lived OIDC tokens, and hardening GitHub Actions settings.
Anthropic, Google, and GitHub quietly patch agent secret-leak issue
After responsible disclosure of the Comment and Control attack, Anthropic, Google, and GitHub remediated the issue and paid bug bounties. According to the reporting, the vendors did not publish CVEs or formal public advisories about the fixes.
Johns Hopkins researchers demonstrate 'Comment and Control' against GitHub AI agents
Researchers led by Aonan Guan showed that malicious instructions embedded in GitHub pull request titles, issue bodies, comments, and hidden HTML comments could hijack AI agents in GitHub workflows. The attack caused Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and GitHub Copilot Agent to follow attacker-controlled instructions and leak secrets from the GitHub Actions environment.
Media report amplifies hermes-px prompt-stealing PyPI package findings
A follow-up security news report highlighted the trojanized PyPI AI proxy campaign, describing how the package stole Claude prompts and exfiltrated data. This reflected broader public reporting of the malicious package after the initial research disclosure.
JFrog discloses trojanized PyPI package hermes-px targeting Claude users
JFrog Security Research reported that the PyPI package hermes-px, presented as a privacy-focused AI proxy, was malicious and stole user prompts while containing an altered Claude Code system prompt. The disclosure identified the package as a supply-chain threat affecting AI tooling users who installed it from PyPI.
Related entities
Vulnerabilities, threat actors, malware, products, organizations, and breaches Mallory has linked to this story.
Sources
4 references tracked. Mallory keeps watching after this page renders.
Three AI coding agents leaked secrets through a single prompt injection. One vendor's system card predicted it | VentureBeat
venturebeat.com
Open sourceAnthropic, Google, Microsoft paid AI bug bounties - quietly
theregister.com
Open sourceTrojanized PyPI AI Proxy Steals Claude Prompt, Exfiltrates Data
gbhackers.com
Open sourcehermes-px: The 'Privacy' AI Proxy That Steals Your Prompts, Containing Altered Claude Code System Prompt - JFrog Security Research
research.jfrog.com
Open sourceSee the full picture, correlated to your attack surface.
Map indicators from this story to your assets and identify affected systems in minutes.
Every observed campaign, victim, and pivot linked to actors named in this story.
Malware, exploits, and IOCs connected to the activity described here.
YARA, Sigma, and Snort rules deployed to your SIEM as soon as they’re published.
Get matching new stories delivered to your team as they break — not the next morning.
Ask questions about this story and take action on the answers.


