Mallory
High | Public exploit

Python tarfile extraction filter symlink escape

Identifiers: CVE-2025-4330, CWE-59

CVE-2025-4330 is a high-severity vulnerability in Python's tarfile module. When untrusted tar archives are extracted with TarFile.extractall() or TarFile.extract() and the filter parameter is set to "data" or "tar", the intended extraction protections can be bypassed: symlink targets can point outside the destination directory, and some file metadata can be modified during extraction. The issue affects the extraction filter functionality introduced in Python 3.12. Python 3.14 and later are also affected when applications rely on the default filter, which changed from no filtering to "data".
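
As a rough illustration of the version-dependent behavior described above, the sketch below reports whether the running interpreter exposes extraction filters and whether an extract call without an explicit filter would fall back to "data". The function name tarfile_filter_exposure and the returned keys are hypothetical, chosen for this example. The check describes only the feature surface and cannot tell whether a given build already carries a fix.

```python
import sys
import tarfile


def tarfile_filter_exposure(version_info=sys.version_info):
    """Summarise how an interpreter treats tarfile extraction filters.

    Hypothetical helper: it cannot detect whether this build has been
    patched for CVE-2025-4330, only which filter behavior applies.
    """
    # The presence of tarfile.data_filter signals extraction filter support.
    has_filters = hasattr(tarfile, "data_filter")
    # Per the advisory, 3.14 changed the default from no filtering to "data".
    default_is_data = tuple(version_info[:2]) >= (3, 14)
    return {
        "has_filters": has_filters,
        "default_filter": "data" if default_is_data else "none",
        "relies_on_filter_if_unspecified": has_filters and default_is_data,
    }


if __name__ == "__main__":
    print(tarfile_filter_exposure())
```

Passing an explicit version tuple, for example tarfile_filter_exposure((3, 14, 0)), makes it possible to reason about a deployment target other than the interpreter running the check.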

ANALYST BRIEF

Impact, mitigation & remediation

What it means. What to do now. Patch path, mitigations, and the assume-compromise checklist.

Impact

What an attacker gets, and what they’ve been doing with it.

Successful exploitation lets an attacker bypass the tarfile extraction safeguards meant to confine archive contents to the chosen extraction directory. A crafted archive can make symlink targets escape the destination directory, leading to unauthorized writes outside the extraction path and an integrity impact on the host filesystem. Some file metadata can also be modified during extraction. The cited CVSS vector reflects high integrity impact, with no significant confidentiality or availability impact.

Mitigation

If you can’t patch tonight, do this now.

Do not extract untrusted tar archives with the affected tarfile filtering behavior until patched. As a workaround, reject archive links containing parent-directory segments such as ".." before extraction, and more generally validate or block symlinks and hardlinks in untrusted archives. If extraction is unavoidable, implement strict canonical-path validation for all archive members and link targets and use a hardened extraction routine that prevents writes outside the intended destination directory.
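
One possible shape for such a hardened routine is sketched below. It is a minimal example under stated assumptions, not a vetted replacement for a patched interpreter: it refuses every symlink and hardlink rather than trying to validate them, rejects absolute paths and ".." segments, and confirms each member resolves inside the destination. The names UnsafeArchiveError, check_members, and extract_untrusted are hypothetical. The sketch assumes extraction into a fresh directory that contains no pre-existing symlinks.

```python
import os
import tarfile


class UnsafeArchiveError(Exception):
    """Raised when an archive member fails the pre-extraction checks."""


def check_members(tar, dest_dir):
    """Return all members of `tar` if every one passes, else raise."""
    dest_root = os.path.realpath(dest_dir)
    members = tar.getmembers()
    for member in members:
        # Refuse links outright instead of trying to validate their targets.
        if member.issym() or member.islnk():
            raise UnsafeArchiveError(f"link member refused: {member.name!r}")
        # Allow only regular files and directories (no devices, FIFOs, ...).
        if not (member.isfile() or member.isdir()):
            raise UnsafeArchiveError(f"special member refused: {member.name!r}")
        parts = member.name.replace("\\", "/").split("/")
        if member.name.startswith(("/", "\\")) or os.path.isabs(member.name):
            raise UnsafeArchiveError(f"absolute path refused: {member.name!r}")
        if ".." in parts:
            raise UnsafeArchiveError(f"parent segment refused: {member.name!r}")
        # Canonical-path check: the resolved target must stay under dest_root.
        target = os.path.realpath(os.path.join(dest_root, member.name))
        if os.path.commonpath([dest_root, target]) != dest_root:
            raise UnsafeArchiveError(f"escapes destination: {member.name!r}")
    return members


def extract_untrusted(archive_path, dest_dir):
    """Validate every member first, then extract only the validated list."""
    os.makedirs(dest_dir, exist_ok=True)
    # Keep the built-in filter as a second layer where the interpreter has it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path) as tar:
        members = check_members(tar, dest_dir)
        tar.extractall(dest_dir, members=members, **extra)
```

Because validation runs over the whole archive before anything is written, a single rejected member leaves the destination directory empty. Refusing links entirely is stricter than many workflows need, but it sidesteps the link-resolution logic that this vulnerability targets.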

Remediation

Patch, then assume compromise.

Upgrade to a fixed Python release as soon as your vendor makes one available, and apply the relevant security updates for affected downstream products. Review any code paths that extract untrusted tar archives with TarFile.extractall() or TarFile.extract(), especially where filter="data" or filter="tar" is used or where the Python 3.14+ default filtering is relied upon.
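
To support that code review, a small static scan can list candidate call sites. The sketch below uses Python's ast module to find method calls named extract or extractall and to report the filter argument, if any. The function name find_tar_extract_calls is hypothetical. The match is by method name only, so it is a heuristic: it will also flag unrelated objects, such as zipfile archives, and will miss calls made through aliases or getattr.

```python
import ast


def find_tar_extract_calls(source, filename="<string>"):
    """Return sorted (lineno, method, filter_source) tuples for extract calls.

    Heuristic: matches any `<obj>.extract(...)` or `<obj>.extractall(...)`,
    whether or not `<obj>` is actually a tarfile.TarFile.
    """
    tree = ast.parse(source, filename=filename)
    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not isinstance(func, ast.Attribute):
            continue
        if func.attr not in {"extract", "extractall"}:
            continue
        filter_source = "<not passed>"
        for keyword in node.keywords:
            if keyword.arg == "filter":
                # Show the argument as written, e.g. 'data' or a variable name.
                filter_source = ast.unparse(keyword.value)
        findings.append((node.lineno, func.attr, filter_source))
    return sorted(findings)


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for lineno, method, flt in find_tar_extract_calls(handle.read(), path):
                print(f"{path}:{lineno}: {method}() filter={flt}")
```

Call sites reported with filter 'data' or 'tar', and on Python 3.14+ those reported as "<not passed>", are the ones the review guidance above singles out.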
PUBLIC EXPLOITS

Exploits

1 valid exploit after Mallory filtered fakes, detection scripts, and README-only repos.

VALID 1 / 1 TOTAL
CVE-2025-4517-tarfile-PATH_MAX-bypass | Maturity: PoC | Verified exploit

Repository contains a working exploit generator for CVE-2025-4517 / CVE-2025-4330 affecting CPython’s tarfile.extractall(filter='data') safety checks. It provides two equivalent implementations (CVE-2025-4517.py and CVE-2025-4517.go) that generate a malicious tar archive (default: backup_99.tar).

The exploit abuses a PATH_MAX (4096) overflow edge case in os.path.realpath(strict=False): when the resolved path becomes longer than PATH_MAX, realpath stops resolving symlinks and falls back to string-based path manipulation, causing Python’s tarfile “data” filter to incorrectly conclude a symlink remains within the extraction directory.

Exploit structure (both implementations):
(1) create 16 nested long-name directories plus 16 short 1-character symlinks (a–p) pointing to the long directories, so the logical path stays short while the resolved path grows;
(2) add a 254-character symlink name at the end of the short chain that points back up 16 levels (".." x 16), pushing the resolved path over PATH_MAX;
(3) create an 'escape' symlink whose link target traverses the short chain and then includes additional '..' components (DEPTH_TO_ROOT) intended to reach '/', which Python mis-validates due to the realpath fallback;
(4) add a regular file entry at 'escape/<TARGET_FILE>' so extraction follows the OS-resolved symlink to write to '/<TARGET_FILE>' (default /root/.ssh/authorized_keys) with attacker-controlled content (default SSH public key placeholder).

No network scanning or callbacks are present; the attack vector is delivery of a crafted tar to a workflow that extracts it with a vulnerable Python version. README.md documents affected versions, mechanics, configuration knobs (DEST_DIR/DEPTH_TO_ROOT/TARGET_FILE/PAYLOAD/OUTPUT), and expected post-exploitation outcome (e.g., root SSH access if extraction runs as root).
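
The archive layout described above leaves recognizable traces that can be checked without extracting anything. The following sketch is a defensive triage aid only: it reads member headers and reports the symlink count, link targets containing "..", unusually long path components, and members stored beneath a symlink that the same archive defines (the 'escape/<TARGET_FILE>' pattern). The function name triage_archive and the report keys are hypothetical, and the indicators are heuristics, since benign archives can also contain symlinks.

```python
import tarfile


def triage_archive(archive_path):
    """Collect simple indicators from a tar archive without extracting it."""
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()

    report = {
        "symlinks": 0,
        "hardlinks": 0,
        "links_with_dotdot": [],
        "members_under_symlink": [],
        "longest_component": 0,
    }
    symlink_names = set()

    for member in members:
        parts = member.name.rstrip("/").split("/")
        # Very long components hint at attempts to inflate the resolved path.
        report["longest_component"] = max(
            report["longest_component"], max(len(part) for part in parts)
        )
        if member.issym():
            report["symlinks"] += 1
            symlink_names.add("/".join(parts))
        elif member.islnk():
            report["hardlinks"] += 1
        is_link = member.issym() or member.islnk()
        if is_link and ".." in member.linkname.split("/"):
            report["links_with_dotdot"].append(member.name)

    for member in members:
        parts = member.name.rstrip("/").split("/")
        # Every proper ancestor path of this member, e.g. "a", "a/b", ...
        prefixes = ("/".join(parts[:i]) for i in range(1, len(parts)))
        if any(prefix in symlink_names for prefix in prefixes):
            report["members_under_symlink"].append(member.name)

    return report
```

A non-empty members_under_symlink list is the strongest of these signals, because it means extracting one member would write through a link created by another member of the same archive.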

0xDTC | Disclosed Feb 15, 2026 | go, python | local (malicious archive) / supply-chain style: attacker provides crafted tar; exploitation occurs when a vulnerable Python process extracts it with tarfile.extractall(filter='data')
EXPOSURE SURFACE

Affected products & vendors

Products and vendors Mallory has correlated with this vulnerability. Open in Mallory to drill down to specific CPE configurations and version ranges.

Vendor | Product | Type
Canonical | Noble-Stemcell-Openstack | operating_system

Vendor-confirmed product mapping. Mallory continuously reconciles this list against your asset inventory.
