Python tarfile extraction filter symlink escape
CVE-2025-4330 is a high-severity vulnerability in Python's tarfile module. When untrusted tar archives are extracted with TarFile.extractall() or TarFile.extract() and the filter parameter is set to "data" or "tar", the intended extraction protections can be bypassed. This allows symlink targets to point outside the destination directory and permits modification of some file metadata during extraction. The issue affects the tarfile extraction filter functionality introduced in Python 3.12. Applications on Python 3.14 and later are also affected if they rely on the default filter, which changed in 3.14 from no filtering to "data".
Exploits
1 valid exploit after Mallory filtered fakes, detection scripts, and README-only repos.
Repository contains a working exploit generator for CVE-2025-4517 / CVE-2025-4330 affecting CPython’s tarfile.extractall(filter='data') safety checks. It provides two equivalent implementations (CVE-2025-4517.py and CVE-2025-4517.go) that generate a malicious tar archive (default: backup_99.tar).

The exploit abuses a PATH_MAX (4096) overflow edge case in os.path.realpath(strict=False): when the resolved path becomes longer than PATH_MAX, realpath stops resolving symlinks and falls back to string-based path manipulation, causing Python’s tarfile “data” filter to incorrectly conclude a symlink remains within the extraction directory.

Exploit structure (both implementations):
(1) Create 16 nested long-name directories plus 16 short 1-character symlinks (a–p) pointing to the long directories, so the logical path stays short while the resolved path grows.
(2) Add a 254-character symlink name at the end of the short chain that points back up 16 levels (".." x 16), pushing the resolved path over PATH_MAX.
(3) Create an 'escape' symlink whose link target traverses the short chain and then includes additional '..' components (DEPTH_TO_ROOT) intended to reach '/', which Python mis-validates due to the realpath fallback.
(4) Add a regular file entry at 'escape/<TARGET_FILE>' so extraction follows the OS-resolved symlink to write to '/<TARGET_FILE>' (default /root/.ssh/authorized_keys) with attacker-controlled content (default SSH public key placeholder).

No network scanning or callbacks are present; the attack vector is delivery of a crafted tar to a workflow that extracts it with a vulnerable Python version. README.md documents affected versions, mechanics, configuration knobs (DEST_DIR/DEPTH_TO_ROOT/TARGET_FILE/PAYLOAD/OUTPUT), and expected post-exploitation outcome (e.g., root SSH access if extraction runs as root).
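The length arithmetic behind steps (1) and (2) can be sketched without touching the filesystem. In this illustration the function path_lengths is hypothetical, and the 247-character directory name and the empty extraction-directory prefix are assumptions chosen for the example; only the 16 levels, the 254-character symlink name, and the PATH_MAX value of 4096 come from the description above.

```python
PATH_MAX = 4096  # limit quoted in the exploit description


def path_lengths(levels=16, long_name_len=247, final_name_len=254, dest_dir_len=0):
    """Return (logical_len, resolved_len) for the path of the final symlink.

    long_name_len and dest_dir_len are illustrative assumptions.
    """
    # Logical path: one 1-character symlink per level, each followed by "/",
    # then the long final symlink name.
    logical = levels * 2 + final_name_len
    # Resolved path: every short symlink expands to a long directory name
    # (plus its "/"), followed by the same final name.
    resolved = dest_dir_len + levels * (long_name_len + 1) + final_name_len
    return logical, resolved


logical, resolved = path_lengths()
print(logical, resolved, resolved > PATH_MAX)  # 286 4222 True
```

Under these assumptions the path written in the archive is only 286 characters long, while its fully resolved form is 4222 characters, which is past the 4096 limit at which, per the description, realpath stops resolving symlinks.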
Recent activity
17 sources tracked across advisories, community write-ups, and news. New activity surfaces here as Mallory finds it.