The short version: ML model files execute code at load time. Pickle's `__reduce__` runs arbitrary Python on deserialization, and ~45% of popular HuggingFace models still use pickle (CCS 2025). Every major framework has had a deserialization CVE in the last year - PyTorch (CVSS 9.3), Keras (CVSS 9.8), ONNX (CVSS 8.8).
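For anyone who hasn't seen the mechanics: `__reduce__` lets an object tell pickle "rebuild me by calling this callable with these args," and the unpickler invokes that callable unconditionally at load time. A minimal sketch, with a harmless `print` standing in for `os.system`:

```python
import pickle

class Payload:
    def __reduce__(self):
        # Tells pickle: "reconstruct this object by calling print(...)".
        # A malicious model file returns (os.system, ("<shell cmd>",)) here.
        return (print, ("this ran during pickle.loads()",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # the callable runs here, at load time
```

No attribute access, no method call on the loaded object is needed; `pickle.loads()` itself is the execution point.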
Existing scanners use blocklists: maintain a list of known-dangerous functions and allow everything else. We kept finding gaps:
- *picklescan* (used by HuggingFace): 60+ published GHSAs. We found a CVSS 10.0 universal bypass via `pkgutil.resolve_name()`: one opcode sequence that renders the entire blocklist irrelevant.
- *fickling* (Trail of Bits): we found an opcode-handler bug where function calls vanish from the AST if you POP the result. Fickling reports `LIKELY_SAFE` on a pickle that spawns a reverse shell.
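The `resolve_name` bypass class is easy to demonstrate without reproducing the reported payload. `pkgutil.resolve_name` maps any dotted string to the object it names, so the pickle's only imported global is one innocuous-looking stdlib function; `"os"` and `"system"` appear only as string data, which import-keyed blocklists never inspect. A sketch of the pattern (not the exact opcode sequence from the GHSA):

```python
import os
import pickle
import pkgutil

class Bypass:
    def __reduce__(self):
        # The only GLOBAL in the resulting pickle is pkgutil.resolve_name;
        # the dangerous target is smuggled in as an ordinary string argument.
        return (pkgutil.resolve_name, ("os.system",))

fn = pickle.loads(pickle.dumps(Bypass()))
assert fn is os.system  # loading handed back an arbitrary callable
```

A real exploit chains a second reduce step to actually call the resolved function; blocklisting `resolve_name` itself just moves the problem to the next resolver-shaped function in the stdlib.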
We also found 4 malicious models currently on HuggingFace that bypass every scanner in their pipeline (VirusTotal, JFrog, ClamAV, picklescan, ModelScan).
ModelAudit takes the opposite approach: allowlist-first. We maintain ~1,500 individually vetted safe globals for ML frameworks, and everything else is flagged. It covers 42+ formats (not just pickle), runs entirely offline, has no ML framework deps, and produces SARIF for CI/CD.
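To make the allowlist-first idea concrete, here's a toy static scanner (not ModelAudit's implementation, and a two-entry allowlist instead of ~1,500): walk the opcode stream with `pickletools.genops`, never unpickle anything, and flag every global reference that isn't vetted.

```python
import pickletools

# Toy allowlist: real scanners vet each entry individually.
ALLOWED = {
    ("collections", "OrderedDict"),
    ("torch._utils", "_rebuild_tensor_v2"),  # typical PyTorch checkpoint entry
}

STRING_OPS = ("SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE")

def scan(blob: bytes) -> list[str]:
    """Return every global reference not on the allowlist."""
    findings = []
    ops = list(pickletools.genops(blob))
    for i, (op, arg, _pos) in enumerate(ops):
        if op.name in ("GLOBAL", "INST"):
            # Protocol 0/1 style: arg is "module name"
            module, _, name = arg.partition(" ")
            if (module, name) not in ALLOWED:
                findings.append(f"{module}.{name}")
        elif op.name == "STACK_GLOBAL":
            # Protocol 2+: module and name are the two most recent string pushes
            strings = [a for o, a, _ in ops[:i] if o.name in STRING_OPS]
            if len(strings) >= 2 and (strings[-2], strings[-1]) not in ALLOWED:
                findings.append(f"{strings[-2]}.{strings[-1]}")
    return findings

print(scan(b"cos\nsystem\n."))  # crafted GLOBAL os.system -> flagged
```

The inversion is the point: an unknown-but-safe global costs a false positive and a review; an unknown-and-malicious global on a blocklist costs an RCE.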
We filed 7 GHSAs total across fickling and picklescan through coordinated disclosure; all have been fixed by the maintainers.
MIT licensed: https://github.com/promptfoo/modelaudit
Happy to answer questions about pickle VM internals, the bypass research, or the scanner architecture.