I’m working on an open-source tool, Veritensor: https://github.com/arsbr/Veritensor
The goal is to help teams secure the AI/ML supply chain as models, datasets, and tooling increasingly come from third parties.
What it currently does:

- Detects malicious code, hidden payloads, and unsafe operations inside ML models (e.g. Pickle, PyTorch, Keras) using static analysis and a custom execution engine (a rough sketch of the pickle-scanning idea follows this list)
- Verifies model integrity and detects tampering and other supply-chain risks
- Scans datasets for data poisoning, anomalies, and potential PII leaks
- Analyzes documents used in RAG pipelines (PDF, DOCX, PPTX) for prompt injection and embedded threats
- Inspects Jupyter notebooks for unsafe code, secrets, and risky patterns
- Signs container images using Sigstore Cosign
- Integrates into CI/CD pipelines and ML validation workflows
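To make the first bullet concrete, here is a minimal sketch of the kind of static check involved: walking a pickle stream's opcodes without ever deserializing it and flagging the ones that can import or call arbitrary code on load. This is an illustration of the general technique only, not Veritensor's actual scanner; the opcode list, function names, and output format are my own. (A PyTorch .pt checkpoint is a ZIP archive, so its embedded data.pkl member would be extracted and scanned the same way.)

```python
import pickletools
from pathlib import Path

# Opcodes that can import or execute code when a pickle is loaded:
# GLOBAL / STACK_GLOBAL resolve arbitrary callables by name,
# REDUCE / INST / OBJ / NEWOBJ / NEWOBJ_EX invoke them.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}

def scan_pickle(path: str) -> list[tuple[int, str, object]]:
    """Return (byte offset, opcode name, argument) for risky opcodes,
    without unpickling the file."""
    findings = []
    data = Path(path).read_bytes()
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            findings.append((pos, opcode.name, arg))
    return findings

if __name__ == "__main__":
    import sys
    for pos, name, arg in scan_pickle(sys.argv[1]):
        print(f"offset {pos}: {name} {arg!r}")
```

In practice the hard part is what comes after this: resolving which module and attribute a GLOBAL actually refers to and allowlisting known-safe constructors (e.g. tensor/storage rebuilders) so legitimate checkpoints don't drown the report in noise.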
This is an early-stage project and very much a work in progress. It does not aim to replace runtime sandboxing, isolation, or human review, and it's not intended to be a silver bullet.
I’m interested in feedback from people running ML systems in production:

- What parts of the AI supply chain are you most concerned about today?
- Are there checks or threat models you feel are missing here?
- Which parts of this approach seem flawed, incomplete, or unlikely to work in production?
- Would a tool like this be useful in your production workflows, or would it be hard to adopt in practice?
- Any suggestions on how to improve the project or make it more practical for real-world use would be really appreciated.
Thanks for your time!