I built pg-obfuscate to solve this specific problem.
It’s a CLI tool that:

- Connects directly to Postgres
- Obfuscates selected tables/columns based on a YAML config
- Uses deterministic rules so relationships and shapes are preserved
- Supports dry-run vs execute modes
- Is designed for safely sharing production-like datasets across environments
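
To show why determinism matters for preserving relationships, here is a minimal Python sketch of the general idea. It is not pg-obfuscate's actual code: the keyed-HMAC approach, the `SECRET_KEY` value, and the `pseudonymize` / `obfuscate_email` helpers are all assumptions made up for illustration. The point is only that the same input always maps to the same output, so a value that appears in two tables still matches after obfuscation.

```python
import hashlib
import hmac

# Hypothetical secret for this sketch only; a real key must stay private,
# otherwise the mapping can be brute-forced for low-entropy values.
SECRET_KEY = b"example-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY, length: int = 12) -> str:
    """Map a value to a stable pseudonym: same input and key, same output."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def obfuscate_email(email: str, key: bytes = SECRET_KEY) -> str:
    """Replace an email with a fake one while keeping a valid email shape."""
    return f"user_{pseudonymize(email.lower(), key)}@example.com"


# Two illustrative "tables" that reference the same person by email.
users = [{"id": 1, "email": "alice@corp.com"}]
orders = [
    {"id": 10, "customer_email": "alice@corp.com"},
    {"id": 11, "customer_email": "bob@corp.com"},
]

masked_users = [{**u, "email": obfuscate_email(u["email"])} for u in users]
masked_orders = [
    {**o, "customer_email": obfuscate_email(o["customer_email"])} for o in orders
]

# The join on email still works after obfuscation.
matches = [
    o["id"] for o in masked_orders
    if o["customer_email"] == masked_users[0]["email"]
]
print(matches)
```

Random replacement values would break that join. A keyed hash keeps it intact, though anyone holding the key can recompute the mapping for guessable inputs, so the key has to be treated as sensitive.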
Example use cases:

- Share a realistic dataset with contractors
- Reproduce bugs locally without leaking real data
- Sanitize a database before exporting it
It’s Postgres-only for now and intentionally narrow in scope.
The project is open source under AGPLv3+, with a commercial license available for companies that can’t use AGPL.
Repo: https://github.com/Ofsen/pg-obfuscate
I’m mainly looking for feedback on:

- Safety assumptions
- Edge cases I might be missing
- Whether this overlaps with existing tools I overlooked
Thank you!