phi-redactor is a reverse proxy that sits between your app and OpenAI/Anthropic, automatically detecting and masking all 18 HIPAA Safe Harbor identifiers before they leave your network.
How it works:
- Request hits the proxy → Presidio + spaCy detect PHI (names, SSNs, MRNs, dates, etc.)
- PHI replaced with clinically coherent fakes (John Smith → Robert Chen, not [REDACTED])
- Sanitized request forwarded to the LLM
- Response comes back → synthetic tokens rehydrated to originals
- Your app sees the real data, the LLM never does
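
To make the round trip above concrete, here is a minimal sketch of the mask-then-rehydrate idea. It is not phi-redactor's actual code: the regex is a stand-in for the Presidio + spaCy detection step, it only handles SSN-shaped strings, and `SessionVault`, `mask`, and `rehydrate` are hypothetical names chosen for illustration.

```python
import re

# Stand-in detector (assumption): the real project uses Presidio + spaCy.
# This regex only matches US SSN-shaped strings.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class SessionVault:
    """Hypothetical in-memory mapping between original and synthetic values."""

    def __init__(self):
        self.orig_to_fake = {}
        self.fake_to_orig = {}
        self._counter = 0

    def fake_for(self, original):
        # Reuse the same fake for a repeated value so the text stays coherent.
        if original not in self.orig_to_fake:
            self._counter += 1
            # Area numbers 900-999 are not issued as SSNs, so fakes in this
            # range cannot collide with a real one.
            fake = f"900-00-{self._counter:04d}"
            self.orig_to_fake[original] = fake
            self.fake_to_orig[fake] = original
        return self.orig_to_fake[original]


def mask(text, vault):
    """Replace every detected identifier with its synthetic counterpart."""
    return SSN_RE.sub(lambda m: vault.fake_for(m.group(0)), text)


def rehydrate(text, vault):
    """Swap synthetic values in the LLM response back to the originals."""
    for fake, original in vault.fake_to_orig.items():
        text = text.replace(fake, original)
    return text


vault = SessionVault()
outbound = mask("Patient SSN 123-45-6789, please confirm 123-45-6789.", vault)
inbound = rehydrate("Confirmed SSN 900-00-0001 on file.", vault)
```

Here `outbound` is what would be forwarded to the LLM and `inbound` is what the app would see. Keeping the mapping per session means a repeated identifier always gets the same fake, which is what lets the response be reversed reliably.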
Technical details:
- 10 custom Presidio recognizers including FHIR/HL7v2 support
- Fernet-encrypted SQLite vault for mapping persistence
- SSE streaming with buffer-based rehydration
- Tamper-evident hash-chain audit trail
- 256 tests passing
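
For readers unfamiliar with the hash-chain audit trail mentioned above, a rough sketch of the general technique follows. The entry layout, the all-zero genesis value, and the function names (`append_entry`, `verify_chain`) are assumptions for illustration and may not match how phi-redactor stores its log.

```python
import hashlib
import json

# Assumed starting value for the first entry's prev_hash.
GENESIS = "0" * 64


def _digest(prev_hash, event):
    # Serialize deterministically so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain, event):
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    )


def verify_chain(chain):
    """Recompute every hash; an edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"action": "redact", "entity": "US_SSN", "count": 1})
append_entry(log, {"action": "rehydrate", "count": 1})
ok_before = verify_chain(log)

log[0]["event"]["count"] = 5  # simulate tampering with an earlier entry
ok_after = verify_chain(log)
```

Note that the log records only event metadata, never the PHI itself. A chain like this detects edits only as long as the latest hash is also recorded somewhere the attacker cannot rewrite, since anyone who can replace the whole log could recompute every hash.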
Apache-2.0. Feedback welcome — especially from anyone doing healthcare AI in production.