It makes sense to do replay before prevention. For an agent, the challenge is often not recognizing that a bad action occurred, but reconstructing the complete decision context that led to it.
I'm wondering how you handle replay if the retrieved context or external state has changed since the original run.