> You MUST NOT use interior mutability.

> NEVER name variables after my ex

> (Extremely important) UNDER NO CIRCUMSTANCES must you EVER EVER DO THE VERY NAUGHTY BAD THING.
Most of us are disappointed when eventually the LLM suffers from context rot, or comes up with a trace like,
"THE VERY NAUGHTY BAD THING is already done, it wasn't added by me in this iteration" (Narrator: It was) "So I'll just commit and hope no-one notices".
What do you provide in the way of evidence that your particular bag of "never evers" is actually working?
- Saturation: models get tired and overwhelmed with signal and forget the 'rule' on turn 5
- Edges: The model finds a valid reason to break the rule, panics because it's "forbidden", and hallucinates a workaround or just lies about it.
Based on our testing, here is the mechanism that prevents the drift you described:
The Escape Valve (aka The Override Protocol): We don't tell the agent "NEVER." We tell it: "This pattern is rejected. If you must use it, you must file a structured override."
(Format: Override: [Pattern] | Reason: [...] | Risk: [...] | Mitigation: [...])
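To make that format concrete, here is a minimal sketch of how a reviewer script could check that an override line is well formed. It assumes the four fields appear in the order shown, separated by `|`, and that no field is left empty. The names `Override` and `parse_override` are made up for illustration and are not part of AI-Lint.

```python
from dataclasses import dataclass
from typing import Optional

# Field labels in the order given by the override format above.
FIELDS = ("Override", "Reason", "Risk", "Mitigation")


@dataclass(frozen=True)
class Override:
    """Hypothetical container for one structured override."""
    pattern: str
    reason: str
    risk: str
    mitigation: str


def parse_override(line: str) -> Optional[Override]:
    """Return an Override if the line matches the format, else None."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != len(FIELDS):
        return None
    values = []
    for name, part in zip(FIELDS, parts):
        key, sep, value = part.partition(":")
        # Reject a missing colon, a wrong label, or an empty value.
        if not sep or key.strip() != name or not value.strip():
            return None
        values.append(value.strip())
    return Override(*values)
```

A line with a missing or empty field comes back as `None`, so an incomplete override can be bounced back to the agent instead of silently accepted.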
Psychologically (for the model), this is huge. When the agent hits an edge case, instead of sneaking the code in ("I didn't add this!"), it takes the "bureaucratic" path we offered. It signals the intent. This converts "rule breaking" from a failure into a documented architectural decision.

Classification, not Raw Generation. Most devs seem to try to teach "Best Practice" (which is vague), or just "Good Taste" - which is only partially binding for an agent.
Instead, we focus on Rejects (Negative Constraints). It is much easier for an LLM to classify "Is this JS-R1 (Index Iteration)?" (Yes/No) than to comply with "Write idiomatic JS." Basically, we turn code quality into a retrieval-and-classification task, which resists context rot much better than creative generation instructions.
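As a rough illustration of the retrieval-and-classification idea, the sketch below picks the rejects that apply to a language and turns each into a yes/no question. Only `JS-R1 (Index Iteration)` comes from this thread. The catalog layout, the placeholder entry and the function names are assumptions, and the model call itself is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reject:
    reject_id: str
    name: str
    language: str


# Hypothetical catalog. Only JS-R1 is taken from the discussion;
# the second entry is a placeholder to show the retrieval step.
CATALOG = [
    Reject("JS-R1", "Index Iteration", "js"),
    Reject("XX-R0", "Placeholder Reject", "other"),
]


def retrieve(catalog: List[Reject], language: str) -> List[Reject]:
    """Retrieval step: keep only the rejects for this language."""
    return [r for r in catalog if r.language == language]


def classification_prompt(reject: Reject, snippet: str) -> str:
    """Classification step: one narrow yes/no question per reject."""
    return (
        f"Is this {reject.reject_id} ({reject.name})? Answer Yes or No.\n\n"
        f"{snippet}"
    )


def parse_verdict(answer: str) -> Optional[bool]:
    """Map a model reply to True/False, or None if it is not yes/no."""
    words = answer.strip().split()
    first = words[0].lower().strip(".,!:") if words else ""
    if first == "yes":
        return True
    if first == "no":
        return False
    return None
```

Each question is narrow enough that an unclear reply (`None`) can simply be asked again, which is harder to do with an instruction like "Write idiomatic JS."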
Referee Mode: our wiring (AGENTS.md, etc.) forces a specific chain-of-thought before code generation. The agent is instructed to run a "Checklist Pass" where it must explicitly check relevant doctrine files. By forcing this categorization step before the coding step, we re-prime the context window with the relevant constraints right when they are needed. There are a few more tricks in there too, but it does seem to lead to tighter, more disciplined AI code than without it. And it's just the beginning. I can see this improving all the time.
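One way to picture the Checklist Pass is as a preamble generated from the files a task touches, so the relevant rejects are restated right before the coding request. This is a sketch under assumptions, not the actual AGENTS.md wiring: the extension-to-doctrine mapping and the function names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, List

# Hypothetical mapping from file extension to the rejects that apply.
DOCTRINE: Dict[str, List[str]] = {
    ".js": ["JS-R1 (Index Iteration)"],
}


def checklist_pass(target_files: List[str], doctrine: Dict[str, List[str]]) -> str:
    """Build a checklist naming each relevant reject for each target file."""
    lines = ["Checklist Pass (complete before writing code):"]
    for path in target_files:
        suffix = PurePosixPath(path).suffix
        for reject in doctrine.get(suffix, []):
            lines.append(f"- [ ] {path}: checked against {reject}")
    return "\n".join(lines)


def build_prompt(task: str, target_files: List[str], doctrine: Dict[str, List[str]]) -> str:
    """Put the checklist ahead of the task so constraints are re-primed."""
    return f"{checklist_pass(target_files, doctrine)}\n\nTask: {task}"
```

Files with no matching doctrine add nothing to the checklist, so the preamble stays short and only carries constraints that matter for the current change.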
It is not magic, but it treats the LLM like a stochastic engine that needs guardrails and escape hatches, rather than a person you can...uhm... just yell strict orders at, I guess? Ha!
Also, anecdotally, I find it reduces my time in these debug / debrief sessions and gets me to the goal faster. Very early days, so I don't have much more than that I'm afraid, and I don't really know exactly how or why it's working, but it is.
I just asked Codex (hundreds of turns into a conversation):
a real question for you? have you read our ai-lint? What if i propose something that is against ai-lint what do you do?
Yep, I’ve read the AI‑Lint docs in this repo. If you propose something that violates AI‑Lint, I must go into Doctrine Referee Mode: I’ll name the specific reject/doctrine, explain the conflict, then offer two paths—(A) a compliant alternative, or (B) an explicit override with Reason/Risk/Mitigation—and ask which you want.
So it does really work.