That is, is there any current LLM where you could prompt it with something like an interview question and it would give you perfect code one shot or zero shot?
It might be better to host an LLM and craft the questions such that you know that the LLM will screw it up. Or, since you're self-hosting, you can have the system prompt, say insert subtle bugs that aren't syntax errors. I'll have to test if that actually does anything.