On the issue of “are LLMs good at Lisp,” I have a bit of a tangential response/observation.
I saw this [paper](https://ai.meta.com/research/publications/logic-py-bridging-...) a while ago. Long story short, they made a Python-looking DSL that LLMs translate natural-language logic puzzles into. Then they converted the DSL expression into something a SAT/SMT solver could munch on.
My initial reaction was “why don’t they just have the LLM write smtlib2?” I guess the answer is that LLMs are probably better at writing a Python-looking stand-in for smtlib2 than smtlib2 itself. That’s probably an oversimplification of their work on my part, but I didn’t see any comparison between their approach and a direct encoding into smtlib2.
Makes me wonder if your idea could work along similar lines. Instead of using Lisp directly, could you use a DSL that looks more like a traditional language? Would that help?
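
To make the "familiar-looking surface syntax, s-expressions underneath" idea concrete, here is a rough sketch of my own. It is not the paper's DSL or pipeline: the `to_smt` and `translate` helpers are hypothetical names, and it only handles integer variables with a handful of operators. It parses a Python-looking constraint with the standard `ast` module and prints it as smtlib2 text.

```python
import ast

# Python AST operator types mapped to SMT-LIB 2 operator names.
_OPS = {
    ast.And: "and", ast.Or: "or",
    ast.Add: "+", ast.Sub: "-", ast.Mult: "*",
    ast.Eq: "=", ast.NotEq: "distinct",
    ast.Lt: "<", ast.LtE: "<=", ast.Gt: ">", ast.GtE: ">=",
}


def to_smt(node):
    """Render a small subset of Python expressions as an SMT-LIB 2 term."""
    if isinstance(node, ast.Expression):
        return to_smt(node.body)
    if isinstance(node, ast.BoolOp):  # a and b and c, a or b
        args = " ".join(to_smt(v) for v in node.values)
        return "(%s %s)" % (_OPS[type(node.op)], args)
    if isinstance(node, ast.UnaryOp):
        if isinstance(node.op, ast.Not):
            return "(not %s)" % to_smt(node.operand)
        if isinstance(node.op, ast.USub):
            return "(- %s)" % to_smt(node.operand)
    if isinstance(node, ast.BinOp):  # a + b, a - b, a * b
        return "(%s %s %s)" % (
            _OPS[type(node.op)], to_smt(node.left), to_smt(node.right))
    if isinstance(node, ast.Compare):
        # Python allows chains such as 1 <= x <= 3; expand them pairwise.
        operands = [node.left] + node.comparators
        parts = [
            "(%s %s %s)" % (_OPS[type(op)], to_smt(lhs), to_smt(rhs))
            for op, lhs, rhs in zip(node.ops, operands, operands[1:])
        ]
        return parts[0] if len(parts) == 1 else "(and %s)" % " ".join(parts)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        # Check bool first, because bool is a subclass of int in Python.
        if isinstance(node.value, bool):
            return "true" if node.value else "false"
        if isinstance(node.value, int):
            return str(node.value)
    raise NotImplementedError(ast.dump(node))


def translate(constraint, int_vars):
    """Build a tiny SMT-LIB 2 script: declarations, one assert, check-sat."""
    tree = ast.parse(constraint, mode="eval")
    lines = ["(declare-const %s Int)" % name for name in int_vars]
    lines.append("(assert %s)" % to_smt(tree))
    lines.append("(check-sat)")
    return "\n".join(lines)


# Made-up puzzle constraint, purely for illustration.
script = translate(
    "alice != bob and 1 <= alice <= 3 and bob == alice + 1",
    ["alice", "bob"],
)
print(script)
```

The assert line comes out as `(assert (and (distinct alice bob) (and (<= 1 alice) (<= alice 3)) (= bob (+ alice 1))))`, and the whole script should be something you can feed to an SMT solver. The point is just that the LLM-facing side can look like ordinary Python while the solver-facing side stays s-expressions; whether that actually makes LLMs more accurate is the comparison I'd like to see.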