The high level declarative nature and type driven development style of languages like Haskell also make it really easy for an experienced developer to review and validate the output of the LLM.
Early on in the GPT era I had really bad experiences generating Haskell code with LLMs but I think that the combination of improved models, increased context size, and agentic tooling has allowed LLMs to really take advantage of functional programming.
But it could be that different programming languages are a bit like different human languages for these models: when they have more than some threshold of training data, they can express their general problem solving skills in any of them? And then it's down to how much the compiler and linters can yell at them.
For Rust, I regularly tell them to make `clippy::pedantic` happy (and tell me explicitly when they think that the best way to do that is via an explicit ignore annotation in the code to disable a certain warning for a specific line).
Pedantic clippy is usually too.. pedantic for humans, but it seems to work reasonably well with the agents. You can also add clippy::cargo which ain't included in clippy::pedantic.
I think this is exactly right.
But had never considered that a programming language might be created thats less human readable/auditable to enable LLMs.
Scares me a bit.
I am not sure token efficiency is an interesting problem in the long term, though.
And in the short term I wonder if prompts could be pre-compiled to “compressed tokens”; the idea would be to use a smaller number of tokens to represent a frequently needed concept; kind of like LZ compression. Or maybe token compression becomes a feature of future models optimized for specific tasks.
I was wondering last year if it would be worthwhile trying to create a language that was especially LLM-friendly, eg that embedded more context in the language structure. The idea is to make more of the program and the thinking behind it, explicit to the LLM but in a programming language style to eliminate the ambiguity of natural language (one could just use comments).
Then it occurred to me that with current LLM training methodology that there’s a chicken-and-egg problem; it doesn’t start to show rewards until there is a critical mass of good code in the language for LLMs to train on.
So I'm not convinced this is either the right metric, or even if you got the right metric that it's a metric you want to minimize.
But I would love for more expressive and compact languages to do better, selfish as I am. But I think training data size is more of a factor, and we won’t be all moving up Clojure any time soon.
Because that’s what happened in the real world when generating a bunch of untyped Python code.
E.g. when it comes to authoring code, C, which comes language, is by far one of the languages that LLMs excel most at.
Claude Code makes some efforts to reduce context size, but at the end of the day is loading entire source files into context (then keeping them there until told to remove them, or context is compacted). One of the major wins is to run subagents for some tasks, that use their own context rather than loading more into CCs own context.
Cursor makes more efficient use of context by building a vector database of code chunks, then only loading matching chunks into context (I believe it does this for Composer/agentic use as well as for tab/autocomplete).
One of the more obvious ways to reduce context use in a larger multi-module codebase would be to take advantage of the split between small module definition (e.g. C++ .h files) and large module implementations (.cpp files). Generally you'd only need to load module interfaces/definitions into context if you are working on code that uses the module, and Cursor's chunked approach can reduce that further.
For whole codebase overview a language server can help locate things, and one could use the AI to itself generate shortish summaries/overviews of source files and the codebase and structure, similar to what a human developer might keep in their head, rather than repeatedly reading entire source files for code that isn't actually being modified.
It seems we're really in the early days of agentic coding tools, and they have a lot of room to get better and more efficient.
If you're interested in learning more, https://github.com/sibyllinesoft/scribe