Go is an excellent language for LLM code generation. There exists a large stable training corpus, one way to write it, one build system, one formatter, static typing, CSP concurrency that doesn't have C++ footguns.
The language hasn't had a breaking version in over a decade. There's minimal framework churn. When I advise teams on adopting agentic coding workflows at my consultancy [0], Go delivers highly consistent results with Claude and Codex, and does so more often than what I see with clients using TypeScript and/or Python.
When LLMs have to navigate Python and TypeScript there is a massive combinatorial space of frameworks, typing approaches, and utility libraries.
Too much optionality in the training distribution. The output is high entropy and doesn't converge. Python only dominated early AI coding because ML researchers write Python and trained on Python first. It was path dependence, not merit.
The thing nobody wants to say is that the reason serious programmers historically hated Go is exactly why LLMs are great at it: There's a ceiling on abstraction.
Go has many many failings (e.g. it took over a decade to get generics). But LLMs don't care about expressiveness, they care about predictability. Go 1.26 just shipped a completely rewritten go fix built on the analysis framework that does AST-level refactoring automatically. That's huge for agentic coding because it keeps codebases modern without needing the latest language features in training data or wasting tokens looking up new signatures.
I spent four years building production public key infrastructure in Golang before LLMs [1]. After working with coding agents like everyone else and domain-switching for clients, I've become more of a Go advocate because the language finally delivers on its promise. Engineers have a harder time complaining about the verbose, boilerplate-heavy syntax when an LLM writes it correctly every single time.
It's an even more popular language with even more training data and also has a better type system so more validation on LLM output, etc.
Meanwhile Go already had a language change, while being less than half its age (loop variable capture).
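For readers who haven't run into it, the loop variable change referred to here landed in Go 1.22: each iteration of a `for` loop now gets its own copy of the loop variable (for modules that declare `go 1.22` or later). A minimal sketch of the older idiom that the change made redundant; the function name `collectClosures` is made up for illustration:

```go
package main

import "fmt"

// collectClosures returns one closure per element of values.
// The explicit copy (v := v) gives every closure its own variable. Before
// Go 1.22 this copy was required to avoid all closures sharing one variable;
// from 1.22 on it is redundant but harmless.
func collectClosures(values []int) []func() int {
	var fns []func() int
	for _, v := range values {
		v := v // per-iteration copy
		fns = append(fns, func() int { return v })
	}
	return fns
}

func main() {
	for _, fn := range collectClosures([]int{10, 20, 30}) {
		fmt.Println(fn())
	}
}
```

Written this way, the code behaves the same on old and new toolchains, which is roughly why the change broke so little in practice.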
Newer features fit very nicely and didn't increase the language surface (records are just a normal class with some methods auto-generated, while sealed types are just a restriction on who can subtype an interface -- and yet these give full ADT support for the language that improves readability and type safety).
I personally think neither Go nor Java would be good for "agents". Better to have them sandboxed in WASM.
Of course writing a language that compiles to Wasm is certainly a way, but you would also have to sandbox all the other tools that are used during development (e.g. agents can just call grep/find/etc).
It really felt like using AI tooling of a year or two ago. It wasn’t understanding my prompts, it was going on tangents, and it wasn't following the existing style and idioms. Maybe Claude was hungover or doesn’t like Mondays, but the contrast with Go was surprising.
One example is that I wanted to add an extra prometheus metric to keep track of an edge case in some for loop. All it had to do was define a counter and increment it. For some reason it would define the counter on the line right before incrementing it, instead of defining it next to the other counters outside of the for loop. Technically not wrong (defining a counter is idempotent), but who does that? Especially when the other counters are defined elsewhere in the same function?
Anyway, n=1 but I feel it has an easier time with Go.
My n=1 is that it is pretty good with Java, on par with other popular languages like Python and JS, in line with these 3 probably being a good chunk if not the majority of training data.
Do you think you might perhaps have a bias in the same way that my 9+ years of Typescript usage and advocacy would cause me to have a bias or a material interest?
There is nothing non-trivial you can make that involves the web that is better with Go than Typescript. I look at your personal page and I see that you're already struggling to manage state and css and navigation, or that those things aren't interesting to you.
This tells me you have limited web experience, just as I have limited experience making build scripts at Google and you would probably find my server-side concurrency fairly crude.
Still, the fact that you lump Python and Typescript together as "equally frustrating for LLMs" tells me you are not speaking out of direct experience. Lumping Typescript in with Python feels really, empirically wrong to me as someone with a foot in both those worlds.
> When LLMs have to navigate Python and TypeScript there is a massive combinatorial space of frameworks, typing approaches, and utility libraries.
I'm right there with you with Python! Lumping together static and dynamic languages is not correct here. Most Python code is from a fragmented ecosystem that took 10+ years to migrate from 2 to 3, often there is no indication in the corpus of even what major version it is, and typing caught on very slowly. That's going to be a major problem for a long time, whereas no recent LLM has ever confused .js for .ts or suddenly started writing Node v12 and Angular into a Node 22 and Vue project.
I'm happy to throw down the gauntlet if you ever want to have a friendly go vs typescript vibe-code off that spans a reasonably sophisticated full-stack project over three or four hours of live coding.
If you feel like I'm a mean person and attacking you for wanting proof that Typescript is not at parity or superior to Go in terms of LLM legibility, I still would really like you to consider how you can demonstrate your virtuosity and value judgements best.
Python doesn’t need path dependence to prove its merit. There’s a reason why it is one of the major programming languages and was number one for a while.
I think this is true, but it misses a very key point. Go does an impressively bad job at designing APIs that are difficult to misuse, so LLMs will misuse them, and you will also need to write unit tests that walk through the code just to validate that it used the libraries correctly. This isn't always possible (or is awkward/cumbersome) for certain scenarios like database queries.
All of the reasons people argue Go is good for LLMs are more true for Rust. You and the LLM can design libraries to be difficult to misuse, and then get instant feedback from the compiler to the LLM about what it did wrong, and often with suggestions about how it should fix them! This also makes RL deriving from compiler feedback more effective.
This allows the LLMs to reason more abstractly at larger scales, since the abstractions are less leaky (unlike in Go). The ceiling on abstraction screws you here, since troubleshooting requires more deep diving. It's the same reason Go projects become difficult for humans at large scales, too.
With Go, async code written in Go 1.0 compiles and runs the same in Go 1.26, and there is no fragmentation or necessity to reach for third party components.
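To make the stability claim concrete, here is a small sketch that sticks to concurrency primitives that have been in the language and standard library since Go 1.0 (goroutines, unbuffered channels, `sync.WaitGroup`, `range` over a channel). The function `sumSquares` is a made-up example, not taken from any project mentioned in this thread:

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares starts one goroutine per input, sends each squared value over a
// channel, and adds the results up on the receiving side.
func sumSquares(nums []int) int {
	results := make(chan int)
	var wg sync.WaitGroup

	for i := 0; i < len(nums); i++ {
		wg.Add(1)
		// Pass the value as an argument so each goroutine has its own copy.
		go func(n int) {
			defer wg.Done()
			results <- n * n
		}(nums[i])
	}

	// Close the channel once every worker has sent its result, so the
	// range loop below terminates.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for sq := range results {
		total += sq
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}))
}
```

Nothing here needs a third-party runtime or an async framework, which is the point being made about fragmentation.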
Setting aside the problems of wrong but compiling code. Wrong and non-compiling code is also much easier to deal with. For training an LLM, you have an objective fitness function to detect compilation errors.
For using an LLM, you can embed the LLM itself in a larger system that checks its output and either re-rolls on errors, or invokes something to fix the errors.
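A rough sketch of that check-and-re-roll loop, using only the Go standard library. The `Generator` type and `fakeGenerator` helper are hypothetical stand-ins for a real model call, and the check is deliberately shallow: `go/parser` only catches syntax errors, whereas a real setup would presumably run the type checker or the full build.

```go
package main

import (
	"errors"
	"fmt"
	"go/parser"
	"go/token"
)

// Generator is a hypothetical stand-in for an LLM call. It receives the error
// text from the previous failed attempt (empty on the first attempt).
type Generator func(feedback string) string

// checkSyntax reports whether src parses as a Go source file.
func checkSyntax(src string) error {
	_, err := parser.ParseFile(token.NewFileSet(), "generated.go", src, parser.AllErrors)
	return err
}

// generateUntilValid asks gen for candidates until one parses or the attempt
// budget runs out. It returns the accepted source and the attempts used.
func generateUntilValid(gen Generator, maxAttempts int) (string, int, error) {
	feedback := ""
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		src := gen(feedback)
		err := checkSyntax(src)
		if err == nil {
			return src, attempt, nil
		}
		feedback = err.Error() // feed the parser error back into the next try
	}
	return "", maxAttempts, errors.New("no valid candidate within the attempt budget")
}

// fakeGenerator replays canned candidates in order, cycling when exhausted.
func fakeGenerator(candidates ...string) Generator {
	i := 0
	return func(string) string {
		src := candidates[i%len(candidates)]
		i++
		return src
	}
}

func main() {
	gen := fakeGenerator(
		"package main\nfunc main() {",     // broken: missing closing brace
		"package main\nfunc main() {}\n", // valid
	)
	src, attempts, err := generateUntilValid(gen, 3)
	fmt.Println("attempts:", attempts, "err:", err)
	fmt.Print(src)
}
```

The attempt budget matters: without it, a generator that never converges would loop forever, which is the failure mode described elsewhere in this thread.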
I would say Rust is quite good for just letting something churn through compiler errors until it works, and then you're unlikely to get runtime errors.
I haven't tried Haskell, but I assume that's even better.
With other languages, whether it's TypeScript/Go/Python, even if you explicitly ask agents to write/run tests, after a while agents just forget to do that, unless they cause build failures. You have to constantly remind them to do that as the session goes. Never happens with Rust in my experience.
For many months now though, Claude is nearly consistent with both calling test and check/clippy. Perhaps this is due to my global memory file, not sure to be honest.
What i do know, is that i never use those hooks, i have them disabled atm. Why? Because the benefit is almost nonexistent as i mentioned, and the cost is at times, quite high. It means i cannot work on a project piecemeal, aka "only focus on this file, it will not compile and that's okay", and instead forces claude to make complete edits which may be harder to review. Worst of all, i have seen it get into a loop and be unable to exit. Eg a test fails and claude says "that failure is not due to my changes" or w/e, and it just does that.. forever, on loop. Burns 100% of the daily tokens pretty quick if unmonitored.
Fwiw i've not looked to see if there's an alternate way to write hooks. It might be worth having the hook only suggest, rather than forcing claude. Alternatively, maybe i could spawn a subagent to review if stopping claude makes sense.. hmm.
I am trying out building a toy language hosted on Haskell and it's been a nice combo - the toy language uses dependent typing for even more strictness, but simple regular syntax which is nicer for LLMs to use, and under the hood if you get into the interpreter you can use the full richness of Haskell with less safety guardrails of dependent typing. A bit like safe/unsafe Rust.
I haven't had this problem with Opus 4.5+ and Haskell. In fact, I get the opposite problem and often wish it was more capable of using abstractions.
- I can build SPAs with typescript and offload expensive operations to a rust implementation that targets wasm
- I can build a multi-platform bundled app with Tauri that uses TS for the frontend, rust for the main parts of the backend, and it can load a python sidecar for anything I need python for (ML stuff mainly)
- Haven't dived too much into games but bevy seems promising for making performant games without the overhead of using one of the big engines (first-class ECS is a big plus too)
It ended up solving the problem of wanting to use the best parts of all of these different languages without being stuck with the worst parts.
not borne out by evidence. rust is bottom-mid tier on autocoderbenchmark. typescript is marginally better than js
shifting to compile time is not necessarily great, because the llm has to vibe its way through code in situ. if you have to have a compiler check your code it's already too late, and the llm does not have your codebase in its weights, a fetch to read the types of your functions is context expensive since it's nonlocal.
If you're running good agentic AI it can read the compile errors just like a human and work to fix them until the build goes through.
The big take away is that you can "patch" llms and steer them to correct answers in less trained programming languages, allowing for superior performance. Might work here. Not a clue how to implement, but stuff to llm-to-doc and the like makes me hopeful
- Rust: nearly universally compiles and runs without fault.
- Python,JS: very often will run for some time and then crash
The reason I think is type safety and the richness of the compiler errors and warnings. Rust is absolutely king here.
Not wanting to disagree, I am sure with Rust, it would be even more stable.
Does one get paid well to post these advertisements for Rust?
I hope there aren't many of your type on here.
Not to mention it has one of the slowest compile times among recent languages, if not the slowest (maybe Kotlin is slower).
Everything is a trade-off.
A half-assed type system is helpful for people writing code by hand. Then you get things like the squiggly lines in your editor and automated refactoring tools, which are quite beneficial for productivity. However, when an LLM is writing code none of that matters. It doesn't care one bit if the failure report comes from the compiler or the test suite. It is all the same to it.
Any side effect has to be performed inside `IO<T>` type, which means impure functions need to be marked as `IO<T>` return. And any function that tries to "execute" `IO<T>` side effect has to mark itself as returning `IO<T>` as well.
You basically compose a description of the side effects and pass this value representing those to the main handler which is special in that it can execute the side effects.
For the rest of the codebase this is simply an ordinary value you can pass on/store etc.
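The idea of "a value that describes side effects" can be sketched in Go with generics, with the large caveat that Go does not enforce any of this: in Haskell the type system stops you from running an effect outside `IO`, while here it is only a convention. All names below (`IO`, `Pure`, `Effect`, `Bind`, `Run`, `say`, `performed`) are invented for the illustration.

```go
package main

import "fmt"

// IO describes a side effect that yields a T. Building one runs nothing.
type IO[T any] struct {
	run func() T
}

// Pure lifts a plain value into IO without any effect.
func Pure[T any](v T) IO[T] {
	return IO[T]{run: func() T { return v }}
}

// Effect wraps an effectful function as a description of that effect.
func Effect[T any](f func() T) IO[T] {
	return IO[T]{run: f}
}

// Bind sequences two descriptions: when run, it runs a, passes the result to
// f, and runs the description f returns.
func Bind[A, B any](a IO[A], f func(A) IO[B]) IO[B] {
	return IO[B]{run: func() B { return f(a.run()).run() }}
}

// Run executes a description. By convention only main calls it.
func Run[T any](io IO[T]) T { return io.run() }

// performed records which effects actually ran, standing in for real I/O.
var performed []string

// say describes the effect of recording msg and yields its length.
func say(msg string) IO[int] {
	return Effect(func() int {
		performed = append(performed, msg)
		return len(msg)
	})
}

// program composes effects; it returns an ordinary value and runs nothing.
func program() IO[int] {
	return Bind(say("hello"), func(n int) IO[int] {
		return Bind(say("world!"), func(m int) IO[int] {
			return Pure(n + m)
		})
	})
}

func main() {
	p := program()
	fmt.Println("effects before Run:", len(performed))
	fmt.Println("result:", Run(p), "effects:", performed)
}
```

Until `Run` is called, `p` is just a value that can be stored or passed around, which matches the description above of composing effects and handing them to a special main handler.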
Lifetimes are a global property and LLMs are not particularly good at reasoning about them compared to local ones.
Most applications don't need low level memory control, so this complexity is better pushed to runtime.
There are lots of managed languages with good/even stronger type systems than Rust, paired with a good modern GC.
Huh? Lifetime analysis is a local analysis, same as any other kind of type checking. The semantics may have global implications, but exposing them locally is the whole point of having dedicated syntax for it.
That's what the compiler is doing.
The developer (or LLM) is supposed to do the global reasoning so that what they end up writing down makes semantic sense.
Sure, throwing a bunch of variants at it and seeing what sticks is certainly an approach, but "lifetimes check out" only proves that the resulting code will be memory safe, not that it actually makes sense.
I've been successful with each, I think there's positives and negatives to both, just wanted to mention that particular one that stands out as making it relatively more pleasant to work with.
Let's set aside the fact that Go is a garbage collected language while Rust is not for now...
Do you prefer letting an LLM reason about lifetimes, or debugging subtle errors yourself at runtime, like what happens with C++?
People who are familiar with the C++ safety discussion understand that lifetimes are like types -- they are part of the code and are just as important as the real logic. You cannot be ambiguous about lifetimes yet be crystal clear about the program's intended behavior.
Of course there are types where this is not true (file handles, connections, etc), and managed languages usually don't have features as good as C++/Rust for dealing with these (RAII).
As a human I can just decide to write quality code (or not!), but LLMs don't understand when they're being lazy or stupid and so need to have that knowledge imposed on them by an external reviewer. Static analysis is cheap, and more importantly it's automatic. The alternative is to spend more time doing code review, but that's a bottleneck.
I suspect the providers started training specifically on it because it appeared proportionally much more in actual LLM usage (obviously much less than more mainstream languages like Python or JavaScript, but I wouldn't be surprised if there were more LLM queries on Rust than on C, for demographic reasons).
Nowadays even small Qwens are decent at it in one-shot prompts, or at least much better than GPT-4 was.
It's actually rare to have to borrow something and keep the borrow in another object (which is where lifetimes come in); most of the time (95% at least, I'd say) you borrow something and then drop the borrow, or move the thing.
I wouldn't use it for the galaxy brain libraries or explorations I like to do for my blog but for production Haskell Opus 4.5+ is really good. No other models have been effective for me.
- Rust code generates absolutely perfectly in Claude Code.
- Rust code will run without GC. You get that for free.
- Rust code has a low defect rate per LOC, at least measured by humans. Google gave a talk on this. The sum types + match and destructure make error handling ergonomic and more or less required by idiomatic code, which the LLM will generate.
I'd certainly pick Rust or Go over Python or TypeScript. I've had LLMs emit buggy dynamic code with type and parameter mismatches, but almost never statically typed code that fails to compile.
In this benchmark, models can correctly solve Rust problems 61% on first pass — A far cry from other languages such as C# (88%) or Elixir (a “buggy dynamic language”) where they perform best (97%).
I wonder why that is, it’s quite surprising. Obviously details of their benchmark design matter, but this study doesn’t support your claims.
It's a weird-ass Forth-like but with a strong type system, contracts, native testing, fuzz testing, and a constraint solver for integer math backed by z3. Interpreter implemented in Elixir.
In about 150 commits, everything it has done has always worked without runtime errors, both the Elixir interpreter and the examples in the hallucinated language, some of them non-trivial for a week old language (json parser, DB backed TODO web app).
It's a deranged experiment, but on the other hand seems to confirm that "compile" time analysis plus extensive testing facilities do help LLM agents a lot, even for a weird language that they have to write just from in-context reference.
Don't click if you value your sanity, the only human generated thing there is the About blurb:
In particular the whole stack based thing looks questionable.
In fact the very first answer by Gemini proposed an APL-like encoding of the primitives for token saving, but when I started the implementation Claude Code pushed back on that, saying it would need to keep some sane semantics around the keywords to be able to understand the programs.
The very strict verification story seems more plausible, tracks with the rest of the comments here.
What has surprised me is that the language works at all; adding todo items to a web app written in a week-old language felt a bit eerie.
I have programmed about 3 Forth implementations by hand throughout the years for fun, but I have never been able to really program in it, because the stack wrangling confuses me enormously.
So for me anything vaguely complex is unreadable, but apparently not for the LLMs, which I find surprising. When I have interrogated them they say they like the lack of syntax more than the stack ops hamper them, but it might be just a hallucinated impression.
When they write Cairn I sometimes see stack related error messages scroll by, but they always correct them quickly before they stop.
- Strongly typed, including GADTs and various flavors of polymorphism, but not as inscrutable as Haskell
- (Mostly) pure functions, but multiple imperative/OO escape hatches
- The base language is surprisingly simple
- Very fast to build/test (the bytecode target, at least)
- Can target WASM/JS
- All code in a file is always evaluated in order, which means it has to be defined in order. Circular dependencies between functions or types have to be explicitly called out, or build fails.
I should add, it's also very fun to work with as a human! Finding refactors with pure code that's this readable is a real joy.
I don't believe the effects are tracked in the type system yet, but that's on its way.
With Multicore OCaml we gained thread sanitizer support and a reasonable memory model. Combined they give you tools for reasoning about data races and finding them. https://ocaml.org/manual/5.3/tsan.html
Well if it's a choice between these 4, then sure. Not sure that really suffices to qualify Go as "the" best language for agents
“Why Elixir is the best language for AI” https://news.ycombinator.com/item?id=46900241
- for comparison of the arguments made
- features a bit more actual data than “intuitions” compared to OP
- interesting to think about in an agent context specifically is runtime introspection afforded by the BEAM (which, out of how it developed, has always been very important in that world) - the blog post has a few notes on that as well
Rust is great, but there's no need to manage memory manually if you don't need to.
So for general mainstream languages, that leaves ... Python. Sure, it's ok but Go has strong typing from the start, not bolted on with warts.
(I realized how incredibly subjective this comment turned out to be after I had written it. Apologies if I omitted or slighted your fave. This is pretty much how I see it).
I’m not sure about cargo audit specifically, but most other security advisories are package scoped and will warn if your code transitively references the package, regardless of which symbols your code uses.
On the other hand I think Rust is better by some margin. The type system is obviously a big gain, but Rust is very fast moving. When an API changes, LLMs can't follow and it takes many tries to get it right, so it kinda levels out. Code might compile, but only on some god-forsaken crate version everybody (but the LLM) forgot about.
From personal experience Haskell benefits the most. Not only does it lean on the type system more than Rust does, but its APIs move at a snail's pace, which means it doesn't suffer from Rust's outdated-API problem, and code that compiles will work just fine. Also I think that Haskell code in training sets is guaranteed to be safe because of the language extension system.
But what makes Go useful is the fact that it compiles to an actual executable you can easily ship anywhere - and that is actually really good considering that the language itself is super easy to learn.
I've recently started building some AI agent tools with it and so far the experience has been great:
https://github.com/pantalk/pantalk https://github.com/mcpshim/mcpshim
https://bernste.in/writings/the-unreasonable-effectiveness-o...
I actually spent some time trying to get to the bottom of what a logical extension of this would be. An entirely made up language spec for an idealized language it never saw ever, and therefore had no bad examples of it. Go is likely the closest for the many reasons people call it boring.
I expect rust to gain some market share since it's safe and fast, with a better type system, but complex enough that many developers would struggle by themselves. But IME AI also struggles with the manual memory management currently in large projects and can end up hacking things that "work" but end up even slower than GC. So I think the ecosystem will grow, but even once AI masters it, the time and tokens required for planning, building, testing will always exceed that of a GC language, so I don't see it ever usurping go, at least not in the next decade.
I wish the winner would be OCaml, as it's got the type safety of rust (or better), and the development speed of Go. But for whatever reason it never became that mainstream, and the lack of libraries and training data will probably relegate it to the dustbin. Basically, training data and libraries >>> operational characteristics >>> language semantics in the AI world.
I have a hard time imagining any other language maintaining a solid advantage over those two. There's less need for a managed runtime, definitely no need for an interpreted language, so I imagine Java and Python will slowly start to be replaced. Also I have to imagine C/C++ will be horrible for AI for obvious reasons. Of course JS will still be required for web, Swift for iOS, etc., but for mainstream development I think it's going to be Rust and Go.
Syntax. Syntax is the reason. It's too foreign to be picked up quickly by the mass of developers that already know a C style language. I would also argue that it's not only foreign, it's too clunky.
I've started what I'm calling an agent first framework written in Go.
It's just too easy to get great outputs with Go and Codex.
https://github.com/swetjen/virtuous
The key is blending human observability with agent ergonomics.
I've no idea myself, I just thought it was interesting for comparison.
https://news.ycombinator.com/item?id=47222705
Edit: cool article, I have myself speculated that we will get a new language made for/by LLMs that will be torture to write by hand/IDE but easy for a human to read/follow/navigate/check and super easy for LLMs to develop and maintain.
But that's because it's tight, token efficient, and above all local. Pure functions don't require much context to reason about effectively.
However, you do miss the benefit of types, which are also good for LLMs.
The "ideal" LLM language would have the immutability and functional nature of Clojure combined with a solid type system.
Haskell or OCaml immediately come to mind, but I'm not sure how much the relative lack of training data hurts... curious if anyone has any experiences there.
Stack overflow tags:
17,775 Clojure
74,501 Go
I’m not finding a way to get any useful information from GitHub, e.g. count of de-duplicated lines of code per language. There might be something in their annual “Octoverse” report but I haven’t drilled into it yet: https://github.blog/news-insights/octoverse/octoverse-a-new-...
- structurally edited, ensuring syntactic validity at all times
- annotated with metadata, so that agents can annotate the code as they go and refer back to accreted knowledge (something Clojure can do structurally using nodepaths or annotations directly in code)
- put into any environment you might like, e.g. using ClojureScript
I haven't proven to myself this is more useful/results in better code than just writing code "the normal way" with an agent, but it sure seems interesting.
Golang just gets bogged down in irrelevant details way too easily for this.
Maybe this is a good incentive to improve error handling in Go.
Though, I have found both to be better at C# than Swift, for example.
I really love this point-out. Not always an easy sell upstream, but a big factor in happy + productive teams.
The most important downside of Python is that it doesn't compile to a native binary that the OS can recognize and it's much slower. However, it's a great "glue" for different binaries or languages like Rust and Go.
Rust is an increasingly popular language for AI agents to choose, often integrated into Python code. The trend is on the side of Rust here. I don't want to repeat all the great points from the original poster. One technical point that wasn't mentioned, from my experience, is that the install size is too large for embedded systems. As the article mentioned, the build times are also longer than Go's, and this is an even worse bottleneck on embedded systems. I prefer Go over Rust in my research and development but I yield to other developers on the team professionally.
What about C/C++? At the moment, I've had great success with implementing C++ code through agentic AI. However, there is a dearth of frameworks for things like web development. Because CPython is implemented in C, and integrating C modules into Python is relatively straightforward, I find myself following the NumPy approach, where C is the backbone of performance-critical features.
Personally, I still actively utilize code I've written more than 10 years ago that's battle tested, peer reviewed, and production ready. The above comments are for the current state, but what about the future? Another point that wasn't mentioned was the software license from Go. It's BSD3 with a patent grant which is more permissive than Rust's MIT + Apache 2.0 licenses. This is very important to understand the future viability of software because given enough time and all other things the same, more permissive software will win out in adoption.
The rabbit hole goes deeper. I think we will sacrifice Rust as the "good-enough" programming language to spoil the ecosystem with Agentic AI before its redemption arc. Only time will tell, but Python's inability to compile to a native binary makes it a bad choice for malware developers. You can fill in the blank here. Perhaps the stage has already been set, and it looks like Rust will be the opening act now that the lights are on.
On the other hand, if there are good conventions it’s also a benefit, for example Ruby on Rails.
As a human programmer with creative and aesthetic urges as well as being lazy and having an ego, I love expressive languages that let me describe what I want in a parsimonious fashion, i.e. as few lines of code as possible and no boilerplate.
With the advances in agent coding none of these concerns matter any more.
What matters most is that I can easily look at the code and understand the intent clearly. That the agent doesn't get distracted by formatting. That the code is relatively memory safe and type safe, avoids null issues, and cannot ignore errors.
I dislike Go but I am a lot more likely to use it in this new world.
But for how long will it matter? I do wonder if programming languages as we know them today will lose relevance as all this evolves.
- I agree that go's syntax and concepts are simpler (esp when you write libraries, some rust code can get gnarly and take a lot of brain cycles to parse everything)
- > idiomatic way of writing code and simpler to understand for humans - eh, to some extent. I personally hate go's boilerplate of "if err != nil" but that's mainly my problem.
- compiles faster, no question about it
- more go code out there allowing models to generate better code in Go than Rust - eh, here I somewhat disagree. The quality of the code matters as well. That's why a lot of early python code was so bad. There just is so much bad python out there. I would say that code quality and correctness matters as well, and I'd bet there's more "production ready" (heh) rust code out there than go code.
- (go) it is an opinionated language - so is rust, in a lot of ways. There are a lot of things that make writing really bad rust code pretty hard. And you get lots of protections for foot meets gun type of situations. AFAIK in go you can still write locking code using channels. I don't think you can do that in rust.
- something I didn't see mentioned is error messages. I think rust errors are some of the best in the industry, and they are sooo useful to LLMs (I've noticed this ever since coding with gpt4 era models!)
I guess we'll have to wait and see. There will be a lot of code written by agents going forward, we'll be spoiled for choice.
But it does have the benefit of having a very strong "blessed way of doing things", so agents go off the rails less, and if claude is writing the code and endless "if err != nil" then the syntax bothers me less.
Code is free, sure, but it's not guaranteed to be correct, and review time is not free.
... write the code yourself?
I think many many people just skip the "review" step in this process, and assume they're saving time. It's not going to end well.
Reduce entropy, increase probability of the correct outcome.
LLMs are surfing higher dimensional vector spaces, reduce the vector space, get better results.
With Go, it will increasingly be the case that one has to write the design doc carefully, with constraints; for semi-technical/coder folks it does make a lot of sense.
With Python, make-believe is easy (I've seen it multiple times myself), but don't you think a coding agent/LLM would have to be quite malicious to put make-believe logic into a compiled language, compared with an interpreted one?
---
# Author likes go
Ok, cool story bro...
# Go is compiled
Nice, but Python also has syntax and type checking -- I don't typically have any more luck generating more strictly typed code with agents.
# Go is simple
Sure. Python for a long time had a reputation as "pseudocode that runs", so the arguments about go being easy to read might be bias on the part of the author (see point 1).
# Go is opinionated
Sure. Python also has standards for formatting code, running tests (https://docs.python.org/3/library/unittest.html), and has no need for building binaries.
# Building cross-platform Go binaries is trivial
Is that a big deal if you don't need to build binaries at all?
# Agents know Go
Agents seem to know python as well...
---
Author seems to fall short of supporting the claim that Go is better than any other language by any margin, mostly relying on the biases they have that Go is a superior language in general than, say, Python. There are arguments to be made about compiled versus interpreted, for example, but if you don't accept that Go is the best language of them all for every purpose, the argument falls flat.
1) Go runs faster, so if you're not optimizing for dev time (and if you're vibe coding, you're not) then it's a clear winner there
2) Python's barrier to entry is incredibly low, so intuitively there's likely a ton of really terrible python code in the training corpus for these tools