In the case that interfaces remain unchanged, agents only need to look at the implementation of a single module at a time plus the interfaces it consumes and implements. And when changing interfaces, agents only need to look at the interfaces of the modules concerned, and at most a limited number of implementation considerations.
It’s the very reason why we humans invented modularization: so that we don’t have to hold the complete codebase in our heads (“context windows”) in order to reason about it and make changes to it in a robust and well-grounded way.
https://www.slater.dev/2026/02/relieve-your-context-anxiety-...
From a purely UX perspective, showing a red badge seems to conflate “less good” with size. Who is the target for this? Lots of useful codebases are large.
I do agree, however, that there’s value in splitting up domains into something a human can easily learn and keep in their head after, say, a few days of being deeply entrenched. Tokens could actually be a good proxy for this.
Agents. There are going to be more tools and software targeted at consumption by agents.
Just spawn the agent in one of the subprojects
For example in my current case, there are lots of files with CSS, SVG icons in separate files, old database migration scripts, etc. Those don't go in the LLM context 99% of the time.
Maybe a more useful metric would be "what percentage of files that have been edited in the last {n} days fit in the context"?
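A rough sketch of how that metric could be computed, under some loud assumptions: file modification time stands in for "edited in the last {n} days", roughly 4 characters per token stands in for a real tokenizer, and "fit" is read as "how many of those files can be packed together, smallest first, into one context budget". The function names and the 200,000-token default are made up for illustration.

```python
import time
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not a real tokenizer


def estimate_tokens(path: Path) -> int:
    """Approximate a file's token count from its size in bytes."""
    return path.stat().st_size // CHARS_PER_TOKEN


def recent_fit_percentage(root, days, context_tokens=200_000, now=None):
    """Percentage of recently modified files that fit in one context budget.

    Files modified within the last `days` days are packed smallest-first
    until the (hypothetical) `context_tokens` budget is exhausted.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86_400
    recent = [
        p for p in Path(root).rglob("*")
        if p.is_file() and ".git" not in p.parts and p.stat().st_mtime >= cutoff
    ]
    if not recent:
        return 100.0  # nothing was edited recently, so nothing fails to fit

    used = fitted = 0
    for size in sorted(estimate_tokens(p) for p in recent):
        if used + size > context_tokens:
            break
        used += size
        fitted += 1
    return 100.0 * fitted / len(recent)
```

Something like `recent_fit_percentage(".", days=14)` would then give a number for the current checkout. Using commit history instead of mtimes would probably be more faithful, but that needs a git call and is left out here.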
Scoping the AI to only use the things you'd use seems far wiser than trying to reduce your codebase so it can look at the whole thing when 90% of it is irrelevant.
It is somewhat ironic that coding agents are notorious for generating much more code than necessary!
But my coolest app was a better context creator. I found it hard to extend to actual agentic coding use. Agentic discovery is generally useful and reliable - the overhead of tokens can be managed by the harness (i.e. Claude Code).
It would be better to have the architecture support a more decoupled/modular design if you're going to rely heavily on LLMs.
That or let it consume high quality maintained documentation?
I think this gestures at a more general point - we're still focusing on how to integrate LLMs into existing dev tooling paradigms. We squeeze LLMs into IDEs for human dev ergonomics but we should start thinking about LLM dev ergonomics - what idioms and design patterns make software development easiest for AIs?
> I think this gestures at a more general point - we're still focusing on how to integrate LLMs into existing dev tooling paradigms.
This is what we should be doing, for a couple of reasons. For one thing, humans don't have an entire codebase "in context" at a time. We should be recognizing that the limitations of an AI mirror the limitations of a person, and hence can have similar solutions. For another, the limitations of today's LLMs will not be the limitations of tomorrow's LLMs. Redesigning our code to suit today's limitations will only cause us trouble down the road.
I am not very good with AI though. Is there a quick and easy way to calculate token count and add this to my dump.txt file, ideally using just simple, included by default Linux tools in bash or simple, included by default Windows tools in powershell?
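One way to get a ballpark figure, sketched in Python rather than bash or PowerShell: divide the character count by about 4. That ratio is only a common rule of thumb for English text; the real count depends on the model's tokenizer, so treat the result as an estimate. The function names and the `# estimated tokens` footer format are made up for this example. The same arithmetic works on the byte count that `wc -c dump.txt` prints.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: rough average, varies by tokenizer and content


def estimate_token_count(text: str) -> int:
    """Round up so that any non-empty text counts as at least one token."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def append_token_estimate(path: str) -> int:
    """Append an estimated token count to the end of the file and return it."""
    file = Path(path)
    text = file.read_text(encoding="utf-8", errors="replace")
    tokens = estimate_token_count(text)
    with file.open("a", encoding="utf-8") as handle:
        handle.write(f"\n# estimated tokens: ~{tokens}\n")
    return tokens
```

Calling `append_token_estimate("dump.txt")` adds one footer line to the dump. The estimate is taken before the footer is written, so the footer itself is not counted.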
Thank you in advance.
Doubt me?
Think back 2 years. Now compare today. Change is at massive speed, and this issue is top line to be resolved in some fashion.
If we look back 2 years, companies weren't investing so heavily in training their LLMs on code. Whatever code they got their hands on was what ended up in the LLM's training corpus. It's well known that the most recent improvements in LLM productivity occurred after they spent millions on different labs to produce more coding datasets for them.
So LLMs have gotten a lot better at not needing the entire codebase in context at once, because their weights are already so well tuned to development environments that they can better infer and index things as needed. However, I fail to see how the context window limitation would no longer be an issue since it's a fundamental part of the real world. Would we get better and more efficient ways of splitting and indexing context windows? Surely. Will that reduce our fear of soiling our contexts with bad prompt response cycles? Probably not...
Also kind of ironic that small codebases are now in vogue, just when Google-style monolithic repos were so popular.
It depends on the provider/model, usually pricing is calculated as $/million tokens with input/output tokens having different per token pricing (output tends to be more expensive than input). Some models also charge more per token if the context size is above a threshold. Cached operations may also reduce the price per token.
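To make the arithmetic concrete, here is a minimal sketch of that pricing model. Every number in it is hypothetical, not any provider's real rate, and the `Pricing` fields and `request_cost` function are names invented for the example. It also assumes that cached tokens are a subset of the input tokens and that the long-context surcharge applies to the whole request, which may not match how a given provider bills.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in dollars. All fields are illustrative."""
    input_per_million: float
    output_per_million: float
    cached_input_per_million: Optional[float] = None  # None: no cache discount
    long_context_threshold: Optional[int] = None      # None: no surcharge
    long_context_multiplier: float = 1.0


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the dollar cost of one request under the assumptions above."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    cached_rate = pricing.cached_input_per_million
    if cached_rate is None:
        cached_rate = pricing.input_per_million

    fresh_tokens = input_tokens - cached_tokens
    cost = (
        fresh_tokens * pricing.input_per_million
        + cached_tokens * cached_rate
        + output_tokens * pricing.output_per_million
    ) / 1_000_000

    threshold = pricing.long_context_threshold
    if threshold is not None and input_tokens > threshold:
        cost *= pricing.long_context_multiplier
    return cost


# Hypothetical rates: $3 per million input tokens, $15 per million output tokens.
example = Pricing(input_per_million=3.0, output_per_million=15.0)
print(request_cost(example, input_tokens=1_000_000, output_tokens=100_000))  # 4.5
```

With these made-up rates, a million input tokens plus a hundred thousand output tokens comes to $4.50, so output is a third of the bill despite being a tenth of the input volume.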
OpenRouter has a good overview of providers and models: https://openrouter.ai/models
The math on what people are actually paying is hard to evaluate. IME, most companies would rather buy a subscription than give their developers API keys (as it makes spending predictable).
Are there companies out there that add token counts to ticket “costs”, i.e. are story points being replaced/augmented by token counts?
Or even worse, an exchange rate of story points to tokens used…
The downside with subscriptions is that your work with the LLM will grind to a halt for a number of hours if you hit the token limit. I was doing what I consider very trivial work adding Javadoc comments to a few dozen files using Claude Sonnet on the $20 plan and within 30 minutes had been told to sit out for a couple hours. The reason was that Claude was apparently repeatedly sending the files up and down to fill in the comments. In hindsight, sure, that's obvious, but you would think that Claude would be smart enough to do some sort of summarization to make things more efficient. Looking into it, it was on the order of several million tokens in a very short amount of time.
It really made me wonder how in the hell people are using Claude to do "real" work, but I've heard of people having multiple $200/month subscriptions, so I guess that could work. Definitely seems like a glimpse into the future of what these services will truly cost once people are hooked on them.
For Claude to understand the codebase, it needs to document it. Makes sense, and it's also great for humans because now there is up-to-date documentation on the codebase.
I don’t know how much it cost, but the codebase, I’m told, is around 2 to 3 million lines of code.
Still, this seems useful for being able to see at a glance. I have no idea where most of my own projects would land.