> Pre-processed relationships - Dependency graphs are explicit (graph.edges) rather than requiring inference
I suspect it's actually the opposite. Injecting some extra, non-standard format or syntax for expressing something requires more cycles for the LLM to understand. They have seen a lot of TypeScript, so the inference overhead is minimal. This is similar to the difference between a chess grandmaster and a new player. The grandmaster, like the LLM, has specialized pathways dedicated to their domain (chess / TypeScript). A grandmaster does not think about how pieces move (what does "graph.edges" mean?); they see the board in terms of space control. Operational and minor details have been conditioned into the low-level pathways, leaving more neurons free to work on higher-level tasks and reasoning.
I don't have evals to prove one way or the other, but the research generally seems to suggest this pattern holds up, and it makes sense with how they are trained and the mathematics of it all.
Thoughts?
The claim I’m making is narrower: pre-processed structure isn’t about helping the model understand syntax, it’s about removing the need to re-infer relationships every time. The output isn’t a novel language - it’s a minimal, explicit representation of facts (e.g. dependencies, exports, routes) that would otherwise be reconstructed from source.
Inference works well per session, but it doesn’t give you a persistent artifact you can diff, validate, or assert against in CI. LogicStamp trades some inference convenience for explicitness and repeatability across runs.
I don’t claim one dominates the other universally - they optimize for different failure modes.
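To make that concrete, a slice of such a derived artifact might look roughly like the sketch below. The shape and field names are illustrative only, not LogicStamp's actual schema:

```ts
// Hypothetical shape of a per-folder "semantic contract" bundle.
// Field names are illustrative, not LogicStamp's real output format.
interface SemanticBundle {
  folder: string;                               // folder this bundle describes
  exports: { name: string; kind: "function" | "class" | "type" }[];
  routes: { method: string; path: string; handler: string }[];
  dependencies: { from: string; to: string }[]; // explicit edges, no re-inference needed
  hash: string;                                 // stable hash of the facts above
}

// Example instance derived once from source, instead of being re-read from raw files each session:
const bundle: SemanticBundle = {
  folder: "src/billing",
  exports: [{ name: "createInvoice", kind: "function" }],
  routes: [{ method: "POST", path: "/invoices", handler: "createInvoice" }],
  dependencies: [{ from: "src/billing/invoice.ts", to: "src/db/client.ts" }],
  hash: "sha256:…",
};
```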
This is definitely something I don't see the value in. Why do I need to diff or look at this file in CI? It should be auto-generated from the code, yeah?
Why are git diffs insufficient? Models are trained on them and understand them very well. Google has some interesting research on having the models generate patches instead of code syntax directly.
I'm really not sure what pain or problem is being addressed here. Are there evals to show this concept improves overall model performance?
And to be honest, it's going to take a lot to convince me this is helpful when I can put a curated set of this same information in the AGENTS.md files I already have all over my code base.
> repeatability across runs
I definitely don't see how this holds up, or don't know what you mean. Are you implying your method creates repeatability if I run the same agent/LLM twice on the same input?
Git diffs + LLM inference work well for understanding changes once. What I’m targeting is reducing the need to re-infer semantic surface changes every run, especially across large refactors or long-running workflows.
Today, LogicStamp derives deterministic semantic contracts and hashes, and watch mode surfaces explicit change events. The direction this enables is treating those derived facts as a semantic baseline (e.g. drift detection / CI assertions) instead of relying on repeated inference from raw diffs.
By “repeatability” I mean the artifacts, not agent behavior: same repo state + config ⇒ same semantic model. I don’t yet have end-to-end agent performance evals versus AGENTS.md + LSP.
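As a sketch of the CI direction, here is a minimal drift check, assuming bundles are committed under a `.logicstamp/` directory and can be regenerated with a one-shot `stamp context` run (both the path and the exact command are assumptions, not documented behavior):

```ts
// Minimal CI drift check: regenerate the bundles, then fail if the committed
// baseline no longer matches. Paths and the regeneration command are assumed.
import { execSync } from "node:child_process";

// Regenerate the semantic bundles from the current repo state.
execSync("npx stamp context", { stdio: "inherit" });

// If regeneration changed anything under .logicstamp/, the baseline has drifted.
const drift = execSync("git status --porcelain .logicstamp", { encoding: "utf8" });
if (drift.trim().length > 0) {
  console.error("Semantic bundles are out of date:\n" + drift);
  process.exit(1);
}
```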
> Inference works well per session ... doesn't give artifact ... explicitness and repeatability across runs.
When you write this, it sounds like you are talking about repeatability between inference sessions and that this artifact enables it. It does not read as though you are applying the repeatability to the artifact itself, which one would assume anyway since it is auto-generated from code via AST walking.
By “repeatability” I mean the extraction itself: given the same repo state + config, the derived semantic artifact is identical every time. That gives CI and agents a stable reference point, but it doesn’t make agent behavior deterministic.
The value is in not having to re-infer structure from raw source each run - not in making inference runs repeatable.
I put this information in my AGENTS.md, for similar goals. Why might I prefer this option you are presenting instead? It seems like it ensures all code parts are referenced in a JSON object, but I heavily filter those down because most are unimportant. It does not seem like I can do that here, which makes me think this would be less token-efficient than the AGENTS.md files I already have. Also, JSON syntax eats up tokens with the quotes, commas, and curlies.
Another alternative to this, give your agents access to LSP servers so they can decide what to query. You should address this in the readme as well
How is it deterministic? I searched the term in the readme and only found claims, no explanation
AGENTS.md and LogicStamp aren’t mutually exclusive. AGENTS.md is great for manual, human-curated guidance. LogicStamp focuses on generated ground-truth contracts derived from the AST, which makes them diffable, CI-verifiable, and resistant to drift.
On token usage: the output is split into per-folder bundles, so you can feed only the slices you care about (or post-filter to exported symbols / public APIs). JSON adds some overhead, but the trade-off is reliable machine selectability and deterministic diffs.
Determinism here means: same repo state + config ⇒ identical bundle output.
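As an illustration of that post-filtering, assuming the same hypothetical bundle shape as above (the path and field names are made up):

```ts
// Sketch: read one per-folder bundle and keep only the public surface before
// putting it in a prompt. Bundle path and fields are assumptions, not LogicStamp's schema.
import { readFileSync } from "node:fs";

const bundle = JSON.parse(readFileSync(".logicstamp/src-billing.json", "utf8"));

// Drop everything except exported symbols and routes to save tokens.
const publicSurface = {
  folder: bundle.folder,
  exports: bundle.exports,
  routes: bundle.routes,
};

const promptSlice = JSON.stringify(publicSurface); // only this slice goes to the model
```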
I don't really think dumping all this unnecessary information into the context is a good idea
1. search tools like an LSP are far superior, well established, and zero maintenance
2. it pollutes context with irrelevant information, because most of the time you don't need to know all the details you are putting in there. The breadth is really the main issue I see here: there is no control over breadth or over what is and is not included, so it's mostly noise for any given session, even with the folder separation. You would need to provide evals for outcomes, not for minimized token usage, because that is the wrong thing to center your optimizations on.
The problem it targets is different: producing stable, explicit structure (public APIs, components, routes) that can be diffed, validated, and reasoned about across runs - e.g. in CI or long-running agents. LSPs are query-oriented and ephemeral; they don’t give you a persistent artifact to assert against.
On breadth/noise: the intent isn’t to dump everything into one prompt. Output is sliced (per-folder / per-contract), and the assumption is that only relevant bundles are selected. Token minimization isn’t the primary goal; predictability and selectability are.
In practice I see them as complementary: LSPs for live search, generated contracts for ground truth. If your workflow is already LSP-driven, LogicStamp may simply not add much value - and that’s fine.
LogicStamp treats context as deterministic output derived from the codebase, not a mutable agent-side model.
When code changes mid-session, watch mode regenerates the affected bundles, and the agent consumes the latest output. This avoids desync by relying on regeneration rather than keeping long-lived agent state in sync.
How does this work in practice? How does the agent "consume" (reread) the files, with a tool call it has to decide to invoke?
When `stamp context --watch` is active, the MCP server detects it. The agent first calls `logicstamp_watch_status` to see whether context is being kept fresh.
If watch mode is active, the agent can directly call `list_bundles(projectPath)` → `read_bundle(projectPath)` and will always read the latest regenerated output. No snapshot refresh is needed.
If watch mode isn’t active, the workflow falls back to `refresh_snapshot` → `list_bundles` → `read_bundle`.
So “consume” just means reading deterministic files via MCP tools, with watch mode ensuring those files stay up to date.
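In pseudo-code, the consumption path is roughly the following. The tool names are the ones above; the exact signatures and return shapes are assumptions:

```ts
// Sketch of the agent-side decision flow. Tool names come from this thread;
// call signatures and return shapes are assumed for illustration.
async function loadContext(
  mcp: { call: (tool: string, args?: object) => Promise<any> },
  projectPath: string
) {
  const status = await mcp.call("logicstamp_watch_status", { projectPath });

  if (!status.watching) {
    // No watcher running: regenerate before reading.
    await mcp.call("refresh_snapshot", { projectPath });
  }

  // With watch mode active, these reads always see the latest regenerated output.
  const bundles = await mcp.call("list_bundles", { projectPath });
  return Promise.all(
    bundles.map((b: { id: string }) => mcp.call("read_bundle", { projectPath, bundleId: b.id }))
  );
}
```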
This is in fact the very reason I set out to build my own agent, because Copilot does this with their `.vscode/instruction/...` files and the globs for file matching therein. It was in fact, not deterministic like I wanted.
My approach is to look at the files the agent has read/written and if there is an AGENTS.md in that or parent dirs, I put it in the system prompt. The agent doesn't try to read them, which saves a ton on token usage. You can save 50% on tokens per message, yet my method will still use fewer over the course of a session because I don't have to make all those extra tool calls
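Roughly, that lookup is just this (a minimal sketch; paths and details are illustrative):

```ts
// Sketch: for each file the agent touched, walk up to the repo root, collect any
// AGENTS.md files, and inject them into the system prompt - no tool calls needed.
import { existsSync, readFileSync } from "node:fs";
import * as path from "node:path";

function collectAgentsMd(touchedFile: string, repoRoot: string): string[] {
  const found: string[] = [];
  const root = path.resolve(repoRoot);
  let dir = path.dirname(path.resolve(touchedFile));

  while (dir.startsWith(root)) {
    const candidate = path.join(dir, "AGENTS.md");
    if (existsSync(candidate)) found.push(readFileSync(candidate, "utf8"));
    if (dir === root) break;
    dir = path.dirname(dir);
  }
  return found; // prepended to the system prompt; the agent never reads these itself
}
```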
LogicStamp’s determinism claim is about the generated context: same repo state + config ⇒ identical bundles. That property holds regardless of how or when an agent chooses to read them.
Tool calls don’t make the artifacts non-deterministic; they only affect when an agent consumes already-deterministic output.
Can you address the token inefficiencies from having to make more tool calls with this method?
Token-wise, the intent isn’t “dump everything”; it’s selective reads of the smallest relevant bundles. If your workflow already achieves what you want with AGENTS.md + LSP querying, that may indeed be more token-efficient for many sessions.
The trade-off LogicStamp is aiming for is different: verifiable, diffable ground-truth artifacts (CI/drift detection/cross-run guarantees). Tokens aren’t the primary optimization axis.
This seems more similar in spirit to AGENTS.md than to an LSP, so I'll make the comparison there. Today, I require zero tool calls to bring my AGENTS.md into context, so this would require me to make more tool calls, each of which is a round trip to the LLM carrying the current context. So if I have a 30k context right now, and you are saying 1-2 calls per task, that is 30-60k extra tokens I need to pay for, for every one of these AGENTS.md-equivalent files that needs to be read / checked to see if it's in sync.
I use git for the verifiable / diffable ground truth artifacts. I can have my LSP query at different commits, there is no rule it can only access the current state of the code
The AGENTS.md comparison isn’t “same thing” - it’s a different layer. AGENTS.md encodes human intent/heuristics. LogicStamp generates semantic ground-truth contracts (exports, APIs, routes) from the AST so they can be diffed and validated mechanically (e.g. CI drift detection).
Git + LSP can diff/query source across commits, but that’s still a query workflow. LogicStamp’s goal is a persistent, versioned semantic artifact. If your workflow already covers that, then it may simply not add value - which is totally fine.
I still have a hard time seeing why I want something like this in my agentic or organic software development. I tried something nearly identical for Go, and having all that extra bookkeeping in context wrecked things, so on an experiential level, the custom DSL for giving the agent an index to the code base hurt overall coding agent performance and effectiveness.
What works far better is a very similar-in-content but heavily curated "table of contents" for the agent. Yes, I also use the same methods to break it down by directory or other variables. But when the agent reads one of these, the majority is still noise, which is why overall performance degraded and why curation is such a difference maker.
Do you have evaluations showing that your project leads to better overall model capability, especially compared to a project that already uses AGENTS.md?
btw, I put this stuff in AGENTS.md now; you can put whatever you want in there. For example, I auto-generate some sections with Go's tooling to have a curated summary of what the project does. I don't see it as a "different layer" because it is all context engineering in the end.