> some extra attributes about which model and agent was used.
> You can re-run any commit against a fresh checkout to see
> what Claude generates from the same instruction.
I don't see how this is true. LLMs can generate different outputs even with the same model and inputs.
> Add users with authentication
> No, not like that
> Closer, but I don’t want avatars
> I need email validation too
> Use something off the shelf?
Someone in this place was saying this the other day: a lot of what might seem like public commits to main are really more like private commits to your feature branch. Once everything works, you squash it all down to a final version ready for review and to commit to main.
It’s unclear what the “squash” process is for “make me a foo” + “no not like that”.
For any formal language, there was a testing and iteration process through which the programmer verified that this code produces the correct functionality, and because a formal compiler is deterministic, they can know that the same code will have the same functionality when compiled and run by someone else (edge cases concerning different platforms and compilers notwithstanding).
But here, even if the prompt is iterated on and the prompter verifies functionality, it's not guaranteed (or even highly likely) to create the same code with the same functionality when someone else runs the prompt. Even if the coding agent is the same. Even if it's the same version. Simply due to the stochastic nature of these things.
This sounds like a bad idea. You gotta freeze your program in a reproducible form. Natural language prompts aren't it; formal language instructions are.
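To make the deterministic-vs-stochastic point concrete, here is a toy sketch in Python. It is not how any real agent or model works: `VOCAB`, `WEIGHTS`, `greedy`, and `sample` are made-up names, and the "model" is just a fixed distribution over four tokens. It only shows that always picking the most likely token gives the same result on every run, while drawing from the distribution depends on the random state.

```python
import random

# Hypothetical toy "model": one fixed next-token distribution.
VOCAB = ["foo", "bar", "baz", "qux"]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]


def greedy(n):
    """Always pick the most likely token: identical output on every run."""
    best = VOCAB[WEIGHTS.index(max(WEIGHTS))]
    return [best] * n


def sample(n, rng):
    """Draw each token from the distribution: output depends on RNG state."""
    return rng.choices(VOCAB, weights=WEIGHTS, k=n)


if __name__ == "__main__":
    print(greedy(5))                    # same list every time
    print(sample(5, random.Random()))   # unseeded: varies between runs
    print(sample(5, random.Random()))   # very likely differs from the line above
```

Passing a seeded `random.Random(1)` makes the toy repeatable. Whether a given hosted agent exposes a seed, or behaves identically with one, is a separate question that this sketch does not answer.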
dependencies = ["foo"]
While the code itself is more like:
$ wc -l uv.lock
245
You need both.
While I haven't used Claude long enough to need my prompts, I would appreciate seeing my coworkers' prompts when I review their LLM-generated code or proposals. Sometimes it's hard to tell whether something was intentional and the author can stand behind it, or whether it's fluff hallucinated by the LLM. It's a bit annoying to ask why something suspicious was written the way it is, only for them to wordlessly change it as if it's their first time seeing the code too.
> Every ghost commit answers: what did I want to happen here? Not what bytes changed.
Aren't they just describing what commit messages are supposed to be? Their first `git log --oneline` output looks normal to me. You don't put the bytes changed in the commit message; git can calculate that from any two states of the tree. You summarize what you're trying to do and why. If you run `git log -p` or `git show`, then yeah, you see the bytes changed, in addition to the commit message. Why would you put the commit messages in some separate git repo or storage system?
> Ghost snapshots the working tree before and after Claude runs, diffs the two, and stages only what changed. Unrelated files are never touched.
That's...just what git does? It's not even possible to stage a file that hasn't changed.
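For reference, the "snapshot before and after, diff the two" step in that quote can be sketched in a few lines of Python. This is only an illustration of the idea, not Ghost's actual code: `snapshot` and `changed_paths` are hypothetical helpers, and the sketch hashes every file under a directory without skipping `.git` or ignored files.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_paths(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Taking `before = snapshot(".")`, running the tool, then taking `after = snapshot(".")` and calling `changed_paths(before, after)` gives the list of files that changed, which is the same set git would report as modified or untracked.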
> Every commit is reproducible. The prompt is preserved exactly. You can re-run any commit against a fresh checkout to see what Claude generates from the same instruction.
This is not what I mean by reproducible. I can re-run any commit against a fresh checkout but Claude will do something different than what it did when they ran it before.
This naturally happens over several commits. I suppose I could squash them, but I haven't bothered.
I’m wondering how deep you plan to go on environment pinning beyond that. Is the system prompt / agent configuration versioned? Do you record tool versions or surrounding runtime context?
My mental model is that reproducible intent requires capturing the full "execution envelope", not just the human prompt + model & agent names. Otherwise it becomes more of an audit trail (which is also a good feature) than something you can deterministically re-run.
Curious how you’re thinking about that.
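As a rough illustration of what I mean by an "execution envelope", here is a minimal Python sketch. Everything in it is an assumption for the sake of example: the field names, the placeholder model and agent strings, and the `envelope_fingerprint` helper are all hypothetical, not anything Ghost records today. The idea is to serialize the whole context canonically and hash it, so two runs can at least be checked for having had the same inputs.

```python
import hashlib
import json


def envelope_fingerprint(envelope: dict) -> str:
    """Hash a canonical JSON form of the envelope (key order does not matter)."""
    canonical = json.dumps(envelope, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical envelope; all values are placeholders, not real identifiers.
example_envelope = {
    "prompt": "Add users with authentication",
    "model": "example-model-1",
    "agent": "example-agent 0.0.0",
    "system_prompt_sha256": hashlib.sha256(b"hypothetical system prompt").hexdigest(),
    "tool_versions": {"git": "0.0.0", "python": "0.0.0"},
}

if __name__ == "__main__":
    print(envelope_fingerprint(example_envelope))
```

A matching fingerprint only says the recorded inputs were the same. It does not make the model's output deterministic, which is why this still lands closer to an audit trail than to a reproducible build.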
I thought that was what commit titles and descriptions are for?
What changed would be the diff.
Edit: perhaps the prompt describes the "how".