Polygraph: A Meta-Harness for Maximum Agent Autonomy (nx.dev)

44 points by cheald 5 hours ago | 6 comments

projectvii_ 3 hours ago
How would this work in an enterprise setting? We have a bunch of repos that could benefit from this, but we're on an on-premise instance of GitHub Enterprise. Are there plans to enable this to work in those situations?
Likewise, what's the data retention policy on the public instances? Can I request that data be deleted if needed? And is there any privacy information?
- victorsavkin 3 hours ago
  > we're on an on-premise instance of GitHub Enterprise. Are there plans to enable this to work in those situations?
  Yes, 100%. We have other products that we deploy in single-tenant and on-prem environments. We also support GitHub Enterprise, GitLab, Azure DevOps, Bitbucket, etc. there.
  The setup is the same for Polygraph, so the on-prem distribution will be available.
  > Likewise, what's the data retention policy on the public instances? Can I request that data be deleted if needed?
  Currently you have to send a support request. If you're an admin of the org, your data will be removed. A less cumbersome option is coming.
jenniferli23 4 hours ago
How are you thinking about permissions/revocation if Polygraph’s “memory” becomes a shared layer across repos?
- victorsavkin 4 hours ago
  Great question.
  Polygraph knows what repos every dev (and therefore their agents) has access to. If a session touches repos you don't have access to, you'll only see the parts you're allowed to: PRs to a repo you can see, for instance. You won't see the logs or high-level descriptions, which can contain info you shouldn't see.
  If a dev loses access to a repo, they also lose access to the sessions associated with it.
  In other words, although Polygraph has one repo graph and one session graph under the hood, every dev has access to only a subset of each.
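
The per-developer filtering described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not Polygraph's actual data model or API: `Session`, `PullRequest`, `SessionView`, and `visible_view` are hypothetical names. It assumes PRs are shown only for repos the developer can see, and that logs are hidden whenever the session touches any repo the developer cannot see.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


# Hypothetical data model for illustration only; not Polygraph's real schema.
@dataclass(frozen=True)
class PullRequest:
    repo: str
    title: str


@dataclass
class Session:
    id: str
    repos: FrozenSet[str]  # every repo the session touched
    log: str               # may mention details from any of those repos
    pull_requests: List[PullRequest] = field(default_factory=list)


@dataclass
class SessionView:
    id: str
    log: Optional[str]     # None when the log is withheld
    pull_requests: List[PullRequest]


def visible_view(session: Session, allowed_repos: FrozenSet[str]) -> Optional[SessionView]:
    """Return the part of a session a developer may see, or None if nothing is visible."""
    if not (session.repos & allowed_repos):
        # The developer has no access to any repo in this session.
        return None
    # PRs are shown only for repos the developer can see.
    prs = [pr for pr in session.pull_requests if pr.repo in allowed_repos]
    # The log is shown only if every touched repo is accessible (assumption).
    full_access = session.repos <= allowed_repos
    return SessionView(session.id, session.log if full_access else None, prs)


session = Session(
    id="s1",
    repos=frozenset({"web", "billing"}),
    log="changed billing endpoint, then updated web client",
    pull_requests=[PullRequest("web", "Update client"), PullRequest("billing", "Add endpoint")],
)
partial = visible_view(session, frozenset({"web"}))
full = visible_view(session, frozenset({"web", "billing"}))
revoked = visible_view(session, frozenset())
```

Because the view is computed from the current access set on every read, revoking access to a repo changes what is visible without touching the stored session.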
nartc2428 3 hours ago
The website says it's free during early access, which is great. But let's say I'm invested in Polygraph and the billing period comes around: how much would it cost for a normal OSS maintainer?
- victorsavkin 3 hours ago
  Thank you for a great question!
  Sorry it's not clear on the website, but it is free and will always be free for OSS.
experienceway 2 hours ago
I love this idea.
kstenerud 4 hours ago
> Space. An agent is stuck in one repo. It can't see how a change fits the wider system, and it can only write to one repo at a time.
Huh? How can it not see multiple repos? They're just directories.
> Time. An agent has no episodic memory. Every session starts blank, so a human carries the memory context.
The memory comes from the research, design, specification, and planning documents.
> We no longer think about where the work happens or what repos are involved. We describe the work in a prompt and let Polygraph figure out what's relevant.
Err... that doesn't sound safe.
> Every decision is on record. So even though our team is distributed, I can ask my agent why a coworker chose one approach over another.
AFTER the fact...
- victorsavkin 4 hours ago
  Thank you for your comment.
  > Huh? How can it not see multiple repos? They're just directories.
  Relevant repos need to be discovered. They have to be set up correctly (some worktrees, most clones), dependencies installed, and the relationships between them made clear, etc. In a sense, once you've done all that, they do become directories. Turning them into directories, and doing it ergonomically, is the tricky part.
  Consider scale: Take the repos you own plus the OSS repos they depend on. It's many thousands. A real team has more. That's a lot to deal with.
  > The memory comes from the research, design, specification, and planning documents.
  This isn't episodic memory. You'll have high-level documents you can reference, and they're useful for overviews. But only a tiny fraction of decisions ever make it into them. Most decisions get made in the act of implementing something. And the "docs rot, code doesn't" rule applies here too.
  > Err... that doesn't sound safe.
  It just picks the repos (you have access to) and helps you plan the work. It has no effect on safety.
  > AFTER the fact...
  Yes :) But say I'm reviewing their PR. I can ask my agent why the PR ended up the way it did, and every decision they made along the way is in the session. It's "after the fact", but useful. It doesn't mean every conversation with a human being can be replaced by this :) but a lot of conversations can be.
  - kstenerud 4 hours ago
    > Relevant repos need to be discovered. They have to be set up correctly (some worktrees, most clones), dependencies installed, and the relationships between them made clear, etc.
    This is what Sourcegraph and GitHub Code Search and Zoekt do, isn't it?
    > You'll have high-level documents you can reference, and they're useful for overviews. But only a tiny fraction of decisions ever make it into them. Most decisions get made in the act of implementing something.
    Er... In the age of AI the decisions need to be made (and documented) extensively before it starts writing any code. Otherwise you get slop.
    > But say I'm reviewing their PR. I can ask my agent why the PR ended up the way it did, and every decision they made along the way is in the session.
    That doesn't make the decision set good. And if the only documentation produced came from the implementation phase, then it's going to be self-defending regardless of how good the design actually is (and your review agent, lacking the context, won't know the difference). Multiply that with the many parallel PRs in parallel repos you get with some features, and that's just asking for trouble.
  - victorsavkin 3 hours ago
    > That doesn't make the decision set good. And if the only documentation produced came from the implementation phase, then it's going to be self-defending regardless of how good the design actually is (and your review agent, lacking the context, won't know the difference). Multiply that with the many parallel PRs in parallel repos you get with some features, and that's just asking for trouble.
    Firstly, I think it's a misunderstanding of where the truth lies. The truth is in the code and in the process that produced it. The code isn't out of date. The written documentation always is (to some degree). It's a map vs. terrain situation. Maps are useful, but they aren't a replacement for the terrain.
    On top of that, when I review a coworker's code I also get their session, ready to explore. I ask questions about the decisions they made, and I can interact with their version of the code if something wasn't captured in the session. It's like getting access to their whole environment: the code, the agents' state, and everything else.
    You just can't do that with a markdown doc.
    This doesn't mean markdown docs aren't useful. They are useful. Maps are useful. But it's simply not the same thing.
    Secondly, let's imagine it's possible to document every nuance of every decision really well. That assumes everyone will do it, so it requires effort from every engineer on the team. It's different here because you get episodic memory without any effort :)
jeffbcross 4 hours ago
lukekarrys, how long would it take you to build this?