Show HN: Phantom – Open-source AI agent on its own VM that rewrites its config (github.com)

17 points by mcheemaa 7 hours ago | 7 comments

jaboostin 5 hours ago
My friends and I have been running a similar homegrown system on a VM at home: Claude Code in a GNU screen managed by systemd, Cloudflare tunnels, Graphiti memory system, a Discord channel plugged into Claude to drive it, and Temporal for all sorts of workflows and crons that it builds on its own.
It arrived at the same incredibly fun behavior as you talk about in the readme, where the agent just builds all sorts of junk for you autonomously. It has built dozens of web apps, static pages, mini games, etc. all tied back into a central domain that I gave it. I truly have no idea what the system or code looks like but it’s been so much fun just letting it build.
The “For People Who Don't Write Code” section is so true as well. We have someone in Discord who has never written code, but they can ask the agent to build virtually anything; it goes off and churns, then pops back with a link to it running live. It’s honestly been so much fun with friends, highly recommend trying it out.
- hmokiguess 3 hours ago
  > I truly have no idea what the system or code looks like
  Does it not concern you that it might have installed a compromised or vulnerable package, or that it has something exposed and is leaking everything to an attacker?
  I understand that your personal account is removed from it, but it still has a direct link to you, and an attacker could be stealthily building toward that, waiting to strike when the time is right. Maybe they gain SSH access to your VM or whatever.
  - jaboostin 2 hours ago
    eh I can nuke the VM and start fresh. Everything is in git anyway. As for sensitive data, it has its own accounts and no credit cards etc so the blast radius feels limited. I would say this is a fundamental impediment to being used in serious use-cases but for some friends messing around I’m not worried.
    It could have installed, say, that vulnerable version of litellm, and the entire VM would be compromised. But it’s on an isolated VLAN anyway, so the worst it can really do is use bandwidth and maybe hurt my IP reputation? I could move it to a cloud VM but the risks seem minimal at the moment. I’m definitely not advocating for no defense in depth, but npm install in an isolated VM feels safer than npm install on my work laptop these days :-)
    hmokiguess an hour ago
    > I would say this is a fundamental impediment to being used in serious use-cases
    Fair point, so it's really a fancy tamagotchi you got there I guess haha
- mcheemaa 3 hours ago
  This is awesome to hear. The "I truly have no idea what the system or code looks like but it's been so much fun just letting it build" resonates hard. That's exactly the experience we had too.
  The "For People Who Don't Write Code" angle has been the biggest surprise for us. We had a non-technical user ask for a Chrome extension and the agent built it, packaged it as a zip, and sent the download link. No terminal, no dev environment needed.
  If you ever want to formalize your setup, we built Specter (https://github.com/ghostwright/specter) to provision VMs with DNS, TLS, and systemd in under 90 seconds. Makes spinning up new instances trivial. Would love to hear more about your Graphiti memory setup, that's a different approach than our Qdrant-based system.
  - jaboostin 2 hours ago
    Graphiti is interesting because it ingests episodes (Discord chat messages), extracts facts and relationships, and then lets the agent query them back with the relationships intact. So rather than a flat list of vectors related to a search term, the agent can essentially walk from one fact/concept to another. While plain vector search says something exists, the edges in the graph denote how/why it exists and provide extra context.
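    A toy sketch of that difference, assuming a tiny in-memory edge list. This is not Graphiti's API or schema; the names `EDGES`, `flat_search`, and `walk` and all the facts are made up for illustration. A flat lookup only returns facts that mention the term, while a bounded walk follows edges outward and picks up the connected context.

```python
from collections import deque

# Hypothetical facts stored as (subject, relation, object) edges.
# Illustrative only; this is not Graphiti's real data model.
EDGES = [
    ("alice", "plays", "factorio"),
    ("factorio", "is_a", "automation game"),
    ("bob", "plays", "factorio"),
    ("bob", "cooks", "ramen"),
]


def flat_search(term):
    """Return facts that mention the term, with no notion of how they connect."""
    return [edge for edge in EDGES if term in (edge[0], edge[2])]


def walk(start, max_hops=2):
    """Breadth-first walk from a node, following edges in either direction."""
    seen = {start}
    queue = deque([(start, 0)])
    facts = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for subj, rel, obj in EDGES:
            if node not in (subj, obj):
                continue
            if (subj, rel, obj) not in facts:
                facts.append((subj, rel, obj))
            neighbor = obj if node == subj else subj
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts
```

    Here `flat_search("alice")` only finds the one fact naming alice, while `walk("alice")` also reaches what factorio is and who else plays it.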
    It’s a bit frightening in practice because it starts building up “knowledge” of what everyone in our group is interested in (games, hobbies, food) and their personalities, politics, etc. Sonnet 4.6 in particular tends to query the graph and make jokes, matching the vibe on Discord.
    On a more serious use-case: it also stores system topology in the graph so, while it does document the system in various READMEs and CLAUDE.md files, the graph provides a fast, at-a-glance reference for how the systems interact. I have no evidence, but I imagine this could be useful, and denser / more token-efficient than massive documentation, even for products, features, etc.
    mcheemaa an hour ago
    Oh I actually just looked at Graphiti and it looks really cool. I will try to see if Phantom can utilize this. Great work you guys.
hmokiguess 6 hours ago
Some of the other aspects of the project are quite interesting. I particularly liked https://github.com/ghostwright/shadow and think it has potential, but I am skeptical right now.
What is the actual cost of this? Can you share your real burn rate from using it? I sort of wanna try, but I don't want my API key to go bananas because the agent decided it needed XYZ for "it" and didn't check with me.
I get the appeal of a separate "identity" with email and everything for the agent, but if it has little to no supervision, what's the extent of the liability when it goes rogue? Say it DDoSes someone, exploits something, or does damage: is it like your child/minor, with you as the parent/guardian?
scandox 6 hours ago
So if I understand this it is an OpenClaw type system but based on the Claude Code Agent SDK? And they suggest installing it on a VM? Or is there more to it?
- mcheemaa 6 hours ago
  Different in a few fundamental ways:
  OpenClaw runs on your machine or an ephemeral sandbox. Each session starts fresh. Phantom gets its own dedicated VM that persists. The ClickHouse instance it built three weeks ago is still running and queryable.
  OpenClaw spends a lot of tokens on screen understanding and vision loops. About 60% of its skills are basic macOS-level computer use (clicking, typing, reading the screen). Phantom skips that entirely and uses the Claude Agent SDK directly, so it gets full shell, file system, git, web search, and MCP tools natively without the token overhead of parsing screenshots.
  The biggest difference is probably dynamic MCP. Phantom registers its own MCP tools at runtime, and they persist across restarts. It built a ClickHouse REST API, registered it as a tool, and now any Claude Code session or other agent that connects to it can query that data. It builds its own capabilities and exposes them as an API.
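  As a rough illustration of the "register at runtime, survive restarts" idea, here is a minimal sketch that persists tool definitions to a JSON file. It is not Phantom's code and does not use the MCP SDK; `ToolRegistry`, the file layout, and the endpoint URL in the usage note are all hypothetical.

```python
import json
from pathlib import Path


class ToolRegistry:
    """Hypothetical registry mapping tool names to metadata, saved as JSON."""

    def __init__(self, path):
        self.path = Path(path)
        # Reload whatever was registered before the last restart, if anything.
        if self.path.exists():
            self.tools = json.loads(self.path.read_text())
        else:
            self.tools = {}

    def register(self, name, description, endpoint):
        """Add or update a tool and write the registry to disk immediately."""
        self.tools[name] = {"description": description, "endpoint": endpoint}
        self.path.write_text(json.dumps(self.tools, indent=2))

    def list_tools(self):
        """Return the registered tool names in sorted order."""
        return sorted(self.tools)
```

  For example, after `ToolRegistry("tools.json").register("clickhouse_query", "Run a read-only query", "http://localhost:8080/query")`, a fresh `ToolRegistry("tools.json")` created after a restart would still list `clickhouse_query`.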
  It also has persistent vector memory across sessions (Qdrant, local), a self-evolution engine where a different model validates every config change, and we built a companion tool called Specter (https://github.com/ghostwright/specter) that provisions VMs with DNS, TLS, and systemd in under 90 seconds, so deploying a new Phantom is genuinely three commands.
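  The "a different model validates every config change" part can be pictured as a gate in front of the config write. The sketch below is an assumption about the general shape, not Phantom's implementation: `apply_if_approved` is a hypothetical helper, and `no_secret_changes` is a simple rule-based stand-in for whatever the second model actually checks.

```python
from typing import Callable


def apply_if_approved(current: dict, proposed: dict,
                      validator: Callable[[dict, dict], bool]) -> dict:
    """Return the updated config only if the independent validator approves."""
    # Only the keys whose values actually differ are sent for review.
    changes = {k: v for k, v in proposed.items() if current.get(k) != v}
    if not changes:
        return current
    if validator(current, changes):
        return {**current, **changes}
    return current  # rejected: keep the existing config untouched


def no_secret_changes(current: dict, changes: dict) -> bool:
    """Stand-in validator: reject changes to keys that look like credentials."""
    return not any("key" in k.lower() or "token" in k.lower() for k in changes)
```

  In a real system the validator would call out to the second model; keeping it as a plain callable makes the gate easy to test with a deterministic rule first.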
  Both are good projects, different approaches. OpenClaw does computer use well. Phantom is a persistent co-worker that lives on its own machine and compounds over time.
  - scandox 5 hours ago
    > OpenClaw runs on your machine or an ephemeral sandbox. Each session starts fresh. Phantom gets its own dedicated VM that persists
    Yeah, as any claw-type system will if you install it on a VM. I think the self-tooling thing is interesting, and you'd gain by emphasizing that over the VM thing, at least with a technical audience.
hmokiguess 6 hours ago
> Nobody asked it to build any of this. It identified analytics as useful and built the entire stack.
When I read stuff like this I am not sure how to feel.
plagiarist 6 hours ago
Not sure I'd celebrate finding a library with 3 GitHub stars. Shouldn't the story there be vetting for quality or security?
- hmokiguess 6 hours ago
  I think the intent was to say it has enough "intelligence" to find a needle in a haystack and that the "vetting" is assumed.
  That said, my initial reaction was the same, and if the human wasn't involved and didn't do their due diligence, I'm right there with you.