It arrived at the same incredibly fun behavior you describe in the README, where the agent just builds all sorts of junk for you autonomously. It has built dozens of web apps, static pages, mini games, etc., all tied back into a central domain that I gave it. I truly have no idea what the system or code looks like, but it’s been so much fun just letting it build.
The “For People Who Don't Write Code” section is so true as well. We have someone in Discord who has never written code, but they can ask the agent to build virtually anything; it goes off and churns, then pops back with a link to it running live. It’s honestly been so much fun with friends, highly recommend trying it out.
Doesn't it concern you that it might install a compromised package or a vulnerable dependency, or leave something exposed that leaks everything to an attacker?
I understand that your personal account is removed from it, but it still has a direct link to you. An attacker could be stealthily building toward that, waiting to hit when the time is right; maybe they gain SSH access to your VM or whatever.
It could have installed, say, that vulnerable version of litellm, and then the entire VM would be compromised. But it’s on an isolated VLAN anyway, so the worst it can really do is use bandwidth and maybe hurt my IP reputation? I could move it to a cloud VM, but the risks seem minimal at the moment. I’m definitely not arguing against defense in depth, but npm install in an isolated VM feels safer than npm install on my work laptop these days :-)
Fair point, so it's really a fancy Tamagotchi you've got there, I guess haha
The "For People Who Don't Write Code" angle has been the biggest surprise for us. We had a non-technical user ask for a Chrome extension and the agent built it, packaged it as a zip, and sent the download link. No terminal, no dev environment needed.
If you ever want to formalize your setup, we built Specter (https://github.com/ghostwright/specter) to provision VMs with DNS, TLS, and systemd in under 90 seconds. Makes spinning up new instances trivial. Would love to hear more about your Graphiti memory setup; that's a different approach from our Qdrant-based system.
It’s a bit frightening in practice because it starts building up “knowledge” of what everyone in our group is interested in (games, hobbies, food) and their personalities, politics, etc. Sonnet 4.6 in particular tends to query the graph and make jokes, matching the vibe on Discord.
On a more serious use case: it also stores system topology in the graph. It does document the system in various READMEs and CLAUDE.md files, but the graph provides a fast, at-a-glance reference for how the systems interact. I have no evidence, but I imagine this could be useful, and denser and more token-efficient than massive documentation, even for products, features, etc.
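To make the "topology in a graph" idea concrete, here is a rough sketch of what I mean. It is not how Graphiti stores things; it is a plain in-memory stand-in, and the node and relation names (web-app, reads_from, and so on) are made up for illustration.

```python
from collections import defaultdict


class TopologyGraph:
    """Toy stand-in for a graph memory: source -> [(relation, target)]."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add(self, source, relation, target):
        # Record one directed, labeled edge.
        self.edges[source].append((relation, target))

    def describe(self, node):
        # One short line per outgoing edge: the at-a-glance view of a node.
        return [f"{node} {rel} {tgt}" for rel, tgt in self.edges.get(node, [])]

    def dependents_of(self, target):
        # Every node with at least one edge pointing at `target`.
        found = {
            src
            for src, outgoing in self.edges.items()
            for _, tgt in outgoing
            if tgt == target
        }
        return sorted(found)


# Hypothetical systems, purely illustrative.
graph = TopologyGraph()
graph.add("web-app", "reads_from", "clickhouse")
graph.add("web-app", "served_by", "nginx")
graph.add("discord-bot", "reads_from", "clickhouse")

summary = graph.describe("web-app")
dependents = graph.dependents_of("clickhouse")
```

A question like "what breaks if clickhouse goes down?" becomes one short lookup instead of a pass over several READMEs, which is the token-efficiency hunch above. Still just a hunch, not something I have measured.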
What is the actual cost of this? Can you share your real burn rate from using it? I sort of wanna try, but I don't want my API key to go bananas because the agent decided it needed XYZ for "it" and didn't check with me.
I get the appeal of a separate "identity" with email and everything for the agent, but if it has little to no supervision, what's the extent of the liability when it goes rogue? Say it DDoSes someone, exploits something, or does damage: is this like your child/minor, and you're the parent/guardian?
OpenClaw runs on your machine or an ephemeral sandbox. Each session starts fresh. Phantom gets its own dedicated VM that persists. The ClickHouse instance it built three weeks ago is still running and queryable.
OpenClaw spends a lot of tokens on screen understanding and vision loops. About 60% of its skills are basic macOS-level computer use (clicking, typing, reading the screen). Phantom skips that entirely and uses the Claude Agent SDK directly, so it gets full shell, file system, git, web search, and MCP tools natively without the token overhead of parsing screenshots.
The biggest difference is probably dynamic MCP. Phantom registers its own MCP tools at runtime, and they persist across restarts. It built a ClickHouse REST API, registered it as a tool, and now any Claude Code session or other agent that connects to it can query that data. It builds its own capabilities and exposes them as an API.
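As a rough illustration of the "persist across restarts" part only: the sketch below is not Phantom's code and does not use the MCP SDK. It assumes a hypothetical registry that writes tool definitions to a JSON file and reloads them on startup; the class name, tool name, and endpoint are all placeholders.

```python
import json
import tempfile
from pathlib import Path


class ToolRegistry:
    """Hypothetical registry: tool definitions live in a JSON file so a
    restarted process can load them again."""

    def __init__(self, path):
        self.path = Path(path)
        self.tools = {}
        # On startup, reload whatever an earlier process registered.
        if self.path.exists():
            self.tools = json.loads(self.path.read_text())

    def register(self, name, description, endpoint):
        # Add or overwrite a definition, then write the whole set to disk.
        self.tools[name] = {"description": description, "endpoint": endpoint}
        self.path.write_text(json.dumps(self.tools, indent=2))

    def list_tools(self):
        return sorted(self.tools)


# Placeholder storage location and tool; nothing here is a real endpoint.
store = Path(tempfile.mkdtemp()) / "tools.json"
first = ToolRegistry(store)
first.register(
    "clickhouse_query",
    "Run a read-only query",
    "http://localhost:8123/query",
)

# Simulate a restart: a fresh object pointed at the same file.
restarted = ToolRegistry(store)
restarted_tools = restarted.list_tools()
```

The point is just that registration and persistence are the same write, so anything registered at runtime is what the next process sees on boot.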
It also has persistent vector memory across sessions (Qdrant, local), a self-evolution engine where a different model validates every config change, and we built a companion tool called Specter (https://github.com/ghostwright/specter) that provisions VMs with DNS, TLS, and systemd in under 90 seconds, so deploying a new Phantom is genuinely three commands.
Both are good projects, different approaches. OpenClaw does computer use well. Phantom is a persistent co-worker that lives on its own machine and compounds over time.
Yeah, like any claw-type system will be if you install it on a VM. I think the self-tooling thing is interesting, but you'd gain by emphasizing that over the VM thing, at least with a technical audience.
When I read stuff like this I am not sure how to feel.
That said, my initial reaction was the same, and if the human weren't involved and hadn't done their due diligence, I'd be right there with you.