3 points by cha0tikdino 6 hours ago | 1 comment
  • cha0tikdino 6 hours ago
    For the last 6 months I've been running a personal Claude agent as a systemd service on a Beelink mini PC. It gives me a morning briefing in Discord, monitors my email, tracks my finances, and actually remembers things across conversations — not just the last session. Here's how the memory system works and why the naive approach falls apart.

    The memory problem

    Every agent tutorial I found did one of two things: forgot everything on restart, or shoved the full conversation history into every prompt until it hit the context limit and broke. Neither is usable long-term.

    What actually works is three tiers:

    - Core Memory: permanent facts about you, always in the system prompt, kept small (~500 chars). Who you are, current tasks, standing preferences.
    - Recall: every conversation logged to SQLite, FTS5 searchable on demand. The agent can look back, but it doesn't have to carry it all.
    - Archival: long-term knowledge store, same FTS5 search. Stuff like "user doesn't like meetings before 10am" or "the Plaid API returns dates in UTC not local time."
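
    To make the Recall and Archival tiers concrete, here is a rough sketch of what the FTS5 side could look like. The table names, column names, and the toFtsQuery helper are illustrative assumptions, not what the repo ships. The SQL is ordinary FTS5 syntax and would run through whichever SQLite driver you use; only the query-escaping helper is executable here.

```typescript
// Hypothetical schema: table and column names are made up for illustration.
// created_at is UNINDEXED so it is stored but not tokenized for search.
const SCHEMA_SQL = `
CREATE VIRTUAL TABLE IF NOT EXISTS recall USING fts5(
  role,
  content,
  created_at UNINDEXED
);
CREATE VIRTUAL TABLE IF NOT EXISTS archival USING fts5(
  content,
  created_at UNINDEXED
);`;

// Best matches first: FTS5 exposes a hidden "rank" column for ordering.
const SEARCH_RECALL_SQL = `
SELECT role, content, created_at
FROM recall
WHERE recall MATCH ?
ORDER BY rank
LIMIT ?;`;

// Turn free text into a safe FTS5 query. Each whitespace-separated token
// becomes a quoted phrase, so characters like ":" or "-" and words like
// AND/OR/NOT are matched literally instead of being parsed as operators.
// Embedded double quotes are escaped by doubling them.
function toFtsQuery(text: string): string {
  return text
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) => `"${token.replace(/"/g, '""')}"`)
    .join(" "); // space-separated phrases are an implicit AND
}
```

    An empty query string is not a valid MATCH expression, so the caller should skip the search when toFtsQuery returns "".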

    The key insight: Core Memory is always in context. Everything else is searched when needed. After 6 months of daily use the system prompt is still ~2KB.
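
    One way to keep Core Memory from creeping is to enforce the budget at write time. This is a sketch under my own assumptions: the 500-character budget comes from the number above, but the CoreMemory shape, the function names, and the "reject instead of truncate" policy are hypothetical.

```typescript
// Assumed budget, taken from the ~500 chars figure in the post.
const CORE_MEMORY_BUDGET = 500;

interface CoreMemory {
  facts: string[];
}

// Render the facts as a bullet list, the form that goes into the prompt.
function renderCore(core: CoreMemory): string {
  return core.facts.map((fact) => `- ${fact}`).join("\n");
}

// Add a fact only if the rendered block stays within the budget.
// Returns false when the fact is rejected, so the caller can store it
// in the searchable Archival tier instead.
function addCoreFact(
  core: CoreMemory,
  fact: string,
  budget: number = CORE_MEMORY_BUDGET,
): boolean {
  const trimmed = fact.trim();
  if (trimmed.length === 0) return false;
  const candidate: CoreMemory = { facts: [...core.facts, trimmed] };
  if (renderCore(candidate).length > budget) return false;
  core.facts.push(trimmed);
  return true;
}

// Core Memory is appended to the base prompt on every turn; Recall and
// Archival never appear here and are only reached through search tools.
function buildSystemPrompt(basePrompt: string, core: CoreMemory): string {
  return `${basePrompt}\n\nCore Memory:\n${renderCore(core)}`;
}
```

    Rejecting oversized writes (rather than silently truncating) forces a decision about what is actually worth keeping in context.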

    The MCP subprocess thing

    The Claude Agent SDK lets you expose tools as MCP servers. The natural approach is an in-process bidirectional channel — but there's a race condition on it (issue #148 in the SDK repo) that causes intermittent crashes under load.

    The fix: run the memory MCP server as a standalone stdio JSON-RPC subprocess. The parent spawns it at startup and communicates over stdin/stdout. It's a few extra lines but eliminates the crash entirely. The skeleton in the repo does it this way.
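
    For anyone who hasn't done stdio JSON-RPC before, the framing is the only fiddly part: one JSON message per line, and stdout chunks can split a message anywhere. Below is a minimal sketch of that framing, not the repo's code. The type and class names are mine, and the spawn wiring in the trailing comment uses a hypothetical ./memory-server.js path and a hypothetical handle callback. A real MCP client also has to do the initialize handshake before calling tools.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// One message per line. JSON.stringify never emits raw newlines,
// so a trailing "\n" is a safe delimiter.
function encodeMessage(message: JsonRpcRequest): string {
  return JSON.stringify(message) + "\n";
}

// Reassembles newline-delimited JSON from arbitrary stdout chunks.
class LineDecoder {
  private buffer = "";

  // Returns every complete message contained so far; a trailing
  // partial line stays buffered until the rest of it arrives.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim().length > 0)
      .map((line) => JSON.parse(line) as unknown);
  }
}

// Wiring sketch (not executed here; path and handle() are placeholders):
//   import { spawn } from "node:child_process";
//   const child = spawn("node", ["./memory-server.js"], {
//     stdio: ["pipe", "pipe", "inherit"],
//   });
//   const decoder = new LineDecoder();
//   child.stdout.setEncoding("utf8");
//   child.stdout.on("data", (chunk: string) =>
//     decoder.push(chunk).forEach(handle));
//   child.stdin.write(
//     encodeMessage({ jsonrpc: "2.0", id: 1, method: "tools/list" }));
```

    Because the server lives in its own process, a crash or hang there can be handled by restarting the child instead of taking down the agent loop.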

    Loop detection

    Tool call loops are a real problem. Agent tries a bash command, it fails, agent tries the exact same command, it fails again, repeat until you've burned $3 and gotten nothing. The fix is trivial once you know to do it: track the last N tool call names in an array. If the last 3 are identical, call q.interrupt() and bail with an explanation.
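
    The check itself is only a few lines. This is a sketch with names I made up (LoopDetector, signatureOf), not the repo's code. The detector compares whatever string you give it: passing just the tool name matches the description above, while signatureOf also folds in the arguments so that three different bash commands in a row don't trip it. That second option is my assumption about what you'd want.

```typescript
// Flags a loop when the last `threshold` recorded calls are identical.
class LoopDetector {
  private recent: string[] = [];
  private readonly threshold: number;

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Record one tool call; returns true when a loop is detected.
  record(signature: string): boolean {
    this.recent.push(signature);
    // Keep only the most recent `threshold` entries.
    if (this.recent.length > this.threshold) this.recent.shift();
    return (
      this.recent.length === this.threshold &&
      this.recent.every((entry) => entry === signature)
    );
  }
}

// Optional stricter signature: tool name plus serialized arguments.
function signatureOf(toolName: string, input: unknown): string {
  return `${toolName}:${JSON.stringify(input)}`;
}

// In the agent loop, roughly (q is the running query from the SDK):
//   if (detector.record(signatureOf(name, input))) {
//     await q.interrupt();
//     // then tell the user which call kept repeating
//   }
```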

    What's in the repo

    The GitHub repo is a working skeleton — the core loop, SQLite memory system, and a terminal REPL. npm install && npm start and you're talking to an agent that already has persistent memory. It type-checks clean on Node 22.

    What it doesn't have: Discord integration, cron jobs, the systemd service setup, semantic memory search, or the pitfalls I hit over 6 months. I wrote all of that up in a guide — link in the README if you want the full thing, but the skeleton runs standalone.

    Cost

    Claude Pro subscription is $20/month flat with no per-token charges. Server is a Beelink mini PC pulling ~15W, so call it $2–3/month in electricity. Total: ~$22–23/month for something that never sleeps, has real tool access, and actually knows who you are.
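
    If you want to plug in your own numbers, the electricity math is simple. The $0.20/kWh rate in the usage comment is an assumed figure, not a measured one; swap in your local rate. At 15W running around the clock, a 30-day month comes to about 10.8 kWh.

```typescript
// Energy used in a month by a device drawing a constant load.
function monthlyKwh(
  watts: number,
  hoursPerDay: number = 24,
  days: number = 30,
): number {
  return (watts * hoursPerDay * days) / 1000;
}

// Subscription plus electricity, in dollars per month.
function monthlyCostUsd(
  subscriptionUsd: number,
  watts: number,
  usdPerKwh: number,
): number {
  return subscriptionUsd + monthlyKwh(watts) * usdPerKwh;
}

// Example with an ASSUMED electricity rate of $0.20/kWh:
//   monthlyCostUsd(20, 15, 0.2)  // roughly 22.16
```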

    Happy to answer questions about the architecture.