Some background: I run a multi-agent AI system (orchestrator + specialized agents) across multiple machines. Two things kept biting me:
1. *Context dilution:* In long sessions, earlier context gets compressed or dropped. The agent "forgets" decisions made hours ago — not because the session ended, but because the context window silently pushed them out.
2. *Vendor/machine lock-in:* I switch between Claude Code, Gemini CLI, and OpenCode depending on the task. And I work on two PCs. CLAUDE.md only works in Claude Code, on one machine. There was no way to carry knowledge across tools or devices.
hmem solves both: it's a single .hmem file (SQLite) that any MCP-compatible tool can read/write. Same memory, any tool, any machine.
The lazy loading is key — the agent never reads the database directly. It makes tool calls and gets back only what it asked for. A typical session start costs ~20 tokens for the L1 overview. Drilling into one specific topic costs ~80 tokens. Compare that to a MEMORY.md that injects 3000-8000 tokens wholesale every time.
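To make that cost model concrete, here is a minimal sketch of the lazy-loading flow. The tool names, data shapes, and entries are all invented for illustration; they are not hmem's actual API.

```javascript
// Hypothetical in-memory stand-in for the .hmem store.
// Entry keys mimic the dot-path addressing from the post.
const store = new Map([
  ["L0003", { title: "Lesson: avoid N+1 queries", children: ["L0003.2"] }],
  ["L0003.2", { title: "Batch lookups with IN (...)", children: ["L0003.2.1"] }],
  ["L0003.2.1", { title: "Cap batch size at 500", children: [] }],
]);

// L1 overview: titles only, no bodies. This is the cheap call the agent
// makes at session start (~20 tokens in the post's accounting).
function overview() {
  return [...store.entries()]
    .filter(([key]) => !key.includes("."))   // top-level entries only
    .map(([key, entry]) => `${key}: ${entry.title}`);
}

// Drill-down: one dot-path in, one entry plus its immediate children out.
// The agent pays only for the branch it asked about.
function read(path) {
  const entry = store.get(path);
  if (!entry) return null;
  return { path, title: entry.title, children: entry.children };
}
```

The point of the shape: nothing is pushed into the context window by default; the agent pulls one level at a time, so cost scales with how deep it drills, not with how much memory exists.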
Technical details:

- SQLite backend (better-sqlite3), one .hmem file per agent
- 5-level hierarchy with dot-path addressing (e.g., L0003.2.1)
- Entries are auto-timestamped and support time-range queries
- Full-text search across all levels
- Configurable category prefixes (defaults: P=Project, L=Lesson, E=Error, D=Decision, M=Milestone, S=Skill, F=Favorite)
- Favorites are always loaded at depth 2 (pinned context)
- Integrity checks — auto-backup on corruption detection
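For readers curious about the addressing scheme, here is a small parser sketch. The exact format is inferred from the single example in the post (one category prefix letter, a zero-padded top-level id, then child indices, five levels deep at most), so treat the regex as an assumption rather than hmem's real grammar.

```javascript
// Parse a dot-path address like "L0003.2.1" into its parts.
// Format is inferred from the post's example, not from hmem's source:
//   [A-Z]      one category prefix letter (L = Lesson, etc.)
//   \d{4}      zero-padded top-level id
//   (.\d+)*    up to four child indices (5-level hierarchy)
function parseAddress(addr) {
  const m = /^([A-Z])(\d{4})((?:\.\d+){0,4})$/.exec(addr);
  if (!m) throw new Error(`invalid address: ${addr}`);
  const [, category, id, rest] = m;
  const path = rest ? rest.slice(1).split(".").map(Number) : [];
  return { category, id: Number(id), path };
}
```

A dot-path like this maps naturally onto the lazy-loading model: each extra segment is one more level the agent chose to drill into.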
What's next:

- Cloud sync between machines (encrypted, probably git-based)
- Memory forks — think GitHub repos, but for agent memories (fork a curated react-patterns.hmem as a starting point)
- Better onboarding docs and a demo video
This is a genuine beta — I use it daily but it hasn't been battle-tested by others yet. If you try it, I'd love to hear what breaks or what's confusing about the setup.