The standard fixes each solve part of the problem: MEMORY.md overflows after a week, RAG can't search for things you don't know you know, and dumping everything into a 1M context window destroys attention quality while burning tokens.
Hipocampus is a 3-tier memory system (hot/warm/cold) with a 5-level compaction tree that compresses your entire conversation history into a ~100-line ROOT.md index. The agent reads the index (~3K tokens per call) and decides: search memory, search externally, or answer directly. No blind exploration.
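That three-way routing step can be sketched as follows. This is a toy heuristic with illustrative names only (`Route`, `route`, keyword overlap); in Hipocampus the judgment is made by the agent reading the index, not by code like this:

```python
from enum import Enum

class Route(Enum):
    ANSWER_DIRECTLY = "answer"      # index already covers the query
    SEARCH_MEMORY = "memory"        # index hints the answer is in history
    SEARCH_EXTERNAL = "external"    # nothing on record, look outside

def route(query: str, index_lines: list[str]) -> Route:
    """Toy tri-way router over a compact index.

    Hypothetical heuristic: full keyword coverage by one index line means
    answer directly, partial coverage means search memory, none means
    search externally.
    """
    words = set(query.lower().split())
    best = max((len(words & set(line.lower().split()))
                for line in index_lines), default=0)
    if best == len(words):
        return Route.ANSWER_DIRECTLY
    if best > 0:
        return Route.SEARCH_MEMORY
    return Route.SEARCH_EXTERNAL
```

The point is the shape of the decision, not the matching logic: the agent commits to one of three actions after reading ~3K tokens, instead of paging through raw history.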
The compaction tree (raw → daily → weekly → monthly → root) self-compresses over months. Raw logs are permanent — nothing is ever lost. Optional hybrid search via qmd (BM25 + vector) for when you need specifics.
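A minimal sketch of the rollup idea, assuming a stand-in `summarize` function (the real compression step is an LLM call, and the budgets and grouping here are illustrative, not Hipocampus's actual parameters):

```python
def summarize(chunks: list[str], budget: int) -> str:
    # Placeholder for the real LLM compression step: here we just keep
    # the leading segment of each chunk and truncate to the budget.
    return " | ".join(c.split(" | ")[0] for c in chunks)[:budget]

def compact(raw_logs: list[str]) -> str:
    """Roll raw logs up the tree: raw -> daily -> weekly -> monthly -> root."""
    daily = [summarize([log], budget=500) for log in raw_logs]
    weekly = [summarize(daily[i:i + 7], budget=500)
              for i in range(0, len(daily), 7)]
    monthly = [summarize(weekly[i:i + 4], budget=500)
               for i in range(0, len(weekly), 4)]
    # The root level plays the role of the ~100-line ROOT.md index.
    return summarize(monthly, budget=4000)
```

Each level compresses the one below it, so the index stays roughly constant-size no matter how much history accumulates, while the raw logs underneath remain untouched.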
One command: `npx hipocampus init`
- Zero runtime dependencies, zero infrastructure, just markdown files
- Works with Claude Code and OpenClaw
- All memory writes via subagents (main session stays clean)
- MIT licensed
Built this because I was running 80+ AI bots and got tired of them re-investigating things they already knew.