As context grows, models are forced to continuously re-derive what should be stable state (entities, facts, local invariants) from attention alone. Over long horizons, this implicit reconstruction inevitably drifts. From that angle, Engram’s value isn’t just O(1) lookup; it’s that state is externalized, managed outside the core reasoning path, and injected only when relevant.
That separation feels important. Instead of hoping attention can both reason and maintain coherence indefinitely, the model gets a way to re-anchor itself to stable references. In that sense, this looks less like “better recall” and more like a step toward explicit state management for LLMs.
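To make that concrete, here’s a toy sketch of what I mean by an externally managed, selectively injected state store. This is plain Python with names I made up; it is not Engram’s actual interface, just the shape of the idea:

```python
from dataclasses import dataclass, field


@dataclass
class ExternalState:
    """Stable facts keyed by entity, maintained outside the model's reasoning loop."""
    facts: dict[str, str] = field(default_factory=dict)

    def write(self, entity: str, fact: str) -> None:
        # State updates are explicit writes, not something the model has to
        # re-derive from attention over the whole transcript each turn.
        self.facts[entity] = fact

    def inject(self, query: str) -> str:
        # Only facts whose key appears in the query are surfaced, so the
        # prompt carries a few stable anchors instead of accumulated history.
        relevant = {e: f for e, f in self.facts.items() if e.lower() in query.lower()}
        if not relevant:
            return ""
        lines = [f"- {entity}: {fact}" for entity, fact in relevant.items()]
        return "Known state:\n" + "\n".join(lines)


if __name__ == "__main__":
    state = ExternalState()
    state.write("order #4417", "shipped on 2024-03-02, awaiting customs")
    state.write("user Dana", "prefers terse replies, timezone UTC+2")

    query = "Where is order #4417 right now?"
    # Only the matching fact gets prepended; everything else stays out of
    # the context window until it's actually needed.
    print(state.inject(query))
    print(query)
```

The point isn’t the lookup mechanics (a real system would do smarter retrieval); it’s that coherence lives in the store, and the model only has to read the anchors it’s handed.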
I suspect architectures that treat memory as an external, selectively injected primitive will matter more for long-running agents than further scaling of context windows.