  • lawouach 6 hours ago
    I spent the last few months analyzing marathon coding sessions (300 to 1,200 turns) with agents like Claude Code and OpenCode.

    I wanted to solve a simple performance problem: replaying these sessions for memory indexing was too slow and expensive because I was extracting information with an LLM on every turn.

    I took away two learnings:

    1. You can’t just skip "noise" turns for search. Raw-text embeddings crushed "smart" LLM-extracted summaries (88.5% vs 73.9% recall). The signal is distributed across every turn (a rough sketch of that baseline is below this list).

    2. Frustration is a lagging indicator. The visible failure wasn't at turn 699; the drift started around turn 400 (in one of my datasets). Basically, by the time you snap, the agent has been drifting for hundreds of turns. Makes sense, I suppose: you're happy until you're not.
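
    For point 1, the baseline that won is roughly this shape (a minimal sketch; the embedding model and helper names are placeholders, not my actual pipeline): embed every raw turn with no LLM summarization pass, then search the vectors directly.

      import numpy as np
      from sentence_transformers import SentenceTransformer

      # Placeholder model; any text-embedding backend works the same way here.
      model = SentenceTransformer("all-MiniLM-L6-v2")

      def index_turns(turns: list[str]) -> np.ndarray:
          # Embed every raw turn, "noise" included: no summarization, no filtering.
          return model.encode(turns, normalize_embeddings=True)

      def search(query: str, turn_vecs: np.ndarray, turns: list[str], k: int = 5):
          q = model.encode([query], normalize_embeddings=True)[0]
          scores = turn_vecs @ q  # cosine similarity, since vectors are normalized
          top = np.argsort(-scores)[:k]
          return [(turns[i], float(scores[i])) for i in top]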

    I ended up building a "trajectory regulator" that measures structural instability (logic churn, symbol repetition) in real time so it can intervene before the "death spiral" kicks in. Essentially, I wanted to reduce the number of times I'd grow frustrated with the agent.
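
    The regulator is more involved than this, but the symbol-repetition part of the signal is roughly this shape (illustrative sketch; the regex, window size, and threshold are placeholders, not my actual values):

      import re
      from collections import Counter, deque

      SYMBOL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{2,}")

      class InstabilityMonitor:
          """Rough proxy for structural instability: the fraction of symbol
          mentions in the recent window that are repeats."""

          def __init__(self, window: int = 50, threshold: float = 0.6):
              self.turns = deque(maxlen=window)
              self.threshold = threshold

          def observe(self, turn_text: str) -> bool:
              self.turns.append(set(SYMBOL_RE.findall(turn_text)))
              if len(self.turns) < self.turns.maxlen:
                  return False  # not enough history yet
              counts = Counter(sym for turn in self.turns for sym in turn)
              total = sum(counts.values())
              repeats = sum(c - 1 for c in counts.values())
              churn = repeats / total if total else 0.0
              return churn >= self.threshold  # True means "consider intervening"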

    I wrote up the data and the mental model I used to build the controller. Curious if others have measured similar patterns in long-running agent sessions.