Thank you for the insight! I think the project is in a good place with persistence. Chats are threaded, and history is stored as JSON on disk plus a VectorDB for RAG. I have seen issues with long tool runs timing out on the client side while completing fine on the LLM server side; a page refresh shows the latest output, so it could be a reverse proxy issue. But enhancing the agent's workflow is high on my todo list!
The timeout issue sounds like it's probably a proxy buffer or keepalive setting. If you're running nginx, bumping proxy_read_timeout and enabling chunked transfer help a lot for long-running streams. SSE or websockets can also surface partial progress rather than making users wait for a full refresh. Good call prioritizing the agent workflow - that's usually where the real friction lives once persistence is solid.
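For reference, here's a minimal sketch of the nginx settings I mean - the location path and upstream address are placeholders for whatever your app actually uses, so adjust to taste:

```nginx
# Hypothetical streaming endpoint; swap in your real path and upstream.
location /chat {
    proxy_pass http://127.0.0.1:8000;

    proxy_http_version 1.1;          # required for chunked transfer to the upstream
    proxy_set_header Connection "";  # keep the upstream connection alive

    proxy_buffering off;             # flush SSE/stream chunks immediately
    proxy_cache off;

    proxy_read_timeout 3600s;        # don't kill long-running tool calls
    proxy_send_timeout 3600s;
}
```

With proxy_buffering on (the default), nginx holds response chunks until its buffer fills, which is exactly the "done on the server, stale on the client" symptom you described.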