I've been trying to build some agent stuff for my own projects lately, mostly running locally on my laptop with Ollama because I don't want everything in the cloud. Both in production deployments and local runs, I've found the popular frameworks tend to be on the heavy side – slow cold starts, high resource usage, complicated offline RAG/memory setup, and few built-in safeguards against reliability issues like infinite loops or cascading failures.
So I've been hacking on something much simpler for myself: it focuses on Ollama/local embeddings, the usual reasoning patterns (ReAct, Plan-and-Execute, etc.), vector + graph RAG, and session memory, plus some basic reliability features like circuit breakers and retries. Cold starts are quick, and it runs fine in Docker.
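To give a sense of what I mean by "basic reliability things" – here's roughly the shape of the circuit breaker + retry wrapper I put around local model calls. This is a minimal sketch, not the actual library code; the names and thresholds are just illustrative:

```python
import time


class CircuitBreaker:
    """After max_failures consecutive errors, fail fast for reset_after
    seconds instead of hammering a model that's clearly down."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success resets the failure count
        return result


def with_retries(fn, attempts=3, backoff=0.5):
    """Retry with exponential backoff; re-raise the last error."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(backoff * 2 ** i)
```

In practice the breaker wraps the raw model call and `with_retries` wraps the breaker, so transient timeouts get retried but a dead model endpoint trips the circuit and stops burning your agent's time budget.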
I'm really curious what others are running into:
Anyone else hitting similar issues with frameworks like LangChain, CrewAI, or AutoGen in production?
For local/offline agents, what features matter most to you for reliability?
How do you handle stability in real deployments?
Would love to hear your experiences – thanks!