TheBengaluruGuy 4 hours ago
We built an AI agent for SRE/production operations (DrDroid) and found that standard RAG/embedding-based retrieval didn't work well for engineering contexts: keywords and jargon carry little semantic meaning, and a single-character difference (e.g. us-east-1 vs us-east-2) can refer to completely unrelated things.
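To make the us-east-1 vs us-east-2 point concrete, here is a minimal Python sketch. It is not DrDroid's code: the character-trigram score is only a crude stand-in for a fuzzy similarity measure (not a real embedding model), and the documents, query, and function names are made up for illustration.

```python
import re

def tokens(text):
    # Keep hyphenated identifiers such as "us-east-1" as single tokens.
    return set(re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()))

def trigram_similarity(a, b):
    # Jaccard overlap of character trigrams: an assumed stand-in for a
    # fuzzy similarity score, NOT an actual embedding model.
    def grams(s):
        return {s[i:i + 3] for i in range(len(s) - 2)}
    ga, gb = grams(a.lower()), grams(b.lower())
    return len(ga & gb) / len(ga | gb)

def keyword_match(query, doc):
    # Every query token must appear verbatim in the document.
    return tokens(query) <= tokens(doc)

# Hypothetical alert descriptions that differ by one character.
docs = [
    "payments-api latency alert in us-east-1",
    "payments-api latency alert in us-east-2",
]
query = "payments-api us-east-1"

# The fuzzy score treats the two documents as near-duplicates...
fuzzy = trigram_similarity(docs[0], docs[1])
# ...while exact token matching keeps the two regions apart.
hits = [d for d in docs if keyword_match(query, d)]
```

Here `fuzzy` comes out above 0.9 even though the two alerts concern different regions, while `hits` contains only the us-east-1 document.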
This writeup explains how we designed Dynamic Memory Retrieval (DMR) — a multi-layered agentic search system over 80+ integrations (Grafana, Datadog, K8s, AWS, etc.) that indexes 200+ record types and enables the agent to iteratively discover and extract relevant context during production investigations.
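As a rough illustration of what "iteratively discover and extract" could look like, the sketch below builds an exact-token index over a few records and follows identifiers from hit to hit. Everything in it is an assumption for illustration: the record fields, the sample data, and the `follow` function, which stands in for whatever the agent decides to look up next. It is not the DMR implementation.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

# Hypothetical records; fields and values are illustrative only.
RECORDS = [
    {"id": "r1", "source": "grafana", "type": "dashboard",
     "text": "checkout-svc latency dashboard"},
    {"id": "r2", "source": "k8s", "type": "deployment",
     "text": "checkout-svc deployment on cluster prod-blue"},
    {"id": "r3", "source": "aws", "type": "eks_cluster",
     "text": "prod-blue cluster in us-east-1"},
    {"id": "r4", "source": "datadog", "type": "monitor",
     "text": "billing-svc error rate monitor"},
]

def build_index(records):
    # Inverted index: exact token -> ids of the records containing it.
    index = defaultdict(set)
    for rec in records:
        for tok in TOKEN.findall(rec["text"].lower()):
            index[tok].add(rec["id"])
    return index

def follow_identifiers(rec):
    # Stand-in for the agent's choice of next search terms:
    # follow only hyphenated identifiers found in the record.
    return [t for t in TOKEN.findall(rec["text"].lower()) if "-" in t]

def investigate(seed, records, index, follow, max_rounds=3):
    # Search a term, collect new records, then search the terms they
    # mention, for a bounded number of rounds.
    by_id = {r["id"]: r for r in records}
    seen_terms, found = set(), []
    frontier = [seed]
    for _ in range(max_rounds):
        next_frontier = []
        for term in frontier:
            if term in seen_terms:
                continue
            seen_terms.add(term)
            for rid in sorted(index.get(term, ())):
                if rid not in found:
                    found.append(rid)
                    next_frontier.extend(follow(by_id[rid]))
        frontier = next_frontier
        if not frontier:
            break
    return found

index = build_index(RECORDS)
context = investigate("checkout-svc", RECORDS, index, follow_identifiers)
```

Starting from `checkout-svc`, the loop reaches the dashboard and deployment first, then the cluster via `prod-blue`, and never touches the unrelated billing monitor.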
Key takeaways: why keyword search outperformed embeddings for our domain, how we structure short-term vs long-term memory, and what it takes to make an agent reliably navigate a company's entire production stack.