2 points by mips_avatar 6 hours ago | 2 comments
  • nuky 5 hours ago
    Sounds interesting. What's the stack under the hood?
    • mips_avatar 5 hours ago
      Mostly usearch, since pgvector had some perf problems, though I use Postgres during my ingestion stage since PostGIS is so good. I segment my HNSW indexes by multiple H3 levels.
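      The per-cell segmentation described above can be sketched roughly like this. This is a minimal, hedged illustration, not the author's code: a toy square grid stands in for H3 cells, and a brute-force scan stands in for a per-cell usearch HNSW index; `cell_for` and `SegmentedIndex` are invented names.

```python
import math

# Toy stand-in for an H3 cell lookup (the real setup would use
# something like h3.latlng_to_cell): bucket coordinates into a grid
# whose cells shrink as the resolution grows.
def cell_for(lat: float, lng: float, res: int) -> tuple:
    step = 1.0 / (2 ** res)          # degrees per cell at this resolution
    return (res, math.floor(lat / step), math.floor(lng / step))

class SegmentedIndex:
    """One small vector index per geo cell (brute force in place of HNSW)."""
    def __init__(self, res: int):
        self.res = res
        self.cells = {}              # cell id -> list of (key, vector)

    def add(self, key, lat, lng, vec):
        self.cells.setdefault(cell_for(lat, lng, self.res), []).append((key, vec))

    def search(self, lat, lng, vec, k=3):
        # Only scan the one cell the query point falls in.
        items = self.cells.get(cell_for(lat, lng, self.res), [])
        dist = lambda kv: sum((a - b) ** 2 for a, b in zip(kv[1], vec))
        return [key for key, _ in sorted(items, key=dist)[:k]]

idx = SegmentedIndex(res=6)
idx.add("cafe", 47.61, -122.33, [0.1, 0.9])
idx.add("park", 47.62, -122.33, [0.8, 0.2])
print(idx.search(47.61, -122.33, [0.1, 0.8], k=1))  # → ['cafe']
```

      The payoff is that each per-cell index stays small, so searches never touch vectors from the other side of the map.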
      • nuky 5 hours ago
        Nice. And how do you handle candidates near H3 boundaries?
        • mips_avatar 5 hours ago
          So I don't support queries with a radius larger than 50km (if an AI agent doesn't know where it's looking within 50km, there's usually a context issue upstream), but I have a larger H3 index and a tighter H3 index. Then I have a router that tries to find the correct H3 indexes for each query. Some queries need up to 3 searches, but most map to a single search. (Sorry, I probably won't be able to reply below here since the max HN comment depth is 4.)
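          A rough sketch of that routing step, under stated assumptions: a toy square grid at two resolutions stands in for the two H3 levels, and the resolution numbers, km-per-degree constant, and function names are all illustrative (a real router would use H3 cell and neighbor lookups instead of a bounding box).

```python
import math

COARSE_RES, FINE_RES = 4, 7          # stand-ins for the two H3 levels
MAX_RADIUS_KM = 50.0                 # wider queries are rejected outright
KM_PER_DEG = 111.0                   # rough degrees-to-km conversion

def cell_size_deg(res: int) -> float:
    return 1.0 / (2 ** res)

def route(lat: float, lng: float, radius_km: float):
    if radius_km > MAX_RADIUS_KM:
        raise ValueError("radius > 50km: likely a context issue upstream")
    # Narrow queries fit in a fine cell; wider ones go to the coarse level.
    diameter_deg = 2 * radius_km / KM_PER_DEG
    res = FINE_RES if diameter_deg <= cell_size_deg(FINE_RES) else COARSE_RES
    step = cell_size_deg(res)
    r = radius_km / KM_PER_DEG
    # Collect every cell the query circle's bounding box touches; near a
    # cell boundary this is more than one cell.
    cells = {(res, math.floor((lat + dlat) / step), math.floor((lng + dlng) / step))
             for dlat in (-r, 0.0, r) for dlng in (-r, 0.0, r)}
    # Keep the cells nearest the query point, capping fan-out at 3 searches.
    def dist2(c):
        _, i, j = c
        return ((i + 0.5) * step - lat) ** 2 + ((j + 0.5) * step - lng) ** 2
    return res, sorted(cells, key=dist2)[:3]
```

          Most query points land well inside a cell and map to one search; only points near a boundary (or wide radii) fan out to the extra searches.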
          • mips_avatar 5 hours ago
            Replying to your comment below this (since HN limits comment depth to 4): the 40ms latency is an average, but 90% of queries get routed to a single index, and latency is worse when the routing fans out to 3. Since I already batch the embedding generation, I should be able to get hard queries down to around 50ms.
          • nuky 5 hours ago
            Makes sense. What about latency, for typical and hard queries?
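          For the hard queries that fan out to multiple indexes, the per-cell results still have to be combined into one ranking. A small sketch, assuming each per-cell search already returns (distance, key) pairs sorted ascending by distance, as HNSW results are; `merge_topk` and the sample data are illustrative, not from the post.

```python
import heapq
from itertools import islice

def merge_topk(per_cell_results, k=5):
    # heapq.merge lazily merges the already-sorted per-cell result
    # streams, so only the k best overall are ever pulled.
    merged = heapq.merge(*per_cell_results)
    return [key for _, key in islice(merged, k)]

cell_a = [(0.10, "cafe"), (0.40, "museum")]
cell_b = [(0.05, "park"), (0.90, "pier")]
print(merge_topk([cell_a, cell_b], k=3))  # → ['park', 'cafe', 'museum']
```

          The merge itself is cheap; the extra latency on 3-way queries comes from running the additional index searches, not from combining their results.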
  • mips_avatar 6 hours ago
    Author here if you have any questions