The index is built on Linux from the GoogleNews Word2Vec model, then copied to an SD card. For the demo, the ESP32-S3 is search-only: it receives a 300-dimensional float32 query vector over TCP, searches the nn20db HNSW graph locally from the SD card, and shows the results.
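To make the query path concrete, here is a minimal client-side sketch of sending one query vector. The wire format is an assumption for illustration only: 300 little-endian float32 values sent back to back (1200 bytes) with no header. The actual framing the demo uses is not described here, and `pack_query` and `send_query` are hypothetical helper names.

```python
import socket
import struct

DIM = 300  # query dimensionality stated in the post


def pack_query(vec):
    """Serialize a query as DIM little-endian float32 values.

    Assumption: raw floats with no header or length prefix; the real
    demo protocol may differ.
    """
    if len(vec) != DIM:
        raise ValueError(f"expected {DIM} values, got {len(vec)}")
    return struct.pack(f"<{DIM}f", *vec)


def send_query(host, port, vec, timeout=30.0):
    """Open a TCP connection and send one packed query vector."""
    payload = pack_query(vec)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
    return len(payload)
```

The response is left out on purpose, since its format is not specified here.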
The demo supports normal word queries and vector arithmetic such as:
Paris - France + Poland
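A query like this can be turned into a single vector before it is sent: add the vectors for the "plus" words, subtract the "minus" words, and search for the nearest neighbours of the result. The sketch below uses made-up 3-dimensional toy vectors in place of the real 300-dimensional ones, and the L2 normalization step is an assumption (common when the index uses cosine similarity), not something confirmed for this demo. `combine` is a hypothetical helper name.

```python
import math


def combine(plus, minus):
    """Sum the `plus` vectors, subtract the `minus` vectors, then L2-normalize."""
    dim = len(plus[0])
    out = [0.0] * dim
    for vec in plus:
        for i, x in enumerate(vec):
            out[i] += x
    for vec in minus:
        for i, x in enumerate(vec):
            out[i] -= x
    norm = math.sqrt(sum(x * x for x in out))
    # Leave an all-zero result untouched to avoid dividing by zero.
    return [x / norm for x in out] if norm > 0 else out


# Toy values for illustration only; real GoogleNews vectors have 300 dimensions.
toy = {
    "Paris": [0.9, 0.1, 0.8],
    "France": [0.1, 0.1, 0.9],
    "Poland": [0.1, 0.9, 0.9],
}

# Paris - France + Poland
query = combine([toy["Paris"], toy["Poland"]], [toy["France"]])
```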
The full GoogleNews model contains around 3 million vectors. In my current test, one search on the ESP32-S3 takes about 12 seconds with ef = 15.
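For a sense of scale, this back-of-the-envelope calculation gives the size of the raw vectors alone, assuming they are stored as unquantized float32. How nn20db actually lays out vectors and graph links on the SD card is not stated here, so treat this as an estimate of the uncompressed data, not of the index file.

```python
NUM_VECTORS = 3_000_000  # "around 3 million", per the post
DIM = 300                # dimensions per vector
BYTES_PER_FLOAT32 = 4

# Raw vector payload only; HNSW neighbour lists would add to this.
raw_bytes = NUM_VECTORS * DIM * BYTES_PER_FLOAT32
print(f"{raw_bytes / 1e9:.1f} GB")  # prints "3.6 GB"
```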
This is not meant to beat server/vector-db performance. The point is to explore what useful ANN/vector search can look like on tiny offline hardware with limited RAM and cheap storage.