gmatt · 2 hours ago
    My primary stack is JavaScript (React/Node.js/NoSQL - a lot of Mongo, Dynamo as a second choice, Postgres because ... work). While all that was fine for work, the AI wave of the past 3-5 years is heavily Python-centric. I remember assessing vector stores and finding it really frustrating - especially trying to use professional vector stores like Pinecone.

    Pinecone's early SDK support for Node.js was frustrating, to say the least - mostly because I really liked their performance.

    hnswlib is something I've used a lot - I've done a fair bit of local/remote index building and benchmarking with it - and building indexes locally with any open-source tooling was always far slower than, say, making a call into Pinecone. Anyhow, due to work and personal interest, the last 3-4 years have been in the datastore/vectorstore realm. I've also had access to some significant GPU compute. I'm happy to talk shop with anybody who wants to.
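    For context on what an index like hnswlib's replaces: exact brute-force k-NN, which does O(N·d) distance work per query. A toy sketch in plain Node.js (this is an illustration, not the hnswlib-node API):

```javascript
// Exact (brute-force) k-NN: compute the distance from the query to every
// vector, sort, take the top k. Approximate indexes like HNSW avoid this
// full scan at query time - but the index itself is expensive to build,
// which is the CPU bottleneck discussed here.
function l2(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    s += d * d; // squared L2 distance
  }
  return s;
}

function bruteForceKnn(vectors, query, k) {
  return vectors
    .map((v, id) => ({ id, dist: l2(v, query) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);
}

// Toy dataset: 4 vectors in 3 dimensions.
const vectors = [
  [0, 0, 0],
  [1, 0, 0],
  [0, 5, 0],
  [9, 9, 9],
];
const result = bruteForceKnn(vectors, [1, 0.1, 0], 2);
console.log(result.map((r) => r.id)); // nearest two ids: [1, 0]
```

    At 1M vectors and 768 dimensions this full scan is what every query pays without an index, which is why index build speed ends up mattering so much.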

    OK, now for the meat and potatoes: building indexes on CPUs is a complete no-go. My GPU-vs-CPU benchmarks show numbers so comical that people normally assume the tests must be wrong.

    NVIDIA cuVS is the library behind vector search in Elasticsearch, Weaviate, Milvus, and Oracle. It has bindings for Python, Rust, Java, and Go - nothing for Node.js. NVIDIA tried once with node-rapids in 2021, but it seems they abandoned it in 2023: https://github.com/rapidsai/node

    So I built cuvs-node: native C++ N-API bindings to the cuVS C API. There are five algorithms (CAGRA, IVF-Flat, IVF-PQ, brute-force, HNSW) and 119 tests, verified on A10, A100, H100, GH200, and B200.

    I have a ton of benchmarks of GPU vs CPU - although the really interesting ones are among the providers. The difference in performance is actually shocking, despite most of them claiming state-of-the-art infra.

    The following benchmarks were completed in the same session on an A100 SXM (same machine, GPU vs CPU):

    1M vectors at 768 dimensions: 5.3s on GPU vs 65 minutes on CPU (hnswlib-node) - a 733x speedup. Search: sub-2ms at 1M vectors.
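    Quick sanity check on that speedup figure from the raw timings (assuming the 65 minutes is a lightly rounded number):

```javascript
// 65 minutes of CPU build time vs 5.3 seconds of GPU build time.
const cpuSeconds = 65 * 60; // 3900 s
const gpuSeconds = 5.3;
const speedup = cpuSeconds / gpuSeconds;
console.log(Math.round(speedup)); // ~736, consistent with the reported 733x
```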

    Open source, Apache 2.0. Requires Linux with an NVIDIA GPU and CUDA. Prebuilt binaries are on the roadmap. Happy to answer questions.