Show HN: Multimodal file search and sharing for AI agents (claw3drive.com)

2 points by mnmatin 3 hours ago | 1 comment

mnmatin 3 hours ago
Hi HN, we are Matin and Daniil, two indie researchers working on geometric deep learning.
We built ClawDrive because we were frustrated with the current workflow of using agents to search and share files. We found ourselves manually renaming, organising, and transcribing video and audio files before our agents could use them. To share files with each other, we had to upload them to Google Drive first. Searching files using an image or video as the query was also not possible.
With ClawDrive, we index files (text, images, audio, video) into a shared embedding space using gemini-embedding-2, which allows for cross-modal retrieval.
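To illustrate what cross-modal retrieval in a shared embedding space means, here is a minimal sketch. It is not ClawDrive's code: the file names, the toy 3-dimensional vectors, and the `search` helper are all made up for illustration, and a real index would hold vectors produced by the embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical index: toy vectors standing in for real embeddings.
# Files of different modalities live in the same vector space.
index = {
    "notes.txt": [0.9, 0.1, 0.0],
    "beach.jpg": [0.1, 0.9, 0.2],
    "talk.mp3": [0.0, 0.2, 0.9],
}

def search(query_vec, index, k=2):
    """Return the k file names whose vectors are closest to the query."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]), reverse=True)
    return ranked[:k]

# Pretend this vector came from embedding an image used as the query.
query = [0.2, 0.8, 0.1]
results = search(query, index)
```

Because every modality maps into one space, the same `search` call works whether the query vector came from text, an image, audio, or video.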
We built a tiered retrieval system (inspired by OpenViking) where every file gets a one-sentence TLDR and a structured Digest. Agents can read the TLDRs first, then view the Digest, and only then pull the full file when they actually need it.
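The tiered idea can be sketched roughly as follows. This is an assumption-laden illustration, not ClawDrive's implementation: the `FileEntry` fields, the `tiered_lookup` function, and the keyword predicates are hypothetical stand-ins for whatever an agent would actually use to judge relevance.

```python
from dataclasses import dataclass

@dataclass
class FileEntry:
    name: str
    tldr: str    # one-sentence summary, cheapest to read
    digest: str  # structured summary, read only if the TLDR looks relevant
    path: str    # full file, pulled only as a last step

def tiered_lookup(entries, matches_tldr, matches_digest):
    """Return paths of files worth opening in full, cheapest checks first."""
    selected = []
    for entry in entries:
        if not matches_tldr(entry.tldr):
            continue  # rejected at tier 1; the digest is never read
        if matches_digest(entry.digest):
            selected.append(entry.path)
    return selected

# Hypothetical entries for illustration.
entries = [
    FileEntry("q3.pdf", "Quarterly sales report.", "Sections: revenue, churn.", "/pot/q3.pdf"),
    FileEntry("cat.jpg", "Photo of a cat.", "Objects: cat, sofa.", "/pot/cat.jpg"),
    FileEntry("q2.pdf", "Quarterly sales report.", "Sections: revenue only.", "/pot/q2.pdf"),
]

to_open = tiered_lookup(
    entries,
    matches_tldr=lambda t: "sales" in t.lower(),
    matches_digest=lambda d: "churn" in d.lower(),
)
```

The point of the ordering is cost: most files are ruled out after reading a single sentence, so the full content is fetched for only a handful of candidates.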
For sharing files, we introduced the concept of 'pots', which are analogous to shared folders. You can share them peer-to-peer via tunnels (Tailscale, Cloudflare), and the agent at the receiving end also receives the embeddings, TLDRs, and Digests alongside the files.
ClawDrive does not bundle any LLMs. Instead, the agent has access to a 'todo' list of tasks it has to carry out, such as renaming files, creating taxonomies, TLDRs, and Digests, and transcribing audio/video files.
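One way to picture the todo-list pattern is a queue of tasks that the agent drains with its own tools. The sketch below is purely illustrative: the task record shape, the action names, and `run_todos` are assumptions, since the post does not describe ClawDrive's actual todo format.

```python
from collections import deque

# Hypothetical task records; the real todo format is not described in the post.
todo = deque([
    {"action": "rename", "file": "IMG_0042.jpg"},
    {"action": "transcribe", "file": "interview.mp3"},
    {"action": "tldr", "file": "report.pdf"},
])

def run_todos(todo, handlers):
    """Pop tasks in order and dispatch each one to an agent-side handler."""
    done = []
    while todo:
        task = todo.popleft()
        handler = handlers.get(task["action"])
        if handler is None:
            continue  # this sketch simply drops task types the agent cannot handle
        done.append(handler(task["file"]))
    return done

# The agent supplies the actual work; here the handlers are trivial stand-ins.
handlers = {
    "rename": lambda f: f"renamed {f}",
    "transcribe": lambda f: f"transcribed {f}",
}
completed = run_todos(todo, handlers)
```

Keeping the model on the agent side means the storage layer only has to track what work is pending, not how it gets done.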
You can view a cool 3D (UMAP) demo here: https://app.claw3drive.com/
GitHub repo: https://github.com/Hyper3Labs/clawdrive
We are happy to answer any questions - Matin & Daniil