Some searches work like magic and others seem to veer off target a lot. For example, "sculpture" and "watercolor" worked just about how I'd expect. "Lamb" showed lambs and sheep. But "otter" showed a random selection of animals.
The search is in beta and we're improving the model. Thank you for reporting the queries that aren't working well.
Edit: Re the otter, I just checked and I did not find any otters in the dataset. We should not return any results when the model is not sure, to reduce confusion.
I also expected semantic search to return similar results for "fireworks" and "pyrotechnics," since the latter is a less common synonym for the former. But I got many results for fireworks and just one result for pyrotechnics.
This is still impressive. My impulse is to poke at it with harder cases to try to reason about how it could be implemented. Thanks for your Show HN and for replying to me!
Would be interesting to know how relevant that approach is now.
For example, if you search for “paintings of winter landscapes but without sun and trees,” you’ll still get results with trees. That’s because embeddings capture the presence of concepts like “tree” or “landscape,” but not logical relationships like “without” or “not.”
Similarly, embeddings aren’t great at capturing how something feels. They can tell that “sad poem” and “happy poem” are different mainly because of the words used, not because they truly understand emotional tone.
This happens because most embedding models (like OpenAI’s or sentence-transformers) are trained to group things by semantic similarity, not logical meaning or sentiment. Negation, polarity, and affect aren’t explicitly represented in the vector space.
Might be common knowledge to some, but it was a cool TIL moment for me, realizing that embeddings are great at capturing what something is about, but not how it feels or what it excludes.
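To make that concrete, here's a minimal sketch (assuming sentence-transformers and the all-MiniLM-L6-v2 model, not whatever this demo actually uses): two phrases that differ only by a negation still land almost on top of each other in embedding space.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Same scene, but one phrase negates the concepts the other asserts.
with_trees, without_trees = model.encode([
    "winter landscape with sun and trees",
    "winter landscape without sun and trees",
])

# The cosine similarity tends to come out very high (close to 1):
# the "without" barely moves the vector, so retrieval treats both
# phrases as asking for roughly the same thing.
print(util.cos_sim(with_trees, without_trees))
```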
I don’t know enough of the underlying math to say for sure whether embeddings can be trained to consistently represent negation, but when I tried the Mixedbread demo myself with a query like “winter landscapes without sun and trees”, it still showed me paintings with both sun and trees. So at least in its current form, it doesn’t seem to fully handle those semantic relationships yet.
ColNomic and NVIDIA models are great for embedding images, and MUVERA can transform those embeddings into 1D vectors.
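For anyone wondering what "transform those into 1D vectors" means in practice, here's a rough numpy sketch of the fixed-dimensional-encoding idea behind MUVERA (my simplification, not the official implementation): hash each token vector into a bucket using random hyperplanes, sum the vectors in each bucket, and concatenate the sums into one flat vector that a normal ANN index can store.

```python
import numpy as np

def simple_fde(token_vectors: np.ndarray, num_planes: int = 4, seed: int = 0) -> np.ndarray:
    """token_vectors: (num_tokens, dim) multi-vector embedding of one image/document."""
    rng = np.random.default_rng(seed)
    dim = token_vectors.shape[1]
    planes = rng.standard_normal((num_planes, dim))             # shared random hyperplanes
    bits = (token_vectors @ planes.T) > 0                       # (num_tokens, num_planes) sign bits
    bucket_ids = bits.astype(int) @ (1 << np.arange(num_planes))  # SimHash bucket per token
    fde = np.zeros((2 ** num_planes, dim))
    for b in range(2 ** num_planes):
        mask = bucket_ids == b
        if mask.any():
            fde[b] = token_vectors[mask].sum(axis=0)            # aggregate tokens per bucket
    return fde.reshape(-1)                                      # one flat 1D vector

# e.g. a ColBERT-style embedding: 32 token vectors of dimension 128
flat = simple_fde(np.random.randn(32, 128))
print(flat.shape)  # (2048,) = 16 buckets * 128 dims
```

The actual MUVERA construction adds refinements on top of this (repetitions, slightly different handling for queries vs. documents), but bucket-and-flatten is the core trick.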
“the pipeline” - seems like this is just a personal hackathon project?
Why these models vs other multimodals? Which “nvidia models”?
I couldn't find any examples that couldn't be explained by simple text matches.
"Images of german shepherds" never fails to provide some humor.
This NGA link returns over a thousand pieces by Rothko: https://www.nga.gov/artists/1839-mark-rothko/artworks
The next version will be able to see the image and read the metadata.
A bit more context: we include everything in the latent space (embeddings) without trying to maintain multiple indexes and hack around things. There is still a huge mountain to climb, but this approach seems really promising.
Since this is a semantic search, using a vector embedding, it will handle meanings better than a text search, which would handle names better.
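Rough illustration of that difference (hypothetical titles, assuming sentence-transformers; the real site presumably works differently): a plain text match nails exact names like "Rothko", while the embedding search is what maps a synonym like "pyrotechnics" onto fireworks.

```python
from sentence_transformers import SentenceTransformer, util

titles = [
    "Untitled (Seagram Mural), Mark Rothko",
    "Lamb Resting in a Meadow",
    "Fireworks over the Harbor",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
title_vecs = model.encode(titles, normalize_embeddings=True)

def text_search(query: str):
    # Names: an exact substring match is hard to beat.
    return [t for t in titles if query.lower() in t.lower()]

def semantic_search(query: str):
    # Meanings: nearest neighbour in embedding space.
    q = model.encode(query, normalize_embeddings=True)
    return titles[int(util.cos_sim(q, title_vecs)[0].argmax())]

print(text_search("rothko"))            # finds the artist by name
print(semantic_search("pyrotechnics"))  # should land on the fireworks title
```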