Quick question: Does this work with any vector database (Pinecone, Weaviate, etc.) or is it specific to certain backends?
Also curious how you're measuring relevance - are you using LLM-as-judge or some other scoring method?
This could be really useful for optimizing chunk size and overlap settings!
*Backends:* Currently supports Qdrant, pgvector, Weaviate, Chroma, and Pinecone. Adding more is straightforward since it's just a matter of implementing a Store interface. Let me know if I missed a good backend!
*Relevance scoring:* No LLM-as-judge — that's intentional. RagTune focuses on retrieval-layer metrics only:
- Vector similarity scores (what the DB returns)
- Recall@K, MRR against your golden set
- Score distribution diagnostics
The philosophy is: debug retrieval separately from generation. If your retrieval is broken, no amount of prompt engineering will fix it.
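For anyone unfamiliar with the two golden-set metrics, here is a rough sketch of the standard definitions of Recall@K and MRR. This is not RagTune's code; the function names, the document IDs, and the shape of the golden set are all made up for illustration.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant IDs that show up in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant hit (ranks start at 1), or 0.0 if none."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Hypothetical golden set: query -> IDs a human marked as relevant.
golden = {"q1": {"d1", "d4"}, "q2": {"d7"}}
# Hypothetical retrieval output: query -> ranked IDs from the vector DB.
results = {"q1": ["d3", "d1", "d9"], "q2": ["d7", "d2", "d5"]}

runs = [(results[q], golden[q]) for q in golden]
for q in golden:
    print(q, "recall@3 =", recall_at_k(results[q], golden[q], 3))
print("MRR =", mean_reciprocal_rank(runs))
```

Neither metric needs an LLM in the loop: both depend only on the ranked IDs the store returns and on the labels in the golden set.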
For chunk size/overlap optimization — exactly the use case! `ragtune compare --chunk-sizes 256,512,1024` lets you see the impact directly.
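In case it helps to picture what the two knobs do, here is a minimal sliding-window chunker. It is only an illustration under assumptions: it counts characters, whereas RagTune's own chunker may count tokens or split differently, and `chunk_text` is a made-up name.

```python
from typing import List


def chunk_text(text: str, size: int, overlap: int) -> List[str]:
    """Split text into windows of `size` characters, each sharing `overlap`
    characters with the previous window."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", size=4, overlap=2))
```

Larger chunks give each vector more context but blur what it represents; more overlap reduces the chance that an answer gets split across a boundary, at the cost of more vectors to store and search.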
Happy to hear feedback if you try it!