Surprising result: pre-generated Q&A pairs won overall at 81.5%. Knowledge graph came second at 70.3%. RAPTOR and naive vector were both under 25%, with naive vector at 19.5%. The most sophisticated architectures were also the slowest: hybrid took 39.5s/query at $3.00/run and still lost to a simple Q&A lookup at 8.4s/$0.48.
The tool is bring-your-own-docs. Point it at Markdown, HTML, PDF, Word, CSV, or a URL, run the pipeline, and get your own numbers. It comes with the AWS corpus as a built-in example, so pip install kb-arena && kb-arena demo works with no API keys, using bundled results.
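To put the latency and cost figures quoted above in proportion, here is a quick arithmetic sketch. It uses only the numbers from this post and assumes the Q&A figures (8.4s, $0.48) are in the same units as the hybrid ones (seconds per query, dollars per run). The variable names are illustrative and not part of kb-arena.

```python
# Figures copied from the post; the dictionary layout and names are illustrative only.
hybrid = {"latency_s": 39.5, "cost_usd": 3.00}
qa_pairs = {"latency_s": 8.4, "cost_usd": 0.48}

# How many times slower and more expensive hybrid is than Q&A pairs.
latency_ratio = hybrid["latency_s"] / qa_pairs["latency_s"]
cost_ratio = hybrid["cost_usd"] / qa_pairs["cost_usd"]

# Accuracy lead of Q&A pairs (81.5%) over knowledge graph (70.3%), in percentage points.
accuracy_gap = 81.5 - 70.3

print(f"hybrid is {latency_ratio:.1f}x slower than Q&A pairs")
print(f"hybrid is {cost_ratio:.2f}x more expensive than Q&A pairs")
print(f"Q&A pairs lead knowledge graph by {accuracy_gap:.1f} points")
```

On these numbers, hybrid is roughly 4.7x slower and 6.25x more expensive than the Q&A lookup, and Q&A pairs finish about 11 points ahead of the knowledge graph.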