Add KB Arena retrieval benchmark notebook #290

Open
xmpuspus wants to merge 1 commit into deepset-ai:main from xmpuspus:add-kb-arena-benchmark

@xmpuspus

Adds a new cookbook notebook: benchmark_retrieval_strategies_kb_arena.ipynb.

What it shows

The notebook shows how to use KB Arena to compare retrieval architectures on a small example corpus before wiring one into a Haystack pipeline, and then maps the winning strategy to Haystack's InMemoryBM25Retriever.

KB Arena benchmarks nine architecturally distinct retrieval strategies head-to-head and reports IR metrics with paired-bootstrap 95% CIs and Wilcoxon p-values:

  • naive vector, contextual vector, QnA pairs, knowledge graph (Neo4j), hybrid RRF, RAPTOR, PageIndex, BM25, rerank-vector (cross-encoder)
  • Recall@k, Precision@k, MRR, NDCG (binary + graded), MAP, R-Precision, bpref, RBO
  • Pareto frontier across (NDCG, latency)
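
For readers unfamiliar with the metrics listed above, here is a minimal sketch of three of them (Recall@k, reciprocal rank, binary NDCG@k) plus a Pareto filter over (NDCG, latency). It is not KB Arena's implementation: the function names, document IDs, and every number below are placeholders made up for illustration, not benchmark results.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-gain NDCG@k: DCG of the top k divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


def pareto_frontier(points):
    """Keep entries not dominated on (higher NDCG, lower latency)."""
    frontier = []
    for name, ndcg, latency in points:
        dominated = any(
            o_ndcg >= ndcg and o_lat <= latency and (o_ndcg > ndcg or o_lat < latency)
            for _, o_ndcg, o_lat in points
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical ranking for one query; "d1" and "d2" are the relevant documents.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

recall = recall_at_k(ranked, relevant, k=3)   # 0.5: only d1 is in the top 3
rr = reciprocal_rank(ranked, relevant)        # 0.5: first hit is at rank 2
ndcg = ndcg_at_k(ranked, relevant, k=3)

# Placeholder (name, NDCG, latency in ms) triples, NOT measured values.
strategies = [
    ("strategy_a", 0.62, 12.0),
    ("strategy_b", 0.58, 35.0),
    ("strategy_c", 0.71, 180.0),
]
frontier = pareto_frontier(strategies)
print(recall, rr, round(ndcg, 3), frontier)
```

MRR is the mean of `reciprocal_rank` over all queries. In the placeholder data, `strategy_b` falls off the frontier because `strategy_a` is both more accurate and faster.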

What's in the PR

  • notebooks/benchmark_retrieval_strategies_kb_arena.ipynb — 18 cells covering install, corpus prep, ingest, retriever-lab, results inspection, and Haystack pipeline wiring. Runs end-to-end with no API keys (BM25-only path).
  • index.toml — new [[cookbook]] entry tagged ["Evaluation", "Advanced Retrieval", "RAG"] with new = true.
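
To make the BM25-only path concrete, the sketch below scores a toy three-document corpus with the Okapi BM25 formula in plain Python. It is an illustration under stated assumptions, not the notebook's code and not the implementation behind Haystack's InMemoryBM25Retriever: the file names, document texts, tokenizer, and the `k1`/`b` defaults are all invented here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-word characters (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank docs (name -> text) against the query with Okapi BM25, best first."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(toks) for toks in tokens.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    query_terms = set(tokenize(query))

    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[name] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical stand-in for a tiny Markdown corpus; contents are made up.
docs = {
    "s3.md": "Amazon S3 stores objects in buckets. Bucket policies control access.",
    "lambda.md": "AWS Lambda runs functions on demand. Each function assumes an execution role.",
    "iam.md": "IAM roles grant permissions. Policies attach to roles and users.",
}

ranking = bm25_rank("lambda execution role", docs)
print(ranking[0][0])  # lambda.md
```

Because the tokenizer does no stemming, "roles" in `iam.md` does not match the query term "role"; that is the kind of lexical gap a benchmark across strategies is meant to surface.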

Notes

  • Uses a tiny inline corpus (three AWS-flavored Markdown files) so the notebook is self-contained and fast.
  • The full nine-strategy benchmark requires Neo4j and an embedding provider; the notebook calls those out as next steps rather than requiring them.
  • KB Arena: MIT, on PyPI as kb-arena, Zenodo concept DOI 10.5281/zenodo.20319678.

The filename follows the existing convention (descriptive, names the technologies involved), and the PR follows the cookbook README's three contribution rules: notebook in /notebooks, descriptive filename, and an index.toml entry with title and topics.

@xmpuspus requested a review from a team as a code owner on May 24, 2026, 01:11