Benchmarks

We outperform Google, Parallel, and Exa on SimpleQA and FreshQA, and across domain-specific benchmarks in finance, healthcare, and economics.


| Benchmark | Valyu | Parallel | Exa | Google | Source |
|---|---|---|---|---|---|
| FreshQA (600 time-sensitive queries) | 79% | 52% | 24% | 39% | valyu.ai |
| SimpleQA (4,326 factual questions) | 94% | 93% | 91% | 38% | valyu.ai |
| Finance (120 finance questions) | 73% | 67% | 63% | 55% | valyu.ai |
| Economics (100 economics questions) | 73% | 52% | 45% | 43% | valyu.ai |
| MedAgent (562 complex medical queries) | 48% | 42% | 44% | 45% | valyu.ai |
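
The size of the lead differs a lot from one benchmark to the next. As a minimal sketch, using only the percentages listed above, the following computes the lead over the strongest competitor on each benchmark. The `scores` dictionary and the `lead_over_runner_up` helper are illustrative names, not part of the benchmarking suite.

```python
# Accuracy (%) copied from the results above.
scores = {
    "FreshQA": {"Valyu": 79, "Parallel": 52, "Exa": 24, "Google": 39},
    "SimpleQA": {"Valyu": 94, "Parallel": 93, "Exa": 91, "Google": 38},
    "Finance": {"Valyu": 73, "Parallel": 67, "Exa": 63, "Google": 55},
    "Economics": {"Valyu": 73, "Parallel": 52, "Exa": 45, "Google": 43},
    "MedAgent": {"Valyu": 48, "Parallel": 42, "Exa": 44, "Google": 45},
}


def lead_over_runner_up(row, ours="Valyu"):
    """Return (best competitor, lead in percentage points) for one benchmark."""
    others = {name: pct for name, pct in row.items() if name != ours}
    runner_up = max(others, key=others.get)
    return runner_up, row[ours] - others[runner_up]


# Hypothetical helper applied to every benchmark row.
leads = {bench: lead_over_runner_up(row) for bench, row in scores.items()}

for bench, (runner_up, lead) in leads.items():
    print(f"{bench}: +{lead} points over {runner_up}")
```

On these figures the lead ranges from 1 point (SimpleQA, over Parallel) to 27 points (FreshQA, over Parallel).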

Here's our benchmarking suite repo

Read the full breakdown in our benchmarking blog post.