We outperform Google, Parallel, and Exa on SimpleQA and FreshQA, and across domain-specific benchmarks in finance, healthcare, and economics.

| Benchmark | Valyu | Parallel | Exa | Google | Source |
|---|---|---|---|---|---|
| FreshQA (600 time-sensitive queries) | 79 % | 52 % | 24 % | 39 % | (valyu.ai) |
| SimpleQA (4,326 factual questions) | 94 % | 93 % | 91 % | 38 % | (valyu.ai) |
| Finance (120 finance questions) | 73 % | 67 % | 63 % | 55 % | (valyu.ai) |
| Economics (100 economics questions) | 73 % | 52 % | 45 % | 43 % | (valyu.ai) |
| MedAgent (562 complex medical queries) | 48 % | 42 % | 44 % | 45 % | (valyu.ai) |

Here's our benchmarking suite repo.
Read the full breakdown in our benchmarking blog post.
