Audrey benchmark submission request
Project: https://github.com/Evilander/Audrey
Package: https://www.npmjs.com/package/audrey
Audrey is a local-first memory runtime and MCP server for AI agents with recall, memory reflexes, preflight checks, tool-trace learning, validation, contradiction handling, and SQLite/sqlite-vec storage.
We have already generated a local benchmark report with JSON/HTML/SVG artifacts, but we do not want those numbers treated as AMB leaderboard results without a run through the official AMB harness. We would like to add an Audrey memory-provider adapter and run Audrey against the AMB datasets using the official methodology.
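To make the proposal concrete, here is a minimal sketch of what an Audrey provider adapter might look like. The actual AMB harness interface is not known to us, so the class shape and method names (`store`, `recall`) are assumptions, and a plain in-memory array stands in for Audrey's SQLite/sqlite-vec-backed runtime:

```javascript
// Hypothetical sketch of an AMB memory-provider adapter wrapping Audrey.
// The real AMB harness interface is not known here; the method names
// (store, recall) and this storage backend are illustrative assumptions.
class AudreyProviderAdapter {
  constructor() {
    // A real adapter would hold an Audrey runtime handle; a plain
    // in-memory array stands in for illustration.
    this.memories = [];
  }

  // Store one memory record (id + text) as the harness feeds turns in.
  async store(id, text) {
    this.memories.push({ id, text });
  }

  // Naive substring recall standing in for Audrey's hybrid recall:
  // return up to `limit` memories whose text contains the query term.
  async recall(query, limit = 5) {
    const q = query.toLowerCase();
    return this.memories
      .filter((m) => m.text.toLowerCase().includes(q))
      .slice(0, limit);
  }
}
```

We would flesh this out against the real harness contract once the maintainers confirm the preferred interface.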
Current local evidence available in the Audrey repo:
- benchmarks/output/summary.json
- benchmarks/output/report.html
- benchmarks/output/local-overall.svg
- benchmarks/output/retrieval-overall.svg
- benchmarks/output/operations-overall.svg
- benchmarks/output/published-locomo.svg
- benchmarks/snapshots/perf-0.22.2.json
Fresh local benchmark:
- Generated at: 2026-05-01T03:20:07.968Z
- Command: node benchmarks/run.js --provider mock --dimensions 64
- Audrey local regression suite: 100.0% score / 100.0% pass rate
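Before attaching the summary artifact to any submission, we can sanity-check its fields. The actual schema of `benchmarks/output/summary.json` is not reproduced here, so the field names (`score`, `passRate`) in this sketch are assumptions to be adjusted to the real schema:

```javascript
// Hypothetical sanity check for a summary object like the one in
// benchmarks/output/summary.json. The field names (score, passRate)
// are assumptions; adjust to the actual schema in the Audrey repo.
function checkSummary(summary) {
  const problems = [];
  if (typeof summary.score !== 'number' || summary.score < 0 || summary.score > 100) {
    problems.push('score missing or out of range');
  }
  if (typeof summary.passRate !== 'number' || summary.passRate < 0 || summary.passRate > 100) {
    problems.push('passRate missing or out of range');
  }
  return problems; // empty array means the summary looks well-formed
}

// Example shaped like the reported local run: 100.0% score, 100.0% pass rate.
const sample = { score: 100.0, passRate: 100.0 };
console.log(checkSummary(sample)); // []
```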
Fresh perf snapshot:
- Generated at: 2026-05-01T03:39:02.431Z
- 5,000-memory hybrid recall p95: 3.6 ms with mock 64-dim embeddings
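For reviewers checking the p95 figure, here is a generic nearest-rank percentile sketch over latency samples; the actual aggregation method used by the perf snapshot may differ:

```javascript
// Nearest-rank percentile over latency samples (milliseconds).
// Generic sketch of how a p95 like the 3.6 ms figure above could be
// computed; the snapshot's actual method is not shown here.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank: smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

const latencies = [1.2, 1.4, 1.9, 2.1, 2.3, 2.8, 3.0, 3.1, 3.4, 3.6];
console.log(percentile(latencies, 95)); // 3.6
```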
Question for maintainers:
What is the preferred submission route for a new memory provider: PR with provider adapter plus outputs, issue first, or maintainer upload after a reproducible run?