fix(ruvector): CLI works on fresh DBs via meta sidecar (#417) #435

Open
ruvnet wants to merge 1 commit into main from fix/issue-417-cli-sidecar

ruvnet commented May 7, 2026

Summary

Fix

  • Persist construction args in <dbPath>.meta.json from create; insert/search/stats read the sidecar instead of parsing redb bytes.
  • Drop calls to the phantom db.load/db.save/db.stats. Persistence is automatic via storagePath; counting goes through await db.len().
  • Every CLI handler becomes async and awaits the wrapper. benchmark numbers are now real (previously the dropped promises meant the rate was just spinner timing).
  • Coerce numeric ids to strings inside insert — the native binding rejects integer ids.
  • Surface a clear, actionable error when a DB exists without a sidecar (e.g. created by an older CLI), instead of an opaque parse failure.
  • New regression test test/cli-fresh-db.test.mjs exercises the full create → insert → search → stats path against a real redb file in os.tmpdir().
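The sidecar round trip described in the bullets above can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper names (`sidecarPath`, `writeSidecar`, `readSidecar`), the `schemaVersion` field name, its value, and the error wording are all assumptions. Only the `<dbPath>.meta.json` location and the `dimensions` and `metric` fields come from the description.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical constant: the PR only says that a schema version is stored.
const SIDECAR_SCHEMA_VERSION = 1;

// The sidecar sits next to the redb file: <dbPath>.meta.json
function sidecarPath(dbPath) {
  return `${dbPath}.meta.json`;
}

// Would be called from `create`: record the construction args.
function writeSidecar(dbPath, { dimensions, metric }) {
  const meta = { schemaVersion: SIDECAR_SCHEMA_VERSION, dimensions, metric };
  fs.writeFileSync(sidecarPath(dbPath), JSON.stringify(meta, null, 2));
  return meta;
}

// Would be called from insert/search/stats: never reads the redb bytes.
function readSidecar(dbPath) {
  const metaFile = sidecarPath(dbPath);
  if (!fs.existsSync(metaFile)) {
    // Orphan DB (e.g. created by an older CLI): fail fast with a clear message.
    throw new Error(
      `No sidecar metadata found at ${metaFile}. ` +
        'Re-create the database with this CLI version to generate it.'
    );
  }
  return JSON.parse(fs.readFileSync(metaFile, 'utf8'));
}

// Usage against a throwaway directory in os.tmpdir().
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'sidecar-demo-'));
const dbPath = path.join(dir, 'x.db');
writeSidecar(dbPath, { dimensions: 8, metric: 'cosine' });
const meta = readSidecar(dbPath);
console.log(meta.dimensions, meta.metric); // prints "8 cosine"
```

Keeping the metadata in a separate JSON file means the handlers never need to understand the redb format; they only need the constructor arguments.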

Proof

Reproduction from the issue (v0.2.25, Node 22.22.2) before this PR:

$ npx ruvector create /tmp/x.db -d 8 -m cosine
$ npx ruvector insert /tmp/x.db /tmp/v.json
✖ Failed to insert vectors
Unexpected token 'r', "redb…" is not valid JSON

After this PR (node test/cli-fresh-db.test.mjs):

  ok: `ruvector create` exits 0
  ok: redb file exists at dbPath
  ok: sidecar metadata file exists
  ok: sidecar.dimensions = 8
  ok: sidecar.metric = cosine
  ok: `ruvector insert` exits 0
  ok: insert does not crash JSON.parsing the redb binary
  ok: `ruvector search` exits 0
  ok: search prints `Found N results` (across stdout/stderr)
  ok: search renders at least one hit row
  ok: `ruvector stats` exits 0
  ok: stats prints Vector Count
  ok: stats fails fast on orphan DB without sidecar
  ok: orphan-DB error message mentions sidecar

ruvector fresh-DB CLI smoke OK (issue #417)

Test plan

  • create writes <dbPath>.meta.json with dimensions + metric + schema version.
  • insert reopens the redb via storagePath (no JSON.parse), reads dimensions from sidecar, coerces numeric ids, and reports Total vectors: N from await db.len().
  • search returns real hits (not undefined), respects -k, applies -t threshold post-hoc.
  • stats prints actual count from await db.len().
  • benchmark waits on every await db.search(), so the reported QPS reflects native completion (not spinner timing).
  • Orphan DB without sidecar fails fast with a sidecar-mention error.
  • Manual: npx ruvector create/insert/search against a 384-dim DB on a clean install. For npx ruvector embed text to also work end-to-end, the ONNX bundling fix in #434 (fix(ruvector): bundle ONNX runtime into dist/ on build (#354)) must land first.
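As an illustration of the "applies -t threshold post-hoc" item above, a filter applied after the search returns could look like the sketch below. The result shape (`{ id, score }`) and the interpretation of `score` as a distance where lower means closer are assumptions made for this example, not confirmed behavior of the wrapper.

```javascript
// Hypothetical result shape: { id, score }.
// Assumption: score is a distance, so smaller values are better matches.
function applyThreshold(results, threshold) {
  // No -t flag given: return the hits unchanged.
  if (threshold === undefined) return results;
  return results.filter((r) => r.score <= threshold);
}

// Usage with made-up hits.
const hits = [
  { id: 'a', score: 0.05 },
  { id: 'b', score: 0.4 },
  { id: 'c', score: 0.9 },
];
const kept = applyThreshold(hits, 0.5);
console.log(kept.map((r) => r.id)); // prints [ 'a', 'b' ]
```

Filtering after the search means the native call is unchanged and fewer than k rows may be printed when a threshold is set.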

Out of scope (deliberately)

export/import also called the phantom db.save/db.load API. Honest export needs the wrapper to grow an enumeration method (db.entries() or similar) before the handler can do real work — file-only metadata export would mislead users. Those handlers are left untouched here and tracked separately.

🤖 Generated with claude-flow

Six CLI commands crashed on every fresh database produced by
`ruvector create`:

    $ ruvector create /tmp/x.db -d 384
    $ ruvector insert /tmp/x.db /tmp/v.json
    SyntaxError: Unexpected token 'r', "redb…" is not valid JSON

Root cause: `bin/cli.js` `insert`, `search`, `stats`, `export`, and
`import` all did `JSON.parse(fs.readFileSync(dbPath, 'utf8'))` to
recover the dimension. But `<dbPath>` is a redb (Rust binary) file
managed by `@ruvector/core` — not a JSON document. The first byte
("r") tripped the parser before any other code ran.
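The failure mode is easy to reproduce without a real database. In the sketch below the buffer is a stand-in: only the leading `redb` characters are taken from the error message above, and the trailing bytes are arbitrary placeholders rather than the actual redb header.

```javascript
// Stand-in for the first bytes of a redb file. Only the "redb" prefix is
// grounded in the error message; the remaining bytes are arbitrary.
const fakeDbBytes = Buffer.concat([
  Buffer.from('redb'),
  Buffer.from([0x1a, 0x0a, 0x00, 0x01]),
]);

let parseError = null;
try {
  // What the old handlers effectively did with the database file.
  JSON.parse(fakeDbBytes.toString('utf8'));
} catch (err) {
  parseError = err;
}

console.log(parseError.name); // prints "SyntaxError"
```

The exact message text varies between Node versions, but the parse fails at the first character either way.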

Compounding: the same handlers called methods that don't exist on
`VectorDBWrapper` (`db.load`, `db.save`, `db.stats`) and didn't
`await` the async wrapper methods that do exist (`insert`,
`insertBatch`, `search`, `len`).
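The effect of the missing `await` can be shown with a stand-in for the wrapper. `fakeSearch` below is hypothetical and simply resolves after a delay; it is not the `VectorDBWrapper` API. The sketch shows both symptoms: a pending promise has no usable hits, and a timing loop that drops its promises measures almost nothing.

```javascript
// Hypothetical stand-in for an async wrapper method such as db.search().
function fakeSearch(delayMs) {
  return new Promise((resolve) =>
    setTimeout(() => resolve([{ id: '0', score: 0 }]), delayMs)
  );
}

// Elapsed wall-clock time in milliseconds since a hrtime.bigint() start.
function elapsedMs(start) {
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Old behavior: promises are created but never awaited.
function timeDropped(n, delayMs) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < n; i++) fakeSearch(delayMs); // result dropped
  return elapsedMs(start);
}

// Fixed behavior: each call completes before the next one starts.
async function timeAwaited(n, delayMs) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < n; i++) await fakeSearch(delayMs);
  return elapsedMs(start);
}

async function main() {
  // Without await, the "result" is a Promise, so array fields are undefined.
  const pendingLength = fakeSearch(1).length;
  const dropped = timeDropped(5, 20);
  const awaited = await timeAwaited(5, 20);
  console.log({ pendingLength, dropped, awaited });
  return { pendingLength, dropped, awaited };
}

const demo = main();
```

With five calls of roughly 20 ms each, the awaited loop takes on the order of 100 ms, while the dropped loop finishes almost immediately, which is why the old benchmark rates were meaningless.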

Fix:

- Persist construction args (dimensions, metric, schema version)
  in `<dbPath>.meta.json` from `create`. `insert`/`search`/`stats`
  read the sidecar and pass them straight to the wrapper
  constructor — no more JSON-parsing of redb bytes.
- Drop calls to the phantom `db.load`/`db.save`/`db.stats` API.
  Persistence is automatic via `storagePath`; counting goes through
  `await db.len()`.
- Make every CLI handler `async` and `await` the wrapper calls.
  Includes `benchmark`, whose previously-dropped promises meant the
  reported insert/search rates were just spinner timing.
- Coerce numeric ids to strings inside `insert` (the native binding
  rejects integer ids).
- Surface a clear, actionable error when a DB exists without a
  sidecar (e.g. created by an older CLI), instead of an opaque
  parse failure.
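The id coercion in the list above can be sketched as a small normalization pass before the entries reach the binding. The input shape (`{ id, vector }`) and the helper name `normalizeEntries` are assumptions for illustration; the PR only states that numeric ids are coerced to strings inside `insert`.

```javascript
// Hypothetical input shape: an array of { id, vector } objects.
function normalizeEntries(entries) {
  return entries.map((entry) => ({
    ...entry,
    // Numeric ids become strings; string (or missing) ids pass through as-is.
    id: typeof entry.id === 'number' ? String(entry.id) : entry.id,
  }));
}

// Usage with made-up vectors.
const normalized = normalizeEntries([
  { id: 7, vector: [0.1, 0.2] },
  { id: 'doc-8', vector: [0.3, 0.4] },
]);
console.log(normalized.map((e) => e.id)); // prints [ '7', 'doc-8' ]
```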

Verified end-to-end with a new test on Node 22.22.2:

    $ node test/cli-fresh-db.test.mjs
      ok: `ruvector create` exits 0
      ok: redb file exists at dbPath
      ok: sidecar metadata file exists
      ok: sidecar.dimensions = 8
      ok: sidecar.metric = cosine
      ok: `ruvector insert` exits 0
      ok: insert does not crash JSON.parsing the redb binary
      ok: `ruvector search` exits 0
      ok: search prints `Found N results`
      ok: search renders at least one hit row
      ok: `ruvector stats` exits 0
      ok: stats prints Vector Count
      ok: stats fails fast on orphan DB without sidecar
      ok: orphan-DB error message mentions sidecar

    ruvector fresh-DB CLI smoke OK (issue #417)

Out of scope (deliberately): the `export`/`import` handlers also
called the same phantom API. Those need the wrapper to grow an
enumeration method (`db.entries()` or similar) before they can do
honest work — file-only metadata-export is misleading. Tracked in a
follow-up; the existing handlers are left untouched here.

The ONNX-bundle half of #417 ships in a separate PR (#434, which addresses #354).

Closes #417

Co-Authored-By: claude-flow <ruv@ruv.net>


Development

Successfully merging this pull request may close these issues.

CLI commands that take a <database> path are broken on freshly-created DBs (and embed text fails on missing ONNX bundle)
