fix(ruvector): bundle ONNX runtime into dist/ on build (#354) #434

Open
ruvnet wants to merge 1 commit into main from fix/issue-354-bundle-onnx-pkg

@ruvnet ruvnet commented May 7, 2026

Summary

Root cause

"build": "tsc && cp src/core/onnx/pkg/package.json dist/core/onnx/pkg/"

tsc only emits output compiled from .ts inputs (allowJs is off), so the non-TypeScript runtime assets under src/core/onnx/ (the wasm-bindgen .wasm payload, _bg.js, type defs, LICENSE, sibling loader.js) never reach dist/. The script copied a single package.json and called it a day.
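
To make the root cause concrete, here is a minimal sketch of a helper that lists the files `tsc` would leave behind in a source tree. The names `isEmittedByTsc` and `listUnemitted` are hypothetical and not part of this PR; the sketch assumes `allowJs` is off and treats declaration files as input-only, since `tsc` does not copy them to the output directory.

```javascript
// Hypothetical helper for illustration only; it is not part of this PR.
const fs = require('node:fs');
const path = require('node:path');

// True when tsc (with allowJs off) would produce output for this file name.
// Declaration files (.d.ts and friends) are inputs only and are not copied.
function isEmittedByTsc(fileName) {
  const isTypeScript = /\.[cm]?tsx?$/.test(fileName);
  const isDeclaration = /\.d\.[cm]?ts$/.test(fileName);
  return isTypeScript && !isDeclaration;
}

// Recursively list files under dir (relative, '/'-separated, sorted) that tsc
// would not emit, i.e. everything the build script has to copy itself.
function listUnemitted(dir, prefix = '') {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const rel = prefix ? `${prefix}/${entry.name}` : entry.name;
    if (entry.isDirectory()) {
      out.push(...listUnemitted(path.join(dir, entry.name), rel));
    } else if (!isEmittedByTsc(entry.name)) {
      out.push(rel);
    }
  }
  return out.sort();
}
```

Run against `src/core/onnx/`, a helper like this would be expected to report the .wasm payload, the JavaScript glue, the type definitions, and LICENSE, which is the set the old build script silently dropped.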

Fix

  • Replace the single-file copy with scripts/copy-onnx-assets.js, a Node-portable recursive copy (no cp -r, works on Windows).
  • Skip dotfiles (e.g. transient .claude-flow/ agent metadata) and node_modules/ so they don't leak into the published artifact.
  • Sanity-check that the canonical runtime files (*_bg.{js,wasm}, *.js, loader.js) landed where onnx-embedder.js looks for them; fail the build loudly if not.
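
The three steps above can be sketched roughly as follows. This is not the actual `scripts/copy-onnx-assets.js` from the PR, whose contents are not shown here; `copyTree` and `missingFiles` are hypothetical names, and the sketch assumes that skipping every entry whose name starts with a dot is an acceptable way to exclude transient metadata.

```javascript
// Hypothetical sketch; NOT the actual scripts/copy-onnx-assets.js from this PR.
const fs = require('node:fs');
const path = require('node:path');

// Recursively copy srcDir into destDir using only Node's fs API (no `cp -r`),
// skipping dotfiles, dot-directories, and node_modules/.
// Returns the number of files copied.
function copyTree(srcDir, destDir) {
  let copied = 0;
  fs.mkdirSync(destDir, { recursive: true });
  for (const entry of fs.readdirSync(srcDir, { withFileTypes: true })) {
    if (entry.name.startsWith('.') || entry.name === 'node_modules') continue;
    const from = path.join(srcDir, entry.name);
    const to = path.join(destDir, entry.name);
    if (entry.isDirectory()) {
      copied += copyTree(from, to);
    } else if (entry.isFile()) {
      fs.copyFileSync(from, to);
      copied += 1;
    }
  }
  return copied;
}

// Return the required relative paths that do not exist under destDir.
// A build script can fail loudly when this list is non-empty.
function missingFiles(destDir, required) {
  return required.filter((rel) => !fs.existsSync(path.join(destDir, rel)));
}
```

A build step could call `copyTree('src/core/onnx', 'dist/core/onnx')`, pass the list of files the embedder needs to `missingFiles`, and exit with a non-zero status if anything is reported missing.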

Proof

On a fresh build of the current 0.2.25 working tree (Node 22.22.2):

$ rm -rf dist/core/onnx && npm run build
> tsc && node scripts/copy-onnx-assets.js
copy-onnx-assets: 10 ONNX runtime file(s) staged under dist/.

$ ls dist/core/onnx/pkg/
LICENSE                                  ruvector_onnx_embeddings_wasm_bg.js
loader.js                                ruvector_onnx_embeddings_wasm_bg.wasm
package.json                             ruvector_onnx_embeddings_wasm_bg.wasm.d.ts
ruvector_onnx_embeddings_wasm.d.ts       ruvector_onnx_embeddings_wasm.js

$ npm pack && tar -tzf ruvector-0.2.25.tgz | grep -c 'dist/core/onnx/pkg'
8

# Clean install in /tmp:
$ node -e "const {isOnnxAvailable} = require('ruvector/dist/core/onnx-embedder'); console.log(isOnnxAvailable())"
true

Tarball grew from 825 KB → 3.1 MB; that's the 7.4 MB MiniLM WASM payload (uncompressed) compressing to ~2.3 MB on top of the 825 KB baseline — expected since this is the file the consumer was previously missing.

Test plan

  • npm run build lands all 10 ONNX runtime files under dist/core/onnx/.
  • npm pack includes them in the tarball (grep -c onnx/pkg = 8).
  • Clean install of the packed tarball makes isOnnxAvailable() return true (was false/throw before).
  • Build does not leak transient .claude-flow/ dotfiles into the artifact.
  • Script doesn't depend on cp (portable to Windows).
  • Optional follow-up: extend verify-dist.js to assert the ONNX runtime files exist (best done after #433, "fix(ruvector): verify-dist guards package.json entrypoints (#376)", lands to avoid merge conflicts).
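
The tarball check in the test plan could also be scripted instead of run by hand. The sketch below assumes the output of `tar -tzf` has already been captured as a string; `countMatches` and `missingFromListing` are hypothetical helpers, not something the PR or verify-dist.js provides. npm tarballs prefix every entry with `package/`, so the sketch matches on path suffixes rather than full paths.

```javascript
// Hypothetical helpers for checking a tarball listing; not part of this PR.

// Count listing lines that contain needle, mirroring `grep -c needle`.
function countMatches(listing, needle) {
  return listing
    .split(/\r?\n/)
    .filter((line) => line.includes(needle))
    .length;
}

// Return the expected file names for which no listing line ends with
// `${dir}/${name}`. An empty result means every expected file was packed.
function missingFromListing(listing, dir, names) {
  const lines = listing.split(/\r?\n/).map((line) => line.trim());
  return names.filter(
    (name) => !lines.some((line) => line.endsWith(`${dir}/${name}`))
  );
}
```

Checking names as well as the count guards against the case where the right number of files is packed but one of the required ones has been replaced by something else.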

Scope notes

🤖 Generated with claude-flow

The published tarball was missing every ONNX runtime file except a
1-line `package.json`, so `OptimizedOnnxEmbedder` (and any code path
that calls `initOnnxEmbedder()`) crashed on every clean install with:

    Error: ONNX WASM files not bundled. The onnx/ directory is missing.

Root cause is the build script:

    "build": "tsc && cp src/core/onnx/pkg/package.json dist/core/onnx/pkg/"

`tsc` only emits output compiled from `.ts` inputs (no `allowJs`). The
wasm-bindgen artifacts under `src/core/onnx/pkg/` (the .wasm payload,
_bg.js, type defs, LICENSE) and the sibling `src/core/onnx/loader.js`
are runtime assets that `tsc` does not emit, yet the script only copied
a single `package.json`. Everything else stayed in `src/` and never
made it into the tarball.

Fix:

- Replace the single-file copy with `scripts/copy-onnx-assets.js`, a
  Node-portable recursive copy (works on Windows; doesn't need cp).
- Skip dotfiles (e.g. transient `.claude-flow/` agent metadata) and
  `node_modules/` so they don't leak into the published artifact.
- Sanity-check that the canonical runtime files (`*_bg.{js,wasm}`,
  `*.js`, `loader.js`) landed where `onnx-embedder.js` looks for them;
  fail the build loudly if not.

Verified end-to-end against ruvector@0.2.25 on Node 22.22.2:

    $ rm -rf dist/core/onnx && npm run build
    > tsc && node scripts/copy-onnx-assets.js
    copy-onnx-assets: 10 ONNX runtime file(s) staged under dist/.

    $ ls dist/core/onnx/pkg/
    LICENSE                                  ruvector_onnx_embeddings_wasm_bg.js
    loader.js                                ruvector_onnx_embeddings_wasm_bg.wasm
    package.json                             ruvector_onnx_embeddings_wasm_bg.wasm.d.ts
    ruvector_onnx_embeddings_wasm.d.ts       ruvector_onnx_embeddings_wasm.js

    $ npm pack && tar -tzf ruvector-0.2.25.tgz | grep -c onnx/pkg
    8

    # Clean install into /tmp:
    $ node -e "const {isOnnxAvailable} = require('ruvector/dist/core/onnx-embedder'); console.log(isOnnxAvailable())"
    true

Closes #354

Co-Authored-By: claude-flow <ruv@ruv.net>
Development

Successfully merging this pull request may close these issues.

ONNX WASM files not bundled on clean install
