
feat(search): CJK tokenizer for BM25#362

Merged: rohitg00 merged 2 commits into main from feat/cjk-tokenizer-344 on May 13, 2026

Conversation

@rohitg00 rohitg00 commented May 13, 2026

Closes #344.

Why

src/state/search-index.ts tokenizes with /[^\p{L}\p{N}\s/.\\-_]/gu, which since v0.9.12 (#327) handles Greek, Cyrillic, Hebrew, Arabic, and accented Latin correctly because those scripts split on whitespace. CJK doesn't: Chinese / Japanese / Korean memories were tokenized as a single sentence-long token, so BM25 returned the whole memory or nothing on a one-word CJK query. Vector search still worked, but the hybrid score collapsed to vector-only for CJK users.

What

New src/state/cjk-segmenter.ts module. tokenize() checks each whitespace-split run with hasCjk() and routes to the right segmenter when needed; non-CJK runs go through the existing stemmer path unchanged.
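
A minimal sketch of the routing described above, not the merged code. hasCjk matches the name used in this PR; the segmentCjk and stem parameters are stand-ins supplied by the caller, and the real tokenize() may also normalize input or skip short tokens.

```typescript
// Matches any character from the four scripts named in the PR description.
// Built with the RegExp constructor so the pattern is not tied to a compile target.
const CJK_CHAR_RE = new RegExp(
  "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}]",
  "u",
);

function hasCjk(text: string): boolean {
  return CJK_CHAR_RE.test(text);
}

// segmentCjk and stem are injected here purely for illustration (assumption);
// in the PR they are module-level functions.
function tokenize(
  text: string,
  segmentCjk: (run: string) => string[],
  stem: (token: string) => string,
): string[] {
  const tokens: string[] = [];
  for (const run of text.split(/\s+/)) {
    if (run.length === 0) continue; // leading/trailing whitespace yields empty runs
    if (hasCjk(run)) {
      tokens.push(...segmentCjk(run)); // CJK run: emit each segment
    } else {
      tokens.push(stem(run)); // non-CJK run: existing stemmer path
    }
  }
  return tokens;
}
```

Only runs that contain CJK characters ever reach the segmenter, which is what keeps the Latin / Greek / Cyrillic paths unchanged.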

Three segmenter paths, dispatched by Unicode script over \p{Script=Han} / \p{Script=Hiragana} / \p{Script=Katakana} / \p{Script=Hangul}:

| Script | Segmenter | Notes |
| --- | --- | --- |
| Han (Chinese) | @node-rs/jieba | Native module, no model download. Jieba.withDict(dict) from the bundled @node-rs/jieba/dict, then .cut(text, hmm=true). |
| Kana + mixed kanji (Japanese) | tiny-segmenter | Pure JS, ~25KB. A run that contains any Hiragana / Katakana is treated as Japanese even if it also contains kanji, so mixed Japanese text like プロジェクト記憶 does not get mis-routed to jieba. |
| Hangul (Korean) | rule-based | No dep. Each [가-힯]+ syllable-block run is one token. |
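
The dispatch in the table can be sketched as follows. This is an illustration under stated assumptions, not the code in src/state/cjk-segmenter.ts: detectScript and segmentHangul are names taken from the review walkthrough, but their signatures, the CjkScript type, and the Hangul-before-Han check order are guesses. Only the kana-first rule and the [가-힯]+ pattern come from the description.

```typescript
type CjkScript = "han" | "kana" | "hangul" | "none"; // hypothetical type

const KANA_RE = new RegExp("[\\p{Script=Hiragana}\\p{Script=Katakana}]", "u");
const HANGUL_RE = new RegExp("\\p{Script=Hangul}", "u");
const HAN_RE = new RegExp("\\p{Script=Han}", "u");

function detectScript(run: string): CjkScript {
  // Kana is checked first: any kana marks the run as Japanese,
  // even when it also contains kanji (which are Script=Han).
  if (KANA_RE.test(run)) return "kana";
  if (HANGUL_RE.test(run)) return "hangul"; // order vs. Han is an assumption
  if (HAN_RE.test(run)) return "han";
  return "none";
}

// Rule-based Korean path: every maximal run of Hangul syllable blocks is one token.
function segmentHangul(text: string): string[] {
  return text.match(/[가-힯]+/g) ?? [];
}
```

With this ordering, プロジェクト記憶 is classified as "kana" and would go to tiny-segmenter, while 项目记忆存储 is classified as "han" and would go to jieba.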

Optional-deps soft-fail

@node-rs/jieba and tiny-segmenter are declared in optionalDependencies so they install by default but don't break a constrained install. The segmenter wraps require() (via createRequire(import.meta.url)) in try/catch — if a dep is missing, it returns the whole run as a single token (the pre-fix behavior) and emits a one-time stderr hint per language using a module-level Set<string>:

agentmemory: install @node-rs/jieba to improve Chinese search; falling back to whole-string tokenization
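
A hedged sketch of the soft-fail pattern described above. The function name loadOptional, its parameters, and the injected load / warn callbacks are hypothetical and exist to keep the example self-contained; in an ES module the loader would be the require function returned by createRequire(import.meta.url) from node:module. The hint text and the module-level Set<string> follow the description.

```typescript
type Loader = (pkg: string) => unknown;

// Module-level state: languages that have already produced a hint.
const warnedLanguages = new Set<string>();

// Returns the loaded module, or null when the optional dependency is missing.
function loadOptional(
  pkg: string,
  language: string,
  load: Loader,
  warn: (msg: string) => void = (msg) => console.error(msg), // console.error writes to stderr in Node
): unknown {
  try {
    return load(pkg);
  } catch {
    if (!warnedLanguages.has(language)) {
      warnedLanguages.add(language); // emit the hint once per language
      warn(
        `agentmemory: install ${pkg} to improve ${language} search; ` +
          "falling back to whole-string tokenization",
      );
    }
    return null;
  }
}

// Fallback: with no segmenter available the whole run stays a single token,
// which is the pre-fix behavior.
function segmentOrWhole(run: string, segment: ((s: string) => string[]) | null): string[] {
  return segment ? segment(run) : [run];
}
```

Keeping the warned set at module level, rather than per call, is what limits the hint to one line per language for the lifetime of the process.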

Fixtures added

In test/search-index.test.ts:

  • Chinese: index 项目记忆存储 then query 项目 -> non-zero BM25 score on the memory.
  • Japanese: index プロジェクト記憶 then query プロジェクト -> hit.
  • Korean: index 프로젝트 메모리 저장소 then query 메모리 -> hit.

Tests

Existing test count grew from 899 to 902 (Test Files 83 passed (83) | Tests 902 passed (902) for the BM25 + search-related path). Greek and ASCII tests stay green. The unrelated 10 failures in test/mcp-standalone.test.ts are pre-existing on main and predate this PR.

Out of scope

  • Thai / Lao / Khmer (different segmentation problem, separate issue).
  • Per-language stemming. BM25 here stays light-weight by design.

Summary by CodeRabbit

  • New Features

    • Improved tokenization and search support for Chinese, Japanese, and Korean text, preserving mixed CJK/non-CJK token order.
  • Documentation

    • Clarified CJK tokenization behavior, coverage for non‑Latin scripts, and guidance to install optional segmenters for better CJK results.
  • Chores

    • Added optional segmentation packages to enable improved CJK tokenization when available.
  • Tests

    • Added CJK-focused tests to verify indexing and search behavior.

Review Change Stack

BM25 previously tokenized Chinese, Japanese, and Korean memories as a
single sentence-long token because CJK scripts don't put spaces between
words, leaving the keyword leg of hybrid search as dead weight for CJK
users. Greek, Cyrillic, accented Latin already work since v0.9.12 (#327)
because they split cleanly on whitespace.

Detect CJK by Unicode script, then route per-run:
- Han: @node-rs/jieba (native, no model download), Jieba.cut(text, hmm)
- Kana (with mixed kanji): tiny-segmenter (pure JS, ~25KB)
- Hangul: rule-based, each [가-힯]+ run is one token

Both segmenters declared in optionalDependencies so the bundle stays
small for users who don't need them. If the dep isn't installed the
segmenter soft-falls to whole-run tokenization (the pre-fix behavior)
and prints a one-time stderr hint per language. Non-CJK input goes
through the existing tokenize path unchanged — zero behavior change for
the Latin/Greek/Cyrillic/Hebrew/Arabic paths.

New fixtures in test/search-index.test.ts cover Chinese (项目记忆存储),
Japanese (プロジェクト記憶), and Korean (프로젝트 메모리 저장소). Existing
Greek + ASCII tests stay green. Suite grew 899 -> 902 tests.

@coderabbitai

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
📥 Commits

Reviewing files that changed from the base of the PR and between dda6682 and 8044274.

📒 Files selected for processing (2)
  • src/state/cjk-segmenter.ts
  • test/search-index.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/search-index.test.ts
  • src/state/cjk-segmenter.ts

📝 Walkthrough

Adds CJK detection/segmentation and integrates it into SearchIndex.tokenize(), declares optional CJK segmenters in package.json, updates README, and adds tests covering Chinese, Japanese, and Korean indexing/search.

Changes

CJK Tokenization for BM25

| Layer / File(s) | Summary |
| --- | --- |
| CJK Segmentation Engine (src/state/cjk-segmenter.ts) | Unicode regexes detect Han, Hiragana/Katakana, and Hangul. Lazy-loads optional @node-rs/jieba and tiny-segmenter with one-time stderr hints. Implements hasCjk, detectScript, per-script segmentation (segmentHan, segmentKana, segmentHangul), central segmentCjk(), and __resetCjkSegmenterStateForTests(). |
| SearchIndex Tokenization Integration (src/state/search-index.ts) | Imports hasCjk/segmentCjk; rewrites tokenize() to skip short tokens, route CJK tokens through segmentCjk() (emit segments), otherwise apply existing stem(); indexing/search behavior unchanged otherwise. |
| CJK Language Search Tests (test/search-index.test.ts) | Adds tests that index Chinese, Japanese, and Korean observations and assert single-word CJK queries return the expected obsId with a positive BM25 score; adds a segmentCjk order-preservation test. |
| Dependencies and Documentation (package.json, README.md) | Adds @node-rs/jieba@^2.0.1 and tiny-segmenter@^0.2.0 to optionalDependencies. README note describing BM25 tokenization coverage for various scripts and guidance for installing optional CJK segmenters; documents fallback to whole-run tokenization with a one-time stderr hint. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • rohitg00/agentmemory#327: Modifies src/state/search-index.ts tokenization pipeline to use Unicode-aware normalization, overlapping with this PR's tokenization changes.

Poem

🐰 I nibble at code by moonlight,
Han, Kana, Hangul split just right,
Jieba hums and tiny cuts through,
Tokens now hop in ordered queue,
Search finds memories in new light.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely identifies the main feature addition: CJK tokenizer support for BM25 search, which is the primary focus of all code changes. |
| Linked Issues check | ✅ Passed | The PR fully implements all coding requirements from issue #344: CJK script detection, per-script segmenter routing (Jieba for Han, tiny-segmenter for Kana, rule-based for Hangul), optional dependencies, soft-fail fallback with stderr hints, and test coverage for Chinese/Japanese/Korean queries. |
| Out of Scope Changes check | ✅ Passed | All code changes align with issue #344 scope: CJK segmentation implementation, optional dependency additions, README documentation, and test fixtures, with Thai/Lao/Khmer and per-language stemming correctly excluded. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/state/cjk-segmenter.ts`:
- Around line 130-154: segmentCjk currently processes HAN/KANA runs, HANGUL
runs, then non-CJK pieces in separate loops which reorders tokens; change it to
a single left-to-right pass so tokens remain in original order: iterate over the
string using a single combined matcher (e.g., matchAll on CJK_RUN_RE or build a
regex that matches HAN_KANA_RUN_RE and HANGUL_RUN_RE alternately), track match
start/end indices, for each match push any intervening non-CJK substring
(trimmed) then segment the matched run in place by calling segmentKana,
segmentHan or segmentHangul based on which regex matched (use
HAN_KANA_RUN_RE.test and HANGUL_RUN_RE.test on the run) and push those tokens,
and after the loop push any trailing non-CJK piece; update segmentCjk to use
this approach so token order is preserved.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 96c0ed0 and dda6682.

📒 Files selected for processing (5)
  • README.md
  • package.json
  • src/state/cjk-segmenter.ts
  • src/state/search-index.ts
  • test/search-index.test.ts

Comment thread src/state/cjk-segmenter.ts
The previous segmentCjk walked the input three times — once for Han/kana
runs, once for Hangul runs, once for non-CJK pieces — and concatenated
the results. For a mixed string like "abc 메모리 def 项目 ghi" that
emitted ["메모리","项目","abc","def","ghi"], dropping the original
positional information that BM25 phrase queries and any downstream
consumer keyed on token order rely on.

Replace with a single left-to-right pass over CJK_RUN_RE: emit any
intervening non-CJK substring (trimmed), then route the matched CJK run
to segmentHangul / segmentKana / segmentHan based on which script it
contains, and finally emit any trailing non-CJK piece.

Added a regression test asserting source order across four mixed-script
strings. Drops the now-unused HAN_KANA_RUN_RE / HANGUL_RUN_RE regex
constants.
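
The single left-to-right pass can be sketched as below. This is an illustration, not the merged implementation: CJK_RUN_RE and segmentCjk are names from the commit message, but the regex body, the pushNonCjk helper, and the segmentRun callback (standing in for the segmentHangul / segmentKana / segmentHan routing) are assumptions.

```typescript
// One maximal run of Han / Hiragana / Katakana / Hangul characters.
const CJK_RUN_RE = new RegExp(
  "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}]+",
  "gu",
);

// Hypothetical helper: emit a non-CJK piece only if something is left after trimming.
function pushNonCjk(out: string[], piece: string): void {
  const trimmed = piece.trim();
  if (trimmed.length > 0) out.push(trimmed);
}

function segmentCjk(input: string, segmentRun: (run: string) => string[]): string[] {
  const out: string[] = [];
  let cursor = 0;
  CJK_RUN_RE.lastIndex = 0; // a global regex keeps its position between calls
  let match: RegExpExecArray | null;
  while ((match = CJK_RUN_RE.exec(input)) !== null) {
    pushNonCjk(out, input.slice(cursor, match.index)); // text before this run
    out.push(...segmentRun(match[0])); // the run itself, segmented in place
    cursor = match.index + match[0].length;
  }
  pushNonCjk(out, input.slice(cursor)); // trailing non-CJK piece
  return out;
}

// Usage with a whole-run stand-in segmenter:
const ordered = segmentCjk("abc 메모리 def 项目 ghi", (run) => [run]);
// ordered is ["abc", "메모리", "def", "项目", "ghi"]
```

Because each token is pushed at the moment its source span is reached, output order follows source order without any sorting step.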
@rohitg00 rohitg00 merged commit 8c3418c into main May 13, 2026
5 checks passed
@rohitg00 rohitg00 deleted the feat/cjk-tokenizer-344 branch May 13, 2026 21:20
rohitg00 added a commit that referenced this pull request May 13, 2026
Templates previously did `FROM rohitghumare64/agentmemory:latest`, an
image that does not exist on Docker Hub. agentmemory ships via npm
(`@agentmemory/agentmemory`) and runs against the `iii` engine binary
fetched from the `iii-hq/iii` GitHub release. There is no first-party
agentmemory Docker image yet.

Rewrote all three Dockerfiles to be self-contained against
`node:22-slim`:

- `apt-get` curl / openssl / ca-certificates / tini for the runtime
  prereqs (openssl for the first-boot HMAC, tini for clean PID 1).
- Download the iii binary from
  `github.com/iii-hq/iii/releases/download/iii/v${III_VERSION}/...`
  matching the build host's arch (uname -m).
- `npm install -g @agentmemory/agentmemory@${AGENTMEMORY_VERSION}` with
  `--omit=optional` so the CJK native deps (added in PR #362) don't
  fail the build on platforms without musl/glibc prebuilds.
- `mkdir -p /data && chown node:node /data` so the existing
  first-boot entrypoint can write `/data/.hmac` without root.
- `USER node:node` (UID 1000, the standard non-root user in the node
  image) instead of the made-up `65532:65532` that assumed a
  distroless base.

Both `AGENTMEMORY_VERSION` and `III_VERSION` are `ARG`s so platform
operators can pin a specific release through their dashboard's
build-args UI without editing the Dockerfile.

README updates:

- Top-level `README.md` and `deploy/README.md` no longer claim users
  pull a pre-published image; the wording now matches what each
  template actually does (build from source on the platform's builder).
- Per-platform "Known caveats" sections drop the
  `rohitghumare64/agentmemory:latest` line and replace it with notes
  about build-time, cache reuse, and how to pin via build args.
- `deploy/render/README.md` drops the obsolete `&imgURL=...` Deploy
  Hook query-string trick and replaces it with
  `AGENTMEMORY_VERSION` build-arg guidance.
- `deploy/README.md` switches "Integration clients (Hermes, OpenClaw,
  pi)" → "Integration plugins" to match the scrub convention from the
  earlier PR-body edit.
rohitg00 added a commit that referenced this pull request May 14, 2026
* feat(deploy): add fly.io / Railway / Render one-click templates

Closes #343.

Lets operators stand up agentmemory on managed infrastructure without
rolling their own Docker host. Each template extends the published
rohitghumare64/agentmemory:latest image, mounts persistent storage at
/data, generates the HMAC secret on first boot via openssl rand into
/data/.hmac (chmod 600, printed once to stdout), and ships
AGENTMEMORY_REQUIRE_HTTPS=1 by default so the v0.9.12 plaintext-bearer
guard refuses to leak tokens over non-loopback HTTP if a TLS upstream
gets misconfigured.

Only port 3111 (REST API) is exposed publicly. The viewer on 3113
stays bound to localhost inside the container; every README documents
the SSH-tunnel pattern. The main README gains a Deploy section with
three Deploy-to-platform shield buttons linking to the platform's
URL-encoded one-click flow, plus the deploy/ subtree with per-platform
docs covering HMAC capture, rotation, /data backup, viewer access,
cost floor, and known caveats.

* fix(deploy): run as non-root, placeholder app names, drop invalid Render badge

Addresses CodeRabbit findings on PR #361.

Dockerfiles (fly/railway/render): drop `USER root` + RUN chmod — those
left the container running as root after build. Replaced with
`COPY --chmod=0755 --chown=65532:65532` so the entrypoint script lands
with the correct permissions and owner without ever switching off the
distroless image's default nonroot user. Added an explicit
`USER 65532:65532` directive after COPY to make the runtime user
intent-visible to reviewers.

fly/README.md: every command referenced the hardcoded app name
`agentmemory`. Since the global `agentmemory` slug on Fly is almost
certainly taken, the first `fly launch --name agentmemory` fails and
the rest of the volume/log commands point at a non-existent app. Walk
users through setting `APP="agentmemory-$(whoami)"` once at the top and
reference `$APP` everywhere (launch / volumes / logs / proxy / ssh /
backup) — and derive the volume name (`agentmemory_data` style) from
`$APP` so it stays consistent.

railway/README.md: the previous `railway ssh --service agentmemory --
-L 3113:localhost:3113` snippet was invalid — Railway's CLI ssh does
not support port forwarding. Replaced with a quick in-container curl
check plus two valid browser paths: Railway TCP Proxy (the recommended
option, dashboard → Networking → TCP Proxy on container port 3113) and
an in-container sshd over a TCP Proxy if true SSH tunneling is needed.

README.md: removed the "Deploy to Render" badge. Render's deploy URL
requires `render.yaml` at the repository root, which we keep clean.
The `deploy/render/` blueprint stays in-tree for users who set up the
Render Blueprint flow manually — the deploy section now points readers
there explicitly.

Skipped: the entrypoint.sh SECRET-shape regex check. `openssl rand
-hex 32` under `set -eu` already exits non-zero on any failure path
and always emits exactly 64 hex characters when it succeeds — a
post-hoc regex would only fire on a kernel-level surprise we can't
recover from anyway.

* fix(deploy): self-contained Dockerfiles (no phantom base image)

Templates previously did `FROM rohitghumare64/agentmemory:latest`, an
image that does not exist on Docker Hub. agentmemory ships via npm
(`@agentmemory/agentmemory`) and runs against the `iii` engine binary
fetched from the `iii-hq/iii` GitHub release. There is no first-party
agentmemory Docker image yet.

Rewrote all three Dockerfiles to be self-contained against
`node:22-slim`:

- `apt-get` curl / openssl / ca-certificates / tini for the runtime
  prereqs (openssl for the first-boot HMAC, tini for clean PID 1).
- Download the iii binary from
  `github.com/iii-hq/iii/releases/download/iii/v${III_VERSION}/...`
  matching the build host's arch (uname -m).
- `npm install -g @agentmemory/agentmemory@${AGENTMEMORY_VERSION}` with
  `--omit=optional` so the CJK native deps (added in PR #362) don't
  fail the build on platforms without musl/glibc prebuilds.
- `mkdir -p /data && chown node:node /data` so the existing
  first-boot entrypoint can write `/data/.hmac` without root.
- `USER node:node` (UID 1000, the standard non-root user in the node
  image) instead of the made-up `65532:65532` that assumed a
  distroless base.

Both `AGENTMEMORY_VERSION` and `III_VERSION` are `ARG`s so platform
operators can pin a specific release through their dashboard's
build-args UI without editing the Dockerfile.

README updates:

- Top-level `README.md` and `deploy/README.md` no longer claim users
  pull a pre-published image; the wording now matches what each
  template actually does (build from source on the platform's builder).
- Per-platform "Known caveats" sections drop the
  `rohitghumare64/agentmemory:latest` line and replace it with notes
  about build-time, cache reuse, and how to pin via build args.
- `deploy/render/README.md` drops the obsolete `&imgURL=...` Deploy
  Hook query-string trick and replaces it with
  `AGENTMEMORY_VERSION` build-arg guidance.
- `deploy/README.md` switches "Integration clients (Hermes, OpenClaw,
  pi)" → "Integration plugins" to match the scrub convention from the
  earlier PR-body edit.

* feat(deploy): add Coolify self-hosted template

Adds deploy/coolify/ alongside the existing fly / Railway / Render
templates. Coolify (https://coolify.io/self-hosted) is the
self-hosted-on-your-own-VPS option for operators who don't want their
memories on a third-party managed plane.

Same self-contained Dockerfile pattern as the other three (node:22-slim
+ iii binary + npm package, USER node, build-args for version pinning)
plus a docker-compose.yml that Coolify's "Docker Compose" build pack
consumes directly. Compose layer adds a HEALTHCHECK, a named volume
for /data, and json-file log rotation matching the rest of the deploy
surface.

README walks the operator through:

- Coolify dashboard flow (New Application -> Public Repository ->
  Docker Compose build pack -> Base Directory: deploy/coolify).
- Capturing the first-boot HMAC secret from the Logs tab.
- Viewer access via SSH tunnel from the Coolify host (port 3113 stays
  internal by default) and the optional second-domain + basic-auth
  pattern for sharing it.
- HMAC rotation, /data backup paths (Restic / Borg / rsync / Coolify
  Backups), and VPS cost-floor pointers (Hetzner CX22, DO Basic
  Droplet, Vultr).

Top-level deploy/README.md and README.md updated to list the new
template and explain when to pick it (already running a VPS, want a
self-hosted control plane, don't want a third-party host holding your
memories).

* fix(deploy): COPY iii from iiidev/iii image, drop tarball download

Replaces the curl + tar + chmod sequence in all four Dockerfiles
(fly / railway / render / coolify) with a single multi-stage
COPY --from=iiidev/iii:${III_VERSION} /app/iii /usr/local/bin/iii.

Why:

- Drops the supply-chain risk CodeRabbit flagged on PR #361
  (curl-and-extract without checksum verification). iiidev/iii is the
  same image agentmemory's docker-compose.yml already pulls — using
  Docker Hub's content-addressed manifest as the integrity boundary
  is the same trust we already extend on the local Compose path.
- Drops the arch case statement (BuildKit picks the right manifest
  entry automatically from the multi-arch tag).
- Drops 3 of the 4 apt-get packages from the bookkeeping side; curl
  stays for the eventual HEALTHCHECK probe.

The iii binary lives at /app/iii inside the distroless iiidev/iii
image (per github.com/iii-hq/iii engine/Dockerfile), and COPY into
node:22-slim lands it at /usr/local/bin/iii with 755 permissions so
any USER can exec it.

ARG III_VERSION is now declared before the first FROM so it can be
interpolated into the iii-image stage tag. ARG AGENTMEMORY_VERSION
stays in the final stage where the npm install consumes it.

* fix(deploy): rewire iii config + /data perms + platform port binding

Audit against fly.io / Railway / Render / Coolify docs and the
agentmemory CLI source surfaced four bugs that would prevent every
template from actually serving traffic on a managed host. This commit
addresses all four together because they share the same fix surface
(entrypoint + Dockerfile + platform config) and shipping them
independently would leave intermediate states broken.

1. iii config bound 127.0.0.1 + used relative ./data paths.

   The npm-bundled @agentmemory/agentmemory/dist/iii-config.yaml hard-
   codes `host: 127.0.0.1` on iii-http and `file_path: ./data/...` on
   iii-state / iii-stream. agentmemory CLI's findIiiConfig() returns
   the first existing candidate from a 3-item list with the npm bundle
   at position 0, so a sibling iii-config.yaml in cwd does not
   override. On any managed platform the result is the REST API
   binding loopback (router can't reach) and state writing to a
   directory that's not the persistent mount.

   Fix: the entrypoint overwrites the bundled file at boot with a
   container-tuned config (host 0.0.0.0, /data/... absolute paths,
   no iii-exec because the CLI orchestrates the worker over WS).
   This needs the entrypoint to run as root, hence the Dockerfile
   change below.

2. /data volume permissions root:root vs USER node.

   Every managed platform mounts persistent volumes root-owned 0755.
   The earlier Dockerfile ran `RUN mkdir -p /data && chown -R node`
   at build time, but the volume overlay at runtime replaces /data
   with the platform's empty root-owned mountpoint, leaving the
   `node` USER unable to write state_store.db / .hmac.

   Fix: stay USER root through the image, add `gosu` to apt, do the
   chown in entrypoint.sh against the *runtime* /data, then drop to
   the unprivileged `node` user via `exec gosu node:node agentmemory`
   before exec'ing the CLI. Symmetric to the iii-init busybox sidecar
   pattern in the repo-root docker-compose.yml.

3. Render's PORT injection broke the published port.

   Render auto-sets PORT (default 10000) and routes the public proxy
   to that port; the container then must bind 0.0.0.0:$PORT. Our
   container always binds 3111 (Dockerfile EXPOSE + iii-http config),
   so Render's proxy was forwarding to 10000 and getting connection
   refused.

   Fix: render.yaml now sets envVars PORT=3111 to override Render's
   default, plus exposes AGENTMEMORY_VERSION / III_VERSION as envVars
   (Render translates envVars into docker build args automatically).
   Also drops the AGENTMEMORY_REQUIRE_HTTPS / AGENTMEMORY_HMAC_FILE /
   AGENTMEMORY_DATA_DIR entries — server-side code never reads any of
   them; entrypoint reads HMAC_FILE / DATA_DIR via ${VAR:-default}
   so defaults already apply.

4. Coolify compose `ports:` bypassed Traefik.

   Per Coolify docs, `ports: ["3111:3111"]` binds the port directly on
   the host outside of the proxy network. The deploy would have been
   reachable on `http://<host>:3111` with no TLS termination and no
   domain routing.

   Fix: replaced `ports:` with `expose:` so the port is only
   reachable on the internal proxy network. Operator now sets the
   service domain in the Coolify UI as `https://<fqdn>:3111` (the
   `:3111` is Coolify's proxy hint, not a public port). Also added
   `SERVICE_FQDN_AGENTMEMORY_3111` to the environment so the
   container can know its public URL via Coolify's magic-env
   convention.

Other minor fixes:

- railway.json adds `requiredMountPath: /data` so Railway fails fast
  if the operator forgets to attach a volume.
- fly.toml drops the dead-letter [env] block (server code does not
  read AGENTMEMORY_REQUIRE_HTTPS / AGENTMEMORY_HMAC_FILE /
  AGENTMEMORY_DATA_DIR; entrypoint defaults handle the latter two).
- All 4 entrypoints unified — same script, same heredoc, copy with
  `cp -p` at build time.
- READMEs updated to reflect the new bound port story, the manual
  Render Blueprint flow (Render's deploy-button auto-detection only
  scans repo-root render.yaml), the Coolify proxy / domain pattern,
  and the gosu privilege drop.

Best-practice audit results (fly.toml, railway.json, render.yaml,
docker-compose.yml) verified against current platform docs — fly.toml
schema is clean, railway.json gets the requiredMountPath addition,
render.yaml + coolify compose get the rewires above.

* fix(deploy): pin iii-sdk to 0.11.2 via npm overrides

agentmemory@0.9.12 declares `iii-sdk: ^0.11.2`, which `npm install`
caret-resolves to **0.11.6** as of writing. That bumps the worker
SDK ahead of the engine the agentmemory repo pins (iii v0.11.2).
The two versions are wire-compatible in the calls agentmemory uses
today, but the policy intent in the repo is single-version lockstep
until the v0.11.6 sandbox-everything model is wired through — running
SDK 0.11.6 against engine 0.11.2 in production silently re-introduces
the very drift the AGENTMEMORY_III_VERSION pin exists to prevent.

`npm install -g` ignores the `overrides` field, so we cannot fix this
with the existing global install. Switch each Dockerfile to a local
install under /opt/agentmemory with a one-line package.json carrying:

    overrides: { iii-sdk: 0.11.2 }

then symlink the produced `node_modules/.bin/agentmemory` into
`/usr/local/bin/agentmemory` so the entrypoint invocation stays
identical. Verified locally that this resolves
`node_modules/iii-sdk/package.json` to version 0.11.2 (was 0.11.6
without the override).

Side effects:
- Entrypoint III_CONFIG path moves from
  `/usr/local/lib/node_modules/@agentmemory/agentmemory/dist/iii-config.yaml`
  (global) to
  `/opt/agentmemory/node_modules/@agentmemory/agentmemory/dist/iii-config.yaml`
  (local). agentmemory CLI's findIiiConfig() resolves __dirname
  through the symlink to the local install so candidate-0 lookup
  still hits the file we overwrite.
- New ARG `III_SDK_VERSION=0.11.2` declared in every Dockerfile,
  surfaced as a `build.args` field in coolify's compose and as an
  envVar in render.yaml so operators can pin a different SDK version
  if they manually migrate to engine 0.11.6+ down the line.
- AGENTMEMORY_III_VERSION env now baked into the image (matches the
  engine pin) so the CLI's iii-binary download fallback path also
  resolves to the right version if PATH lookup ever misses.

* fix(deploy): set TINI_SUBREAPER=1 so tini reaps zombies on Fly

End-to-end deploy to fly.io surfaced the warning:

    [WARN  tini (645)] Tini is not running as PID 1 and isn't
    registered as a child subreaper. Zombie processes will not be
    re-parented to Tini, so zombie reaping won't work.

Fly's init system holds PID 1, so tini lands as a child process and
the agentmemory CLI (which spawns iii detached) ends up with no
process reaping its eventual zombies. Setting TINI_SUBREAPER=1
triggers prctl(PR_SET_CHILD_SUBREAPER, 1) so tini still reaps
descendant zombies even when it isn't PID 1. Same fix applies to
Railway and Render (which also wrap our entrypoint under their own
init) and to Coolify when the operator runs the container under any
PID-1-grabbing supervisor like systemd-nspawn.

Verified end-to-end against fly.io:

- App agentmemory-ghumare64 deployed via remote builder
- 114 MB image (iii engine + node:22-slim + npm packages)
- 1 GB volume in iad provisioned and mounted
- First-boot HMAC generated and persisted to /data/.hmac
- iii-engine + agentmemory worker came up clean
- Worker registered with iii engine over ws://localhost:49134
- /agentmemory/livez returned 200 in 220 ms
- Bearer auth gating verified: 401 without secret, 200/201 with it
- POST /agentmemory/observe ingested a synthetic observation
- POST /agentmemory/smart-search returned a valid response envelope

Only outstanding log noise after this commit is a Node punycode
deprecation warning from a transitive dep — not actionable from this
side.

* fix(deploy): backfill TINI_SUBREAPER=1 on railway + render Dockerfiles

Previous commit missed these two — same reasoning applies (Railway and
Render both wrap our entrypoint under their own platform init, so tini
lands as a non-PID-1 child and zombie reaping needs the prctl
subreaper bit set).

* fix(deploy): apply live-fly-deploy findings across all templates

Real end-to-end deploy on fly.io surfaced four refinements that
generalise across the other three templates. This commit ports them.

1. Health-check grace_period: 10 s -> 30 s.

   Measured cold start on a fresh fly.io machine in `iad`:

       machine image prepared :  5.1 s
       volume mount + format  :  2.5 s
       firecracker boot       :  1.0 s
       entrypoint + chown     :  0.5 s
       iii-engine ready       :  3.0 s
       agentmemory worker reg :  2.0 s
       --------------------------------
       healthcheck passes     : ~9-10 s

   `grace_period = "10s"` (fly.toml) and `start_period: 20s` (coolify
   compose + Dockerfile HEALTHCHECK) were both within the noise of the
   measured time. Bumped both to 30 s — gives a 3x safety margin
   without affecting normal-operation health-check cadence.

   Railway and Render manage their own initial-deploy grace windows
   server-side, so railway.json's `healthcheckTimeout: 30` (per-attempt)
   and render.yaml's `healthCheckPath` stay as-is.

2. Documented LLM + embedding provider env-var menu in deploy/README.md.

   Live deploy logs surface a clear warning:

       [agentmemory] No LLM provider key found
       (ANTHROPIC_API_KEY, GEMINI_API_KEY, OPENROUTER_API_KEY,
        MINIMAX_API_KEY). LLM-backed compression and summarization are
       DISABLED -- using no-op provider.

   The deploy READMEs never told operators how to opt in. Added a
   table covering ANTHROPIC / GEMINI / OPENROUTER for LLM,
   OPENAI / VOYAGE for embeddings, plus AGENTMEMORY_AUTO_COMPRESS and
   AGENTMEMORY_INJECT_CONTEXT for the corresponding behaviour flags.
   Each platform's variable-injection surface (flyctl secrets,
   dashboard Environment tab) is named in the same table.

3. Captured cold-start budget in deploy/README.md.

   Same step-by-step measured timing above is now in the shared
   README so operators have a concrete expectation rather than
   guessing why first-byte after `fly deploy` takes ten seconds. Notes
   that every template's start window is sized for 3x of this.

4. fly README: documented shared-IPv4 + dedicated-IPv6 default.

   Fly assigns one of each by default at no charge. Legacy clients
   without SNI that need a dedicated IPv4 can buy one for $2/month
   via `fly ips allocate-v4`. Added the build-time stats observed in
   the real deploy (~30 s first build, ~10 s cached rebuild, 114 MB
   image) to the Known Caveats section so operators sizing
   build-minute budgets have real numbers.

No code paths changed -- only TOML / YAML / Markdown surface tweaks
and one HEALTHCHECK `start_period` bump in the coolify Dockerfile.

Development

Successfully merging this pull request may close these issues.

CJK tokenizer for BM25 (Chinese / Japanese / Korean segmentation)
