feat: implement EMBEDDING_DIM=0 auto-infer and add dim mismatch detection#53

Open
allengaoo wants to merge 4 commits into matrixorigin:main from allengaoo:feat/embedding-dim-auto-infer

Conversation

@allengaoo
Problem

The documentation has long promised that EMBEDDING_DIM=0 auto-infers the embedding dimension from the configured service — but this was never implemented. The actual behavior was:

  • Config::from_env() defaulted to 1024, silently ignoring the documented 0 = auto semantics
  • Setting EMBEDDING_DIM=0 explicitly would have produced a vecf32(0) SQL error on first startup
  • When switching embedding models (e.g. from BAAI/bge-m3 @ 1024d to nomic-embed-text @ 768d), Memoria would start without error but silently fail on every INSERT because the existing schema dimension didn't match

Changes

memoria-service/src/config.rs

  • Default embedding_dim changed from 1024 to 0 (auto-infer, as documented)
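
For illustration, the new default could be resolved roughly as below. This is a minimal sketch, not the actual body of Config::from_env(): `resolve_embedding_dim` is a hypothetical helper, and treating an unparsable value as 0 is an assumption rather than something this PR specifies.

```rust
/// Hypothetical helper (not part of the PR): map the raw EMBEDDING_DIM value
/// to a dimension. A missing value yields 0, which now means "auto-infer".
/// Assumption: an unparsable value also falls back to 0.
fn resolve_embedding_dim(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .unwrap_or(0)
}

/// Reads the real environment variable and applies the rule above.
fn embedding_dim_from_env() -> usize {
    resolve_embedding_dim(std::env::var("EMBEDDING_DIM").ok().as_deref())
}
```

Passing the raw value in as an `Option<&str>` keeps the rule testable without mutating the process environment.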

memoria-cli/src/main.rs

  • New probe_embedding_dim() async helper: builds a temporary HttpEmbedder, calls embed("dimension probe"), and returns vec.len() as the actual dimension
  • Both cmd_serve and cmd_mcp (embedded mode) call the probe when cfg.embedding_dim == 0 && cfg.has_embedding()
  • cfg.embedding_dim is updated before SqlMemoryStore::connect() so the schema is created with the correct dimension
  • Clear, actionable error message when the probe fails (e.g. embedding service unreachable at boot)
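
The probe flow described above can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the real helper is async and uses HttpEmbedder, whose signature is not shown here, so the `Embedder` trait, the `Config` subset, `resolve_dim`, and the `FixedDim` / `Unreachable` test doubles are all hypothetical stand-ins.

```rust
/// Stand-in for the embedder interface (assumption: the real HttpEmbedder
/// is async and returns its own error type).
trait Embedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

/// Hypothetical subset of the service config.
struct Config {
    embedding_dim: usize,
    has_embedding: bool,
}

/// Embed a fixed probe string and report the vector length as the dimension.
fn probe_embedding_dim(embedder: &dyn Embedder) -> Result<usize, String> {
    let vec = embedder.embed("dimension probe").map_err(|e| {
        format!(
            "embedding dimension probe failed: {e}; \
             set EMBEDDING_DIM explicitly if the service is unavailable at boot"
        )
    })?;
    if vec.is_empty() {
        return Err("embedding service returned an empty vector".to_string());
    }
    Ok(vec.len())
}

/// Probe only when the dimension is 0 (auto) and an embedder is configured;
/// an explicitly pinned dimension is left untouched.
fn resolve_dim(cfg: &mut Config, embedder: &dyn Embedder) -> Result<(), String> {
    if cfg.embedding_dim == 0 && cfg.has_embedding {
        cfg.embedding_dim = probe_embedding_dim(embedder)?;
    }
    Ok(())
}

/// Test double that returns a vector of a fixed length.
struct FixedDim(usize);
impl Embedder for FixedDim {
    fn embed(&self, _text: &str) -> Result<Vec<f32>, String> {
        Ok(vec![0.0; self.0])
    }
}

/// Test double that simulates an unreachable embedding service.
struct Unreachable;
impl Embedder for Unreachable {
    fn embed(&self, _text: &str) -> Result<Vec<f32>, String> {
        Err("connection refused".to_string())
    }
}
```

In this sketch a pinned dimension never triggers a network call, so a deployment that sets EMBEDDING_DIM explicitly can still boot while the embedding service is down.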

memoria-storage/src/store.rs

  • New check_embedding_dim_compat() method: queries information_schema.columns for the actual vecf32(N) type of the embedding column and compares it against self.embedding_dim
  • Returns a descriptive MemoriaError::Internal when a mismatch is detected — fast-fails at startup instead of silently producing wrong results on every write
  • Called after migrate() in both cmd_serve and cmd_mcp
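
The comparison step can be illustrated in isolation. The sketch below assumes the column type comes back as a string of the form `vecf32(N)`; the exact text MatrixOne reports through information_schema.columns is not shown in this PR, and `parse_vecf32_dim` and `check_dim_compat` are hypothetical names, not the method added here.

```rust
/// Extract N from a column type string such as "vecf32(768)".
/// Returns None for any other type text.
fn parse_vecf32_dim(column_type: &str) -> Option<usize> {
    let t = column_type.trim().to_ascii_lowercase();
    let inner = t.strip_prefix("vecf32(")?.strip_suffix(')')?;
    inner.trim().parse().ok()
}

/// Compare the schema dimension with the configured one and fail fast
/// with a descriptive message when they differ.
fn check_dim_compat(column_type: &str, configured: usize) -> Result<(), String> {
    match parse_vecf32_dim(column_type) {
        Some(actual) if actual == configured => Ok(()),
        Some(actual) => Err(format!(
            "embedding dimension mismatch: schema has vecf32({actual}) \
             but the configured dimension is {configured}"
        )),
        None => Err(format!(
            "unrecognised embedding column type: {column_type}"
        )),
    }
}
```

For example, `check_dim_compat("vecf32(1024)", 768)` returns an error naming both dimensions, which is the situation that arises after switching from a 1024d model to a 768d one.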

README.md + skills/deployment/SKILL.md

  • Updated EMBEDDING_DIM description to reflect that 0 now actually performs the probe
  • Removed the "CRITICAL: configure embedding BEFORE first start" warning that was masking the missing feature

User impact

Before this PR, Ollama users (e.g. nomic-embed-text, 768d) had to:

  1. Know the exact dimension of their model
  2. Set EMBEDDING_DIM=768 manually
  3. If they forgot, face a confusing SQL error or silent write failures

After this PR:

  • Leave EMBEDDING_DIM unset (or =0) → Memoria probes on startup, schema is created correctly
  • Switching models: Memoria now reports a clear mismatch error at startup instead of silently failing on every write

Testing

Verified manually against a local MatrixOne instance with:

  • nomic-embed-text via Ollama (http://localhost:11434/v1, 768d auto-inferred)
  • Mismatch detection: restarting with a different dim produces a clear error message

CI build will verify compilation across the full workspace.

…tion

The documentation has long promised that EMBEDDING_DIM=0 would
auto-infer the embedding dimension from the configured service, but
this was never actually implemented — the default silently fell back
to 1024, and EMBEDDING_DIM=0 would have caused a vecf32(0) SQL error.

This commit delivers the promised behaviour:

**Auto-infer (EMBEDDING_DIM=0, new default)**
- `probe_embedding_dim()` builds a temporary HttpEmbedder, calls
  `embed("dimension probe")`, and returns `vec.len()` as the actual
  dimension before any database schema is created or validated.
- Both `cmd_serve` and `cmd_mcp` (embedded mode) call the probe when
  `cfg.embedding_dim == 0 && cfg.has_embedding()`.
- `cfg.embedding_dim` is updated in-place so `build_embedder()` and
  subsequent code all see the correct value.
- Clear error message when the probe fails, with explicit guidance to
  set EMBEDDING_DIM if the embedding service is unavailable at boot.

**Dimension mismatch detection**
- `SqlMemoryStore::check_embedding_dim_compat()` queries
  `information_schema.columns` for the actual `vecf32(N)` type of the
  `embedding` column and compares it against `self.embedding_dim`.
- Called after `migrate()` in both `cmd_serve` and `cmd_mcp`.
- Returns a descriptive error (rather than silently failing on the
  first INSERT) when the schema dimension differs from the config.

**Config default change**
- `Config::from_env()` now defaults `embedding_dim` to `0` instead of
  `1024`, making auto-infer the out-of-the-box experience.

**Docs**
- README and skills/deployment/SKILL.md updated to reflect that
  EMBEDDING_DIM=0 now actually auto-infers the dimension.

Fixes: EMBEDDING_DIM=0 documented but not implemented.
Closes: silent dimension-mismatch failures when switching embedding models.

XuPeng-SH requested a review from aptend on March 19, 2026, 07:08
@aptend
Contributor

aptend commented Mar 19, 2026

Hi Teacher Gao @allengaoo, run cargo check & clippy please ٩(•̤̀ᵕ•̤́๑)ᵒᵏᵎᵎᵎᵎ

Clippy lint `match_result_ok` requires replacing:
  if let Some(x) = expr.parse().ok()
with the idiomatic:
  if let Ok(x) = expr.parse()

This removes the intermediate Option conversion and makes
the intent clearer.
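
A small side-by-side illustration of the two forms named above. The function names are hypothetical and the bodies are not taken from the PR's diff; the `#[allow]` on the first function is only there so the sketch itself passes clippy.

```rust
/// The form flagged by clippy::match_result_ok: the Result is converted
/// to an Option only to be pattern-matched immediately.
#[allow(clippy::match_result_ok)]
fn parse_dim_flagged(raw: &str) -> Option<usize> {
    if let Some(n) = raw.parse::<usize>().ok() {
        return Some(n);
    }
    None
}

/// The idiomatic form: match on the Result directly.
fn parse_dim_idiomatic(raw: &str) -> Option<usize> {
    if let Ok(n) = raw.parse::<usize>() {
        return Some(n);
    }
    None
}
```

Both functions behave identically; the change is purely stylistic.
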
Runs cargo check + clippy on every push to feat/**, fix/**, chore/**
so issues are caught before opening a PR to upstream.