fix(evaluation): map local sequential ids to global conv ids in BM25/embedding index filenames (#127) by Fearvox · Pull Request #242 · EverMind-AI/EverOS

Fearvox · 2026-06-03T02:41:09Z

What

When the evaluation pipeline is run on a slice (--from-conv 5 --to-conv 10), retrieval silently returned empty results. Root cause is a three-way filename-keying mismatch in the EverCore eval adapter:

Stage 1 (memcell writer) keys files by the global conversation id extracted from conversation_id via _extract_conv_index ("locomo_5" -> "5"), writing memcell_list_conv_5..9.json.
Search stage (reader) also keys by the global id, reading bm25_index_conv_5..9.pkl / embedding_index_conv_5..9.pkl.
Stage 2 (index builder) iterated a local range(config.num_conv) counter (0..N-1), so for a 5-conversation slice it looked for memcell_list_conv_0..4.json (absent → "File not found, skipping"), built nothing, and wrote indexes under the wrong names.

Net effect: a sliced run builds zero usable indexes and every search returns empty. This is exactly the symptom reported in #127 ([BUG] evaluation: BM25/Embedding index filenames mismatch when running with --from-conv/--to-conv, causing empty retrieval).
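The mismatch can be reproduced with plain set arithmetic on filenames. This is a minimal sketch, not the adapter's code: `extract_conv_index` is a hypothetical stand-in for `_extract_conv_index`, and the rule it uses (take the text after the last underscore) is an assumption based only on the `"locomo_5" -> "5"` example above.

```python
def extract_conv_index(conversation_id: str) -> str:
    """Hypothetical stand-in for _extract_conv_index: "locomo_5" -> "5".

    Assumed rule: the global id is whatever follows the last underscore.
    """
    return conversation_id.rsplit("_", 1)[-1]


# Slice selected by --from-conv 5 --to-conv 10: conversations 5..9 keep their global ids.
slice_conversation_ids = [f"locomo_{i}" for i in range(5, 10)]

# Stage 1 (writer) and the search stage (reader) both key by the global id.
written_memcells = {
    f"memcell_list_conv_{extract_conv_index(c)}.json" for c in slice_conversation_ids
}
search_reads = {
    f"bm25_index_conv_{extract_conv_index(c)}.pkl" for c in slice_conversation_ids
}

# The old stage 2 keyed by a local 0..N-1 counter instead.
num_conv = len(slice_conversation_ids)
old_stage2_inputs = {f"memcell_list_conv_{i}.json" for i in range(num_conv)}

# None of the files the old stage 2 looks for exist, so no index is built.
found_by_old_stage2 = old_stage2_inputs & written_memcells
print(sorted(found_by_old_stage2))  # []
```

For a full run starting at conversation 0 the two sets coincide, so the intersection is complete and the bug stays hidden.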

This PR aligns stage 2 (and the adapter's skip-logic probe) to the same global-id key:

stage2_index_building.build_bm25_index / build_emb_index now iterate the global conv ids actually in play. The adapter passes the exact slice via a new conv_ids argument; a discover_conv_ids() helper derives ids from the memcell_list_conv_*.json filenames as a fallback for the standalone CLI entry point (main()).
evermemos_adapter._check_missing_indexes now probes by global conv ids instead of range(num_conv), so the smart-skip logic no longer always reports a slice's indexes as missing and rebuilds them under the wrong filenames.
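A sketch of how the fallback discovery and the explicit `conv_ids` argument could fit together is shown below. Only the names `discover_conv_ids` and `conv_ids` and the filename patterns come from this PR; the signatures, the regular expression, the sort key, and the `index_filenames` helper are illustrative assumptions, not the merged implementation.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Optional, Sequence

# Assumed pattern, derived from the memcell_list_conv_*.json naming above.
_MEMCELL_RE = re.compile(r"^memcell_list_conv_(.+)\.json$")


def discover_conv_ids(memcell_dir: Path) -> List[str]:
    """Derive global conv ids from memcell filenames found on disk."""
    ids = []
    for path in memcell_dir.iterdir():
        match = _MEMCELL_RE.match(path.name)
        if match:
            ids.append(match.group(1))
    # Numeric ids sort numerically ("2" before "10"); anything else sorts after them.
    return sorted(ids, key=lambda s: (0, int(s), "") if s.isdigit() else (1, 0, s))


def index_filenames(memcell_dir: Path, conv_ids: Optional[Sequence[str]] = None) -> List[str]:
    """Hypothetical helper: name one BM25 index per global conv id.

    Uses the caller's slice when given, otherwise falls back to disk discovery.
    """
    ids = list(conv_ids) if conv_ids is not None else discover_conv_ids(memcell_dir)
    return [f"bm25_index_conv_{cid}.pkl" for cid in ids]


with tempfile.TemporaryDirectory() as tmp:
    memcell_dir = Path(tmp)
    for cid in ("10", "2", "5"):
        (memcell_dir / f"memcell_list_conv_{cid}.json").write_text("[]")
    (memcell_dir / "unrelated.txt").write_text("")

    discovered = discover_conv_ids(memcell_dir)
    fallback_names = index_filenames(memcell_dir)
    explicit_names = index_filenames(memcell_dir, conv_ids=["5", "6"])

print(discovered)  # ['2', '5', '10']
```

Passing the slice explicitly keeps the adapter path independent of whatever else is in the directory, while discovery covers the standalone entry point that has no slice to pass.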

Why

range(num_conv) is a silent correctness bug for any non-full run: the slice's conversations keep their global ids, but the builder assumed a contiguous 0..N-1 local space. The full dataset (--from-conv 0) happened to work only because local and global ids coincide there, which is why it went unnoticed.

Tests

Adds methods/EverCore/tests/test_evaluation_index_filename_conv_ids.py, which reproduces the slice fully offline — no docker, no LLM, no network (BM25 path only; the bug is a filename-mapping bug):

test_bm25_index_filenames_match_global_conv_ids — writes memcell_list_conv_5/6/7.json (a slice) with local num_conv=3, builds indexes, and asserts the produced files are bm25_index_conv_5/6/7.pkl (what search reads) and that the buggy conv_0/1/2.pkl are absent.
test_built_filename_matches_search_lookup_key — asserts the filename the search stage computes via _extract_conv_index exists on disk after the build.
test_check_missing_indexes_uses_global_conv_ids — asserts the skip-logic probe keyed by global ids reports nothing missing, while the old local-range keying would wrongly report ["0","1","2"].
test_discover_conv_ids_* — the disk-discovery helper returns the global ids and sorts them numerically (2 before 10).
test_real_adapter_check_missing_indexes_keys_by_global_id — exercises the real EverCoreAdapter method when importable.
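The contrast asserted by test_check_missing_indexes_uses_global_conv_ids can be sketched in a few lines. `missing_indexes` below is a hypothetical simplification of the skip-logic probe, not the adapter's `_check_missing_indexes`; its signature and the `prefix` parameter are assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_indexes(
    index_dir: Path, conv_ids: Iterable[object], prefix: str = "bm25_index_conv_"
) -> List[str]:
    """Return the conv ids whose index file is absent from index_dir."""
    missing = []
    for conv_id in conv_ids:
        if not (index_dir / f"{prefix}{conv_id}.pkl").exists():
            missing.append(str(conv_id))
    return missing


with tempfile.TemporaryDirectory() as tmp:
    index_dir = Path(tmp)
    # Indexes built for a 3-conversation slice with global ids 5, 6, 7.
    for conv_id in ("5", "6", "7"):
        (index_dir / f"bm25_index_conv_{conv_id}.pkl").write_bytes(b"")

    by_global_id = missing_indexes(index_dir, ["5", "6", "7"])
    by_local_range = missing_indexes(index_dir, range(3))  # old keying

print(by_global_id)    # []
print(by_local_range)  # ['0', '1', '2']
```

With the old keying the probe reports every index as missing even though all three exist, which is what triggered the needless rebuild.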

Run locally:

cd methods/EverCore
uv sync
PYTHONPATH=src pytest tests/test_evaluation_index_filename_conv_ids.py -v

Result in this build: 6 passed, 1 skipped.

Skip note: the one skipped test imports the full EverCoreAdapter, whose chain currently hits an unrelated, pre-existing import break on main — stage1_memcells_extraction.py imports ScenarioType from memory_layer.profile_manager, which no longer exports it. That break exists on clean origin/main and is not introduced by this PR; the test skips cleanly and auto-enables once that unrelated import is fixed. The core stage-2 reproduction does not touch that chain and runs fully.

Before / after (mechanism)

files on disk (stage1, global ids): memcell_list_conv_5/6/7.json
OLD stage2 looked for (local range): memcell_list_conv_0/1/2.json
OLD stage2 found: []  -> builds 0 indexes -> search reads conv_5/6/7.pkl (never written) -> EMPTY
NEW stage2 builds:  bm25_index_conv_5/6/7.pkl  == what search reads -> non-empty

Scope

Surgical: two source files (evaluation/src/adapters/evermemos/stage2_index_building.py, evaluation/src/adapters/evermemos_adapter.py) plus one new regression test. No schema, no repository-layer, no docker-compose, no docs churn — deliberately narrower than the broad 21-file prior-art PR (#136).

Closes #127.

Credit

Prior community work diagnosed the same --from-conv/--to-conv filename mismatch. Co-author preserved on the commit:

Jah-yee (RoomWithOutRoof): #136, fix(evaluation): resolve BM25/Embedding index filename mismatch when using --from-conv/--to-conv (same root cause, broad 21-file scope), and #115, fix: rename stage3_memory_retrivel.py to stage3_memory_retrieval (tangential stage-3 rename).

Co-authored-by: Jah-yee <166608075+Jah-yee@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

🤖 Generated with Claude Code


* chore: rename project from evermemos to EverCore This commit renames the project directory and updates all internal references from "evermemos" to "EverCore". The changes include: - Renaming the main directory from `methods/evermemos` to `methods/EverCore` - Updating all import paths and module references - Maintaining the same code structure and functionality - Adding new configuration files (.vscode/settings.json, .pylintrc, pyrightconfig.json) - Updating Dockerfile and project metadata * docs: update references from evermemos to EverCore Update documentation files to reflect the renaming of the 'evermemos' directory to 'EverCore'. This includes fixing clone commands, directory paths, and documentation links across multiple files to ensure consistency and correct navigation for users. * chore: rename EverMemOS to EverCore across codebase This is a project-wide rebranding from EverMemOS to EverCore. The changes include: - Update project name in source files, documentation, and configuration - Rename API references, environment variables, and service names - Modify demo descriptions and benchmark configurations - Update URLs and citations to reflect new project identity All functionality remains identical; only naming has changed to align with the new project branding. * docs: update README with EverCore focus and restructured TOC - Add line break before Table of Contents for better visual separation - Rewrite project description to highlight EverCore as the central component - Reorder directory tree to prioritize benchmarks and methods over use-cases - Update use-cases list with more examples and clarify they are templates - Improve flow from Quick Start to use-cases to benchmarks * docs: update README with clearer methods description and benchmarks Add benchmark numbers directly in the method summaries for better visibility. Clarify introductory text to emphasize choice and composition of methods. 
* docs: fix markdown formatting in README table of contents Adjust whitespace and line breaks to ensure proper rendering of the collapsible table of contents section.

…d-AI#204) - Replace specific EverMemBench-Dynamic badge with general EverMind-AI HuggingFace badge - Remove redundant License badge - Change "Methods" section heading to "Architecture Methods" - Update sub-section headings from h4 (####) to h3 (###) for better hierarchy

…rMind-AI#208) * docs: restructure README and add AGENTS.md for better navigation - Reorder sections to emphasize architecture methods and use cases - Move use cases section before quick start for better flow - Rename "Methods" to "Architecture Methods" for clarity - Add AGENTS.md with quick commands and key entry points - Update section headers to improve document hierarchy - Maintain all existing content while improving organization * docs: add community and contribution files * docs: reorder README directory tree for logical grouping * docs: move community files to .github/ and update references * ci: change deploy workflow trigger from feature branch to main

* docs: restructure README and add subdirectory guides Move the directory tree from the main README to new dedicated README files for each top-level folder (use-cases, methods, benchmarks). Add detailed introductions and tables to guide users to the appropriate subprojects. This improves navigation and provides clear entry points for different use cases. * docs: expand showcase section with new projects and links Add six new project entries to the README showcase, each with a banner image, description, and code/plugin link. Also update an existing benchmark entry to include a dataset link. This enhances the repository's demonstration of real-world applications and available resources.

* docs(readme): update project links and formatting * docs(use-cases): enhance README with visual catalogue of demos Expand the use cases section from a simple table to a detailed visual catalogue with project banners, descriptions, and links. This improves user engagement and provides a better showcase of community integrations and demos. * docs: update READMEs and add validation for use-case links

…#215)

* docs: update plugin repository link in README * docs(readme): update banner gif link

) * docs(readme): update code example link to pinned commit pin the reference to the voice assistant example code to a specific commit hash and fix folder name capitalization * docs: update voice assistant demo link in README

* docs(readme): add four new use case entries * docs(readme): update outdated banner links to correct github repos


…nd-AI#127)

The eval index-building stage (stage 2) wrote BM25/Embedding index files keyed by a local range(num_conv) counter (conv_0..conv_{N-1}), while stage 1 (memcell writer) and the search stage (reader) both key off the GLOBAL conversation id extracted from conversation_id via _extract_conv_index (e.g. "locomo_5" -> "5"). On a sliced run (--from-conv 5 --to-conv 10) the sliced conversations keep their global ids, so stage 2 looked for memcell_list_conv_0..4.json (absent -> skipped), built nothing, and the search stage then failed to find bm25_index_conv_5..9.pkl, yielding empty retrieval. Reported in EverMind-AI#127.

Fix:
- stage2_index_building: build_bm25_index / build_emb_index iterate the global conv ids actually in play. The adapter passes the exact slice via a new conv_ids arg; a discover_conv_ids() helper derives ids from the memcell_list_conv_*.json filenames as a fallback for the CLI entry point.
- evermemos_adapter: _check_missing_indexes now probes by global conv ids (not range(num_conv)), so the skip-logic no longer always reports a slice's indexes as missing and rebuilds them under the wrong filenames.

Adds an offline regression test (no docker / LLM / network) that reproduces the slice and asserts the built index filenames are exactly the ones the search stage reads.

Prior art on the same mismatch by Jah-yee (EverMind-AI#136, broad 21-file fix; #115, tangential stage3 rename); this change keeps the fix surgical.

Co-authored-by: Jah-yee <166608075+Jah-yee@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

libin.zhang and others added 30 commits December 31, 2025 18:23

fix: change uv lock

ff78815

Merge branch 'feature/selfdeploy-embedding' into 'dev'

529fd80

Use self-deployed embedding and rerank APIs by default See merge request npc-work/aic/ai/evermemos-opensource!64

🐛 conversation data bug fix & request log refactor & remove tenant he…

aae4364

…ader info

🐛 fix rerank interface

80dbd5d

test: Add rerank integration tests

791ec26

🐛 fix es connection register

64f9081

🐛 fix es connection register

44fbcfb

fix: vLLM Rerank API adopts an instruction-tuned approach

81b8b61

fix: vLLM Rerank API adopts an instruction-tuned approach

c12612c

⚡ redis connection pool size

64e5d3e

Merge branch 'feature/selfdeploy-embedding' into 'dev'

1de53cf

vLLM Rerank API adopts an instruction-tuned approach See merge request npc-work/aic/ai/evermemos-opensource!65

⚰️ remove longjob manager

ec5e604

feat: metrics client and rerank/vectorize/retrieve metrics

b3958cb

✨ fetch add group_ig

d792644

✨ retrieve support query_all

eed2a17

refactor: Move the metrics module to the observation module

a17f3bc

fix: remove _metrics_server_thread

b1287e9

fix: fix _normalize_path bug

ade479d

Merge branch 'feature/metrics' into 'dev'

7b44ca6

feat: metrics client and rerank/vectorize/retrieve metrics See merge request npc-work/aic/ai/evermemos-opensource!66

✨ support pending messages

5362bc6

🐛 fix env in docs

70ce75e

✨ support delete memories

4d699e0

🐛 fix print

6d94e66

📝 add delete memory docs

9982264

fix:update episode prompt

18c1881

Merge branch 'fix/update_episode_prompt' into 'dev'

100441e

fix:update episode prompt See merge request npc-work/aic/ai/evermemos-opensource!68

feat: add rerank metrics

fc6eb70

Merge branch 'feature/rerank-metrics' into 'dev'

0c34a58

feat: add rerank metrics See merge request npc-work/aic/ai/evermemos-opensource!69

refactor: extract foresight & eventlog from conversation

3cc9fa5

Merge branch 'dev' of gitlab.com:npc-work/aic/ai/evermemos-opensource…

cd29378

… into dev

cyfyifanchen and others added 25 commits April 27, 2026 03:42

docs: comment out broken GIF links in README (EverMind-AI#209)

6dda863

docs: uncomment banner images in use cases section (EverMind-AI#210)

fa8d77e

docs(readme): add two new AI assistant use case sections (EverMind-AI…

29d555c

…#215)

docs: update plugin repository link in README (EverMind-AI#221)

2ac322f

docs: fixing the links (EverMind-AI#222)

9909f1c

* docs: update plugin repository link in README * docs(readme): update banner gif link

docs(readme): add four new use case entries (EverMind-AI#226)

c4df799

docs(readme): fix links in README (EverMind-AI#227)

86cd6e9

* docs(readme): add four new use case entries * docs(readme): update outdated banner links to correct github repos

Fix EverCore demo memory payload

b8a2093

Merge pull request EverMind-AI#228 from EverMind-AI/codex/fix-evercor…

dbfe822

…e-demo-content-payload Fix EverCore demo memory payload

Harden GitHub Actions workflows

d201b0c

Add EverCore smoke workflow

6669825

Pin setup-uv action version

a51e858

Merge pull request EverMind-AI#229 from EverMind-AI/codex/fix-github-…

a247c38

…actions-hygiene Harden GitHub Actions workflows

Add Dependabot configuration

c5740dd

Merge pull request EverMind-AI#230 from EverMind-AI/codex/add-dependa…

5911acc

…bot-config Add Dependabot configuration

docs: verify EverCore quickstart path

f29ff15

Merge pull request EverMind-AI#232 from EverMind-AI/codex/evercore-re…

0f14d05

…adme-quickstart docs: verify EverCore quickstart path

chore: remove unused benchmark suites and HyperMem method (EverMind-A…

afb8fab

…I#236) Delete deprecated EvoAgentBench, EverMemBench benchmark suites and HyperMem memory system implementation, including all associated configurations, scripts, and supporting assets.

Copilot AI review requested due to automatic review settings June 3, 2026 02:41

Copilot AI reviewed Jun 3, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jun 3, 2026

[watch] Overnight fork patrol: 2026-06-03 Fearvox/EverOS#108

Open

cyfyifanchen closed this Jun 6, 2026

cyfyifanchen force-pushed the main branch from 773e19b to 518b8ec Compare June 6, 2026 00:36

Fearvox wants to merge 668 commits into EverMind-AI:main from Fearvox:fix/issue-127-index-filename-conv-ids


17 participants
