feat: add opt-in model router catalog picker filter by BenKalsky · Pull Request #1 · Digitizers/hermes-agent

BenKalsky · 2026-05-06T16:32:02Z

Summary

Adds an opt-in /model picker catalog filter for Hermes, driven by the shared model-router catalog.

When HERMES_MODEL_ROUTER_CATALOG points to a shared catalog JSON, Hermes filters picker rows to the curated model-router catalog. Without the env var, behavior remains unchanged.

Safety

Opt-in only via HERMES_MODEL_ROUTER_CATALOG
No default behavior change
No live routing/enforcement change
Keeps provider switch callback slugs compatible with Hermes runtime

Verification

venv/bin/python -m pytest \
  tests/hermes_cli/test_model_router_picker_catalog.py \
  tests/hermes_cli/test_user_providers_model_switch.py \
  -q -o 'addopts='

Result: 32 passed

Additional smoke with real shared catalog:

With HERMES_MODEL_ROUTER_CATALOG=/home/gidon/shared-agents/model-router/catalog.json: catalog_rows=9
Without env var: catalog_rows=0

chatgpt-codex-connector · 2026-05-06T16:38:52Z

💡 Codex Review

https://github.com/Digitizers/hermes-agent/blob/e12e8d6feaa219be1984ee2faa509e0e1338676a/hermes_cli/model_switch.py#L1041-L1042
Map common custom Ollama rows to the catalog

When HERMES_MODEL_ROUTER_CATALOG is set and the user has the common local Ollama custom providers named like Ollama — GLM 5.1, section 4 groups them as slug custom:ollama with display name Ollama (see existing grouping tests). This mapping only recognizes custom:ollama-local / Ollama Local, so those rows do not match the catalog's ollama entry and _align_picker_results_with_model_router_catalog() treats them as unmatched user-defined providers, leaving the uncurated local model list visible in the picker despite the opt-in catalog filter.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

BenKalsky · 2026-05-06T17:08:52Z

Status update after follow-up push:

Added AUTHOR_MAP entry for ben@digitizer.co.il -> BenKalsky in scripts/release.py.
Local attribution check now passes against digitizers/main...HEAD.
Local focused picker tests pass:
venv/bin/python -m pytest tests/hermes_cli/test_model_router_picker_catalog.py tests/hermes_cli/test_user_providers_model_switch.py -q -o addopts=
Result: 32 passed.
git diff --check passes.
CI rerun: attribution, supply-chain, e2e, and both nix jobs pass.

Remaining red check: full test job still fails with 33 unrelated upstream-suite failures across Bedrock/context, cron prompt, Discord/Telegram gateway tests, update tests, browser chromium checks, delegate tests, dockerfile tests, env sandbox tests, etc. The PR diff is limited to:

hermes_cli/model_switch.py
tests/hermes_cli/test_model_router_picker_catalog.py
scripts/release.py

I do not recommend bundling broad upstream test-suite fixes into this focused picker PR.

Three issues flagged by the Copilot review on this PR: 1. Double JSON emit on stage failure (Copilot #1, NousResearch#2). When -Stage <name> ran a worker that threw, Invoke-Stage's finally emitted a JSON result frame AND the entry-point catch emitted a second error frame -- producing two concatenated JSON objects on stdout and breaking the one-line-per-invocation contract that drivers parse against. Same issue applied to -Json mode on a full install (every stage's finally plus a final error frame missing duration_ms/skipped). Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame when it emits a failure frame; the entry-point catch checks the flag and skips its own emit, still exit 1. 2. $prevEAP uninitialized on early try-block throw (Copilot NousResearch#3). In Install-Uv, Test-Python, Test-Node's winget fallback, _Run-NpmInstall, and the playwright block, '$prevEAP = $ErrorActionPreference' lived as the first statement INSIDE the try. If anything between 'try {' and that line threw (Write-Info on an unusual host, the npx-finding loop, etc.), the catch's 'if ($prevEAP) { ... }' restore was a no-op and EAP could remain relaxed. Fix: hoist '$prevEAP = $ErrorActionPreference' to the line immediately before 'try {' in all five sites. Catch's restore is now always meaningful regardless of where in the try the throw originated. No change to Invoke-Stage's success path or to the four lint-clean EAP sites (Test-Node was the only winget-related catch). All 19 metadata smoke tests still pass.

Four findings from Copilot's review on PR NousResearch#22891, all in the AX elements-array cap added by 22fa1ed: 1. The truncation note ("response truncated to N of M elements") was appended unconditionally — including in the som/vision multimodal path, whose response carries a screenshot rather than an `elements` array. The note described a payload field that wasn't present. Moved the note into the AX-text branch where the array actually appears. 2. `_format_elements(cap.elements)` ran on the full untrimmed list with its own `max_lines=40` cap, so a caller passing `max_elements=10` would see summary lines referencing `NousResearch#11..NousResearch#40` even though the JSON `elements` array only held #1..NousResearch#10. Format on `visible_elements` instead so the summary indices always exist in the response. 3. `_coerce_max_elements` enforced a lower bound but no upper bound, so `max_elements=10_000_000` silently disabled the safeguard and reintroduced the original context-blow-up. Added a hard cap (`_MAX_ALLOWED_MAX_ELEMENTS = 1000`) that clamps oversized values. 4. The schema string said "Default 100" but the property carried no `default` field, and claimed `max_elements` had no effect on som/ vision while the image-missing fallback path can still return an elements array. Added `"default": 100`, `"maximum": 1000`, and clarified the fallback-path wording. Each finding gets a regression test: - test_capture_ax_clamps_oversized_max_elements_to_hard_cap - test_capture_ax_summary_indices_match_returned_elements - test_capture_multimodal_summary_omits_truncation_note - test_schema_max_elements_documents_default_and_upper_bound Verified with `pytest tests/tools/test_computer_use.py` (53 passed, including the 5 new cases). Confirmed each new test fails on the pre-fix code path before applying the production change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat: add opt-in model router catalog picker filter

d06f7ae

BenKalsky force-pushed the ben/model-router-picker-catalog-opt-in-20260506 branch from e12e8d6 to d06f7ae Compare May 6, 2026 16:34

chore: map Digitizer author email

0d8ccd7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add opt-in model router catalog picker filter#1

feat: add opt-in model router catalog picker filter#1
BenKalsky wants to merge 2 commits into
mainfrom
ben/model-router-picker-catalog-opt-in-20260506

BenKalsky commented May 6, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 6, 2026

Uh oh!

BenKalsky commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

BenKalsky commented May 6, 2026

Summary

Safety

Verification

Uh oh!

chatgpt-codex-connector Bot commented May 6, 2026

💡 Codex Review

Uh oh!

BenKalsky commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant