feat: add opt-in model router catalog picker filter#1
Conversation
e12e8d6 to
d06f7ae
Compare
💡 Codex Reviewhttps://github.com/Digitizers/hermes-agent/blob/e12e8d6feaa219be1984ee2faa509e0e1338676a/hermes_cli/model_switch.py#L1041-L1042 When ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
Status update after follow-up push:
Remaining red check: full
I do not recommend bundling broad upstream test-suite fixes into this focused picker PR. |
Three issues flagged by the Copilot review on this PR: 1. Double JSON emit on stage failure (Copilot #1, NousResearch#2). When -Stage <name> ran a worker that threw, Invoke-Stage's finally emitted a JSON result frame AND the entry-point catch emitted a second error frame -- producing two concatenated JSON objects on stdout and breaking the one-line-per-invocation contract that drivers parse against. Same issue applied to -Json mode on a full install (every stage's finally plus a final error frame missing duration_ms/skipped). Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame when it emits a failure frame; the entry-point catch checks the flag and skips its own emit, still exit 1. 2. $prevEAP uninitialized on early try-block throw (Copilot NousResearch#3). In Install-Uv, Test-Python, Test-Node's winget fallback, _Run-NpmInstall, and the playwright block, '$prevEAP = $ErrorActionPreference' lived as the first statement INSIDE the try. If anything between 'try {' and that line threw (Write-Info on an unusual host, the npx-finding loop, etc.), the catch's 'if ($prevEAP) { ... }' restore was a no-op and EAP could remain relaxed. Fix: hoist '$prevEAP = $ErrorActionPreference' to the line immediately before 'try {' in all five sites. Catch's restore is now always meaningful regardless of where in the try the throw originated. No change to Invoke-Stage's success path or to the four lint-clean EAP sites (Test-Node was the only winget-related catch). All 19 metadata smoke tests still pass.
Four findings from Copilot's review on PR NousResearch#22891, all in the AX elements-array cap added by 22fa1ed: 1. The truncation note ("response truncated to N of M elements") was appended unconditionally — including in the som/vision multimodal path, whose response carries a screenshot rather than an `elements` array. The note described a payload field that wasn't present. Moved the note into the AX-text branch where the array actually appears. 2. `_format_elements(cap.elements)` ran on the full untrimmed list with its own `max_lines=40` cap, so a caller passing `max_elements=10` would see summary lines referencing `NousResearch#11..NousResearch#40` even though the JSON `elements` array only held #1..NousResearch#10. Format on `visible_elements` instead so the summary indices always exist in the response. 3. `_coerce_max_elements` enforced a lower bound but no upper bound, so `max_elements=10_000_000` silently disabled the safeguard and reintroduced the original context-blow-up. Added a hard cap (`_MAX_ALLOWED_MAX_ELEMENTS = 1000`) that clamps oversized values. 4. The schema string said "Default 100" but the property carried no `default` field, and claimed `max_elements` had no effect on som/ vision while the image-missing fallback path can still return an elements array. Added `"default": 100`, `"maximum": 1000`, and clarified the fallback-path wording. Each finding gets a regression test: - test_capture_ax_clamps_oversized_max_elements_to_hard_cap - test_capture_ax_summary_indices_match_returned_elements - test_capture_multimodal_summary_omits_truncation_note - test_schema_max_elements_documents_default_and_upper_bound Verified with `pytest tests/tools/test_computer_use.py` (53 passed, including the 5 new cases). Confirmed each new test fails on the pre-fix code path before applying the production change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds an opt-in
/modelpicker catalog filter for Hermes, driven by the shared model-router catalog.When
HERMES_MODEL_ROUTER_CATALOGpoints to a shared catalog JSON, Hermes filters picker rows to the curated model-router catalog. Without the env var, behavior remains unchanged.Safety
HERMES_MODEL_ROUTER_CATALOGVerification
venv/bin/python -m pytest \ tests/hermes_cli/test_model_router_picker_catalog.py \ tests/hermes_cli/test_user_providers_model_switch.py \ -q -o 'addopts='Result:
32 passedAdditional smoke with real shared catalog:
HERMES_MODEL_ROUTER_CATALOG=/home/gidon/shared-agents/model-router/catalog.json:catalog_rows=9catalog_rows=0