Sync upstream main into apecloud-base#19
Merged
Conversation
…ized response on_post_llm_call extracted usage via `if response is not None:`, taking the response-object path. But post_api_request delivers `response` as a sanitized dict (no `.usage` attribute) alongside a separate `usage` summary dict, so `getattr(response, "usage")` was always None and token/cost data was dropped for every gateway turn (traces showed usage 0 / cost 0). Gate on a real `.usage` attribute so the existing usage-dict fallback is reached. Real response objects (post_llm_call / legacy) still take the response-object path. Adds regression tests for both paths.
NousResearch#41066) _discover_all_plugins() previously did a flat iterdir() scan, missing all category-namespaced plugins (web/*, image_gen/*, browser/*, video_gen/*). Now recurses up to 2 levels deep, matching PluginManager._scan_directory_level(). Also fixes _plugin_status() to check both manifest name AND path-derived key against enabled/disabled sets, so category plugins like 'web/tavily' show correct status when enabled via config.
…not via the throttle (NousResearch#41098) In classic CLI mode the dangerous-command approval prompt (and the clarify, sudo, and secret-capture prompts) could fail to render: the user saw '⏱ Timeout — denying command' after 60s without ever seeing the panel, making approvals.mode: manual unusable. Root cause. These prompts run their wait loop on the agent/background thread: they set modal state that a ConditionalContainer's filter reads, then call self._invalidate() to repaint so the panel appears. _invalidate() is a THROTTLED wrapper built for high-frequency background repaints (spinner frames, streaming) — it (a) returns early while a SIGWINCH resize-recovery is pending, and (b) otherwise only repaints if 250ms elapsed since the last paint. Under either condition the modal's entry paint is silently dropped, the ConditionalContainer never re-evaluates, and the prompt times out unseen. The throttle never belonged on these paths. Originally the callbacks painted with a direct self._app.invalidate() and worked; a throttle PR blanket-replaced every invalidate (including these rare, one-shot, user-blocking modal paints) with the throttled _invalidate(); a later commit removed an idle 1Hz repaint that had been masking dropped modal paints, surfacing the bug. Notably the modal KEY-BINDING handlers (↑/↓/Enter) already paint with a direct event.app.invalidate(), never the throttle — the background-thread callbacks were the inconsistent ones. Fix. Add a small _paint_now() helper that paints directly (guarded for a missing _app, exception-safe) and route the four modal paths' entry, response, countdown, and teardown paints through it — matching the key-handler idiom. This covers approval, clarify, sudo, and the secret-capture teardown (_submit_secret_response, which previously used the throttled _invalidate() so its panel could linger after submit). _invalidate() is left untouched and its docstring now states it is for high-frequency background repaints only; modal/interactive paints must use _paint_now()/_app.invalidate() directly. This also fixes the resize-recovery edge case for free (a direct paint never consults the resize guard) without a throttle-bypass flag that could be cargo-culted onto hot paths. Countdown refresh cadence tightened 5s->1s so the timer stays visible while waiting, and a copy-pasted duplicate countdown block in _clarify_callback is removed. Tests: TestModalPaintNow drives all three wait-loop callbacks on a background thread with BOTH gates active (_resize_recovery_pending=True + a recent _last_invalidate in the throttle window) and asserts the panel paints on entry AND repaints on teardown; plus a secret-teardown test, a direct _paint_now-vs-_invalidate gate test, and a no-_app safety test. Each modal test fails if its paint is reverted to _invalidate(). 17 in-file tests pass; full tests/cli suite green (900). Diagnosis credit: the throttle-drop root cause was identified by @sanidhyasin in NousResearch#41116; @islam666 independently reached the same direct-invalidate approach in NousResearch#41166; original report NousResearch#41098 by @jodonnel.
Clear NeMo Relay plugin-config observability only after the last active Hermes session finalizes. Use the plugin's async-safe awaitable helper for both initialize and clear so session rotation remains safe under active event loops. Disable the direct ATIF fallback when plugins.toml already owns the ATIF exporter lifecycle to avoid duplicate trajectory export on finalization.
… succeeds Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
…ly_in_thread=false (NousResearch#15421) Top-level Slack channel messages previously fell back to the message's own ``ts`` as a synthetic ``thread_ts``: thread_ts = event.get("thread_ts") or ts # ts fallback for channels That value flows into ``build_source(thread_id=thread_ts)`` at line 1247. The gateway session store keys sessions by ``(platform, channel_id, thread_id)``, so every top-level channel message ended up on a unique session. Operators who set ``reply_in_thread: false`` in ``config.yaml`` expected all top-level channel messages to share one session (the whole point of that flag) — instead each one spawned a fresh conversation with no context carry-over. ### Fix Three explicit cases in the channel branch: | event.thread_ts | reply_in_thread | thread_ts for session keying | |---|---|---| | non-null (real thread reply) | either | event.thread_ts | | null (top-level) | true (default) | ts (legacy: own-thread sessions) | | null (top-level) | false | **None** (shared channel session) | The outbound-reply gate at line 1264 (``reply_to_message_id = thread_ts if thread_ts != ts else None``) still works correctly in all three cases without further changes: ``None != ts`` is True, so shared-channel top-level messages don't get their reply threaded either — matching the operator's ``reply_in_thread=false`` intent end-to-end. Genuine thread replies still scope per-thread under both modes so multi-person threaded conversations can't collide with unrelated channel chatter. ### Tests (7 new in ``tests/gateway/test_slack_channel_session_scope.py``) All drive the real ``SlackAdapter._handle_slack_message`` code path (not a re-implementation) via the standard pytest fixture pattern used by ``tests/gateway/test_slack.py``. Messages @mention the bot so the mention gate doesn't drop them — the tests are specifically about what happens once the handler decides to emit a ``MessageEvent``. * ``TestChannelSessionScopeDefault`` (2 cases): - Explicit ``reply_in_thread: true`` keeps ``thread_id = ts`` (legacy behaviour — regression guard) - Unset config behaves like ``reply_in_thread: true`` (pins the default) * ``TestChannelSessionScopeShared`` (3 cases): - ``reply_in_thread: false`` + top-level → ``thread_id is None`` (the NousResearch#15421 bug 1 fix) - ``reply_to_message_id is None`` in the same case (no threaded outbound reply) - Genuine thread reply still scopes per-thread when shared mode is on — only TOP-LEVEL messages collapse to the channel session * ``TestThreadReplyAlwaysScopesByThread`` (2 parametrised cases): - Thread replies get ``thread_id = event.thread_ts`` regardless of ``reply_in_thread`` — critical invariant for multi-thread channels; a regression here would leak per-thread context across threads **Regression guard verified**: reverted the else-branch to the legacy ``thread_ts = event.get("thread_ts") or ts`` one-liner; ``test_top_level_maps_to_none_when_reply_in_thread_false`` correctly failed (asserts ``thread_id is None`` but got ``"1700000000.000003"``). Restored → 182 slack tests pass (175 existing + 7 new). Scope: this fixes NousResearch#15421 bug 1 only. Bug 2 (sessions.json not persisting across compression) lives elsewhere in the session manager and is left for a separate diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ilot NousResearch#15464) Two findings from Copilot's review on NousResearch#15464, both addressed: 1. ``event.get("thread_ts")`` truthy vs ``event_thread_ts != ts``: the new channel branch treated ANY truthy ``thread_ts`` as a real thread reply, but three lines below ``is_thread_reply`` is defined with the stricter ``event_thread_ts and event_thread_ts != ts`` invariant. If Slack ever ships a payload where ``thread_ts == ts`` on a thread root, the stricter check would treat it as a top-level message for the ``is_thread_reply`` path but as a thread reply for session keying — divergent behaviour. Aligned this branch to the same ``and event_thread_ts_raw != ts`` invariant. 2. ``test_top_level_reply_to_id_stays_none_when_shared`` docstring had the ternary logic backwards ("None != ts → reply_to_message_id IS set"). The code reads ``reply_to_message_id = thread_ts if thread_ts != ts else None`` — with ``thread_ts = None``, the condition is True so the expression evaluates to ``thread_ts`` itself (None), meaning the reply stays un-threaded. The test asserted the correct end-state; only the explanatory docstring was wrong. Rewrote the docstring to match the actual code flow, with the note that Copilot caught the reversal. 7/7 tests still pass. No behaviour change for the existing test_thread_reply_scopes_by_thread_even_when_shared case because ``event_thread_ts_raw = "1700000000.000000"`` and ``ts = "1700000000.000005"`` are distinct — the new ``!= ts`` guard is a no-op there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The custom/Ollama provider profile had no default_max_tokens, so no max_tokens was sent on requests and Ollama fell back to its internal num_predict=128 — truncating responses after a few tokens with finish_reason='length' (NousResearch#39281, e.g. gemma4). max_tokens resolution is ephemeral > user model.max_tokens > profile default, so this is only a floor used when the user hasn't set their own cap. Set it to 65536 (matching the qwen-oauth tier) rather than a conservative value, since users can always override per-model. Fixes NousResearch#39281
…udget When summary_target_ratio is large (e.g. 0.45) and the context_length is moderate (e.g. 96000), the soft_ceiling (token_budget * 1.5) can exceed the total transcript size. _find_tail_cut_by_tokens walks the entire transcript without breaking early, and the resulting compress window is either empty (compress_start >= compress_end) or a single message whose summary-of-one overhead saves ~0 tokens. Both outcomes cause a no-op compression that does not increment _ineffective_compression_count, so should_compress() returns True on every subsequent turn and the loop repeats endlessly. Fix (two layers): 1. _find_tail_cut_by_tokens: when the backward walk consumed the entire transcript without breaking (cut_idx <= head_end and accumulated <= soft_ceiling), re-walk with the raw (non-inflated) token budget to find a meaningful cut that gives the summarizer a useful middle window. 2. compress(): when compress_start >= compress_end, increment _ineffective_compression_count and log a warning so the existing anti-thrashing guard in should_compress() can break the loop. Fixes NousResearch#40803
…ousResearch#41119) On older systemd versions that don't support RestartMaxDelaySec / RestartSteps, the installed unit file has those directives silently dropped. systemd_unit_is_current() did a strict text comparison, so the unit was perpetually flagged as outdated. Fix: _strip_optional_systemd_directives() removes RestartMaxDelaySec and RestartSteps from both the installed and expected text before comparison. Units that differ only by these optional directives are now correctly considered current.
…search#41036) _supports_vision_override() in image_routing.py checked model.supports_vision and providers.<name>.models, but not the legacy list-style custom_providers config. A custom provider entry like: custom_providers: - name: my-provider models: my-model: supports_vision: true was ignored, causing image_input_mode=auto to route through the auxiliary vision_analyze path instead of natively attaching images. Fix: added a lookup step for custom_providers list entries, matching by provider name (including 'custom:<name>' variants at runtime). providers.<name>.models still takes precedence over custom_providers. 13 new tests covering: true/false override, custom: prefix matching, no-match fallback, non-dict entries, empty lists, models key missing.
…ol content (NousResearch#41072) Xiaomi MiMo (and potentially other providers) support multimodal user messages but reject list-type tool message content with 400 'text is not set'. Previously this was handled reactively — the API call would fail, images would be stripped, and the request retried, losing visual info. Fix: add supports_vision_tool_messages field to ProviderProfile (default True). Xiaomi sets it to False. _tool_result_content_for_active_model now checks this field proactively and returns a text summary instead of list content, avoiding the round-trip failure entirely.
…cates When ~/.hermes/profiles/default/ exists as a directory, list_profiles() returns 'default' twice: once as the built-in default profile (~/.hermes) and once from the directory scan (~/.hermes/profiles/default). This causes the cron dashboard API (profile=all) to read the same jobs.json twice, showing every default-profile job duplicated in the UI. Fix: skip name=='default' in the named profiles loop, since it's already added as the built-in default at the top of the function. Fixes NousResearch#39346
…back on custom endpoints Problem: get_model_context_length() had an early return at the end of the custom-endpoint probe branch (step 3) that returned DEFAULT_FALLBACK_CONTEXT (256K) without ever consulting the hardcoded DEFAULT_CONTEXT_LENGTHS catalog (step 8). Models served through a custom/proxied gateway (e.g. corporate Anthropic proxy) that didn't expose Ollama or local-server endpoints would hit this path and get capped at 256K, even when the model name clearly matched a known entry in the catalog (e.g. claude-opus-4-8 → 1M). Changes: - agent/model_metadata.py: Before returning DEFAULT_FALLBACK_CONTEXT at the end of the custom-endpoint branch, consult DEFAULT_CONTEXT_LENGTHS using the same longest-key-first fuzzy matching as step 8. Only fall through to 256K if no catalog entry matches. - tests/agent/test_model_metadata.py: Updated existing test and added new test covering the custom-endpoint → catalog fallback behavior. Fixes NousResearch#38865
…or (NousResearch#38085) The WeChat iLink typing ticket has a 600-second TTL. When a long-running session exceeds that window, the cached ticket evicts from TypingTicketCache. Both send_typing and stop_typing silently returned early when the ticket was None, meaning the TYPING_STOP=2 signal was never sent to iLink. The WeChat client then showed the typing indicator indefinitely. Fix: add _ensure_typing_ticket() that transparently refreshes the ticket via getConfig when the cached one has expired or is missing. Both send_typing and stop_typing now call this method instead of silently no-oping. Fixes NousResearch#38085
The copytree ignore lambda in _copy_dist_payload applied USER_OWNED_EXCLUDE recursively at every directory depth. This caused nested directories whose names matched exclude entries (bin, logs, cache, etc.) to be silently dropped during distribution install/update. Fix: only apply USER_OWNED_EXCLUDE filtering at the root of the staged tree, matching the two-tier pattern used by _clone_all_copytree_ignore and _default_export_ignore in profiles.py. Add 5 tests covering nested bin/logs/cache preservation and top-level filtering still working. Fixes NousResearch#37954
…t cleanup - web_server.py: after proc.poll() returns a non-None exit code, call proc.wait() to reap the child and move the entry from _ACTION_PROCS to _ACTION_RESULTS. Previously .poll() alone left <defunct> zombies. - meet_bot.py: terminate and wait on the pcm_pump subprocess (paplay/ ffmpeg) during the finally-block teardown. Previously leaked on every normal bot exit. - tests: add test_action_status_reaps_completed_process and test_action_status_ignores_wait_failure covering both the happy path and the wait()-raises-OSError edge case. Closes NousResearch#38032
…and cleanup (NousResearch#41691) Inspired by Claude Code's /simplify. A bundled skill that captures recent changes via git diff, fans out three focused reviewers (reuse, quality, efficiency) via delegate_task batch mode, then aggregates findings and applies the fixes worth applying. Zero core changes — orchestrates existing tools (terminal/git, search_files, delegate_task). Supports focus, dry-run, and scoped-diff modifiers. Closes NousResearch#379.
…rvable The dashboard backend serves HTTP 404 on all static routes (/, /assets, /health) in packaged builds because resolveWebDist() points at app.asar.unpacked/dist/, but dist/** was not listed in asarUnpack. Add dist/** to the asarUnpack glob list so electron-builder extracts the built frontend assets alongside the asar archive, making them accessible to the Express static file server at runtime. Fixes NousResearch#41327
…undle is missing (NousResearch#41729) A packaged desktop app launches to a blank page with a bare ERR_FILE_NOT_FOUND when dist/index.html isn't in the bundle (NousResearch#39484). This happens when the build step fails (e.g. a stale checkout that fails typecheck) but electron-builder packages anyway, shipping an empty dist/. - build-time: scripts/assert-dist-built.cjs runs at the tail of the `build` script and aborts before electron-builder if dist/index.html or the vite JS bundle is missing/empty. Every packaging path (pack, dist*) inherits it via `npm run build &&`. - runtime: resolveRendererIndex() now logs a clear 'packaged without a renderer bundle — rebuild with hermes desktop --force-build' message when no index.html exists, instead of silently loading a missing path. - runtime: resolveWebDist() logs when it falls back to an asar-internal dist that isn't a real directory (the dashboard 404 class, NousResearch#41327/NousResearch#39472), rather than returning an unservable path silently. Adds scripts/assert-dist-built.test.cjs (node:test) covering the guard.
…ntity (NousResearch#41730) Subagents delegated to a custom endpoint were misrouted when the parent ran on a different custom endpoint. Both runtimes collapse to provider="custom", so _resolve_child_credential_pool() treated them as interchangeable and handed the child the parent's pool. Leasing from it then overwrote the child's delegated base_url with the parent's endpoint via _swap_credential() — the child sent the delegated model name to the wrong endpoint. Custom runtimes now resolve by endpoint identity (the custom:<name> pool key derived from base_url). The parent pool is reused only when both parent and child resolve to the same custom endpoint; unregistered raw endpoints return None so the child keeps its fixed delegated credential. Non-custom provider paths are unchanged. Fixes NousResearch#7833.
…event cross-turn stale-parent fork (NousResearch#41708) The per-session compression lock prevents same-window concurrent forks but not cross-turn ones: the background-review fork shares the parent's session_id, so if it won a compression race its new child session was never adopted by the gateway (the fork is single-lifecycle). The next foreground turn then started from the stale parent and compressed it again, leaving the same parent with two sibling children. Set review_agent.compression_enabled = False so the fork never triggers compression. Both trigger sites in conversation_loop.py gate on compression_enabled before calling _compress_context, so the fork can never rotate the shared parent. Review needs full context anyway — compressing would degrade the memory/skill summary. The per-session lock is kept as defense-in-depth for any future shared-session path. Adds a regression test that fails without the flag and passes with it. Closes NousResearch#38727
…ousResearch#41764) build_session_key collapsed every DM that arrived without a chat_id into one shared 'agent:main:<platform>:dm' key. A single cached AIAgent then served multiple users' conversations, bleeding history across senders. DMs now fall back to the sender's user_id_alt/user_id (mirroring the group-path participant precedence and the telegram auth-path fallback) before the bare per-platform sink. Telegram's normal event path always sets chat_id, so this hardens the synthetic-source / non-standard-adapter paths that don't.
…ousResearch#41728) The module docstring and get_timezone()/cache comments documented a reset_cache() helper for forcing tz re-resolution after config changes, but the function was never defined — doc-followers calling it hit AttributeError. Adds the helper to clear the cached tz state. Surfaced in NousResearch#32043.
… contamination When a cron or background session compacts, it sets _previous_summary for iterative updates. If that session ends without /new or /reset (which calls on_session_reset()), the stale summary survives on the ContextCompressor instance. A subsequent live messaging session's compaction then injects it as 'PREVIOUS SUMMARY:' into the summarizer prompt — contaminating the live session with unrelated content from the prior session. Add an else guard in compress(): when no handoff summary is found in the current messages but _previous_summary is non-empty, discard it so _generate_summary() starts fresh instead of iteratively updating a stale cross-session summary. Fixes NousResearch#38788
…depth) ContextCompressor inherited a no-op on_session_end() from ContextEngine, so per-session iterative-summary state (_previous_summary) survived a real session boundary on a reused compressor instance. Override it to clear the summary the moment the owning session ends, complementing the point-of-use guard in compress(). Closes the cross-session contamination path in NousResearch#38788. Co-authored-by: dusterbloom <32869278+dusterbloom@users.noreply.github.com>
…Index The desktop code uses Array.prototype.findLast (chat/composer/index.tsx) and findLastIndex (session/hooks/use-session-actions.ts), which are ES2023 APIs, but tsconfig declared only the ES2022 lib. Some TypeScript builds tolerate this, but a correct/stricter tsc fails the desktop build with: TS2550: Property 'findLast' does not exist on type 'ChatMessage[]'. Do you need to change your target library? Try changing 'lib' to 'es2023'. Declare es2023 so the build is correct regardless of the resolved TypeScript version (reported on Windows with Node 24). Refs NousResearch#38970
…ousResearch#42614) hermes update pulls the latest repo, so the freshly-pulled website/static/api/model-catalog.json is already the newest catalog. Copy it straight over ~/.hermes/cache/model_catalog.json instead of relying on a network fetch (which can be Vercel bot-gated or hit a Portal hiccup and silently degrade the picker to a stale/short list). Adds seed_cache_from_checkout() in model_catalog.py (read shipped manifest, validate, atomic write via _write_disk_cache, reset in-process cache) and calls it from both update paths in main.py: _cmd_update_impl (git pull) and _update_via_zip (Docker/no-git). Non-fatal on missing/malformed/invalid files — the normal network refresh still applies on next picker open.
…ocess calls When Hermes runs in TUI mode, the gateway child process communicates with the Node.js parent over a JSON-RPC protocol on stdin. Subprocess calls that inherit this stdin fd can trigger a race condition where the child's stdin read returns EOF, causing the gateway to exit cleanly (exit code 0) mid-tool- execution. This is the same root cause as issue NousResearch#14036 (byterover plugin) and PR NousResearch#39257 (SSH environment backend). This commit applies the fix — stdin=subprocess.DEVNULL — to all 85 subprocess.run() and subprocess.Popen() calls that execute inside the TUI gateway child process. Scope: TUI-context code only (agent/, tools/, plugins/, tui_gateway/server.py). CLI code (cli.py, hermes_cli/), tests, scripts, and gateway process management are excluded — they don't run inside the TUI child and inherit the terminal's stdin, not the JSON-RPC pipe. 85 call sites across 28 files. All files pass syntax check.
scripts/check_subprocess_stdin.py scans agent/, tools/, plugins/, and tui_gateway/ for subprocess.run() and subprocess.Popen() calls that don't explicitly set stdin=. Missing stdin= means the child inherits the parent's fd, which in TUI mode is the JSON-RPC pipe — causing gateway crashes on stdin EOF. Exits 0 (pass) or 1 (violations found). Can be run manually or added to CI. Skips comments, docstring references, and calls that use input= (which creates its own pipe). Usage: python scripts/check_subprocess_stdin.py
Wraps scripts/check_subprocess_stdin.py as a pytest so CI catches regressions when new subprocess calls are added without stdin=.
The blanket DEVNULL pass muzzled run_oauth_setup_token()'s interactive 'claude setup-token' login, which needs inherited stdin to prompt the user. Revert that one call and replace the guard's brittle file:line whitelist with an inline 'noqa: subprocess-stdin' marker that travels with the code.
Regression tests for the salvage follow-up: the interactive 'claude setup-token' login must keep inherited stdin, and the guard's inline 'noqa: subprocess-stdin' marker must exempt a call.
The blanket stdin=subprocess.DEVNULL pass added the kwarg to the docker 'version' preflight call; the test pinned the exact kwargs dict. Update the expected dict to match.
…sResearch#42616) image_generate returns its artifact as JSON ({"image": "/abs/path.png"}) with no MEDIA: tag, so the gateway auto-append path (which only recognized text_to_speech MEDIA: tags) never delivered it — image delivery silently depended on the model restating the path in its reply. Add image_generate to the producer allowlist and extract the local path from its JSON result (host_image > image > agent_visible_image), reusing the existing extension-anchored matcher and history-dedupe so remote URLs, unknown extensions, failures, and already-sent paths are rejected. Closes the remaining unfixed path from NousResearch#19105.
Store operator and assigned iMessage numbers in `auth.json` after setup, and surface them in `hermes photon status`. When numbers are missing, status auto-refreshes from the dashboard without provisioning new lines.
Switch `list_users`, `find_user_by_phone`, `create_user`, `register_user_if_absent`, and `refresh_user_numbers` from the Dashboard API (Bearer token) to the Spectrum API (Basic auth with project credentials). Update response unwrapping to handle the nested `data.users` envelope returned by Spectrum, add `_spectrum_host()` resolver, `_basic()` header helper, and structured error helpers. Update tests, docs, and plugin.yaml accordingly.
Extend the sidecar and Python adapter to handle `voice` content alongside `attachment`. Voice notes are inlined as base64 (same size-cap logic), surfaced as `MessageType.VOICE`, and include an optional `duration` field in fallback markers when bytes are unavailable.
Follow-up to the salvaged fallback-chain fix: - Replace the hand-rolled fallback loader with the shared hermes_cli.fallback_config.get_fallback_chain() helper so the TUI path matches HermesCLI and gateway/run.py exactly: fallback_providers stays first and keeps order, with distinct legacy fallback_model entries merged in after (deduped). Previously the TUI loader picked one key OR the other, diverging from CLI/gateway when both were set. - Update the test to assert the merged canonical semantics. - Add psionic73 to scripts/release.py AUTHOR_MAP (CI gate).
…rt race (NousResearch#42626) The autouse _suppress_concurrent_hermes_gate fixture did monkeypatch.setattr(main, '_detect_concurrent_hermes_instances', ...) with no raising=False. Its try/except guards the import but not the setattr, so under pytest's per-test spawn isolation a transiently partial hermes_cli.main module (one a concurrent worker is mid-importing) made setattr raise AttributeError and errored unrelated tests in the slice. Add raising=False so a transiently-absent attribute is a no-op default rather than a hard error. The attribute always exists once main.py finishes importing; the real-function opt-out (@pytest.mark.real_concurrent_gate) is unaffected.
… list (NousResearch#42629) Two new free-tier slugs surfaced in /model and `hermes model`. owl-alpha was already present. Regenerated website/static/api/model-catalog.json to keep the manifest sync test green.
…ortal failure (NousResearch#42628) The Portal's /api/nous/recommended-models endpoint is the source of truth for which models are free/paid right now, but its result was cached in-process only. When the live fetch failed (network, parse, non-2xx), the function returned {} and the model picker silently dropped the free/paid recommendations — free models would vanish with no indication anything went wrong. Add a per-base disk cache at $HERMES_HOME/cache/nous_recommended_cache.json: a successful live fetch is persisted as last-known-good, and a failed fetch with an empty in-process cache falls back to the disk copy instead of {}. Self-heals on the next successful fetch. With no disk copy, still degrades to {} (callers already handle that). Keyed by portal base URL so staging/prod don't collide. E2E: live fetch writes disk; simulated Portal failure returns the cached free models from disk; no-disk + failure returns {}.
@file: attachments now work when the desktop is connected to a remote gateway. Previously a referenced file resolved to a client-disk path the gateway couldn't see, so context_references rejected it with "path is outside the allowed workspace" and the agent never saw the file. Adds a file.attach RPC (sibling to the existing image.attach_bytes / pdf.attach byte-upload pipeline): the desktop uploads the file bytes, the gateway stages them into <workspace>/.hermes/desktop-attachments/ and returns a workspace-relative @file: ref that resolves cleanly. Local mode passes the path directly; a gateway-visible file outside the workspace is copied in; an in-workspace file is referenced as-is with no copy. Consolidates the file-sync design from NousResearch#38615 (LeonSGP43) and the host-file-staging idea from NousResearch#33455 (Carry00), rebased onto the image/PDF remote-media helpers already on main. Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
4c5e8c6 to
e4575fb
Compare
# Conflicts: # scripts/release.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
NousResearch/hermes-agent:mainheadf8adefdeinto currentapecloud-baseheade4575fb.apecloud-base.scripts/release.pyby keeping ApeCloud'salal@infracreate.com -> 1aalmapping while adding the new upstream author mappings.Notes
This PR is now an ancestry-preserving upstream sync PR, not the earlier narrow
tools/file_tools.pyfollow-up. It should be merged with a merge commit so GitHub's upstream behind count is cleared.The upstream delta carried a few trailing-whitespace/EOF issues in changed files; those were cleaned in the merge commit so
git diff --check origin/apecloud-base...HEADpasses.Validation
python3 -m py_compile scripts/release.pypython3tomllib.load(open("pyproject.toml", "rb"))uv run ruff check scripts/release.pygit diff --checkgit diff --check origin/apecloud-base...HEAD