fix(hermes): unblock Hermes Agent inference end-to-end (NS-324) #1
Draft
shannonsands wants to merge 3 commits into
… routing

All Hermes inference traffic goes through https://inference.local, the OpenShell gateway's virtual provider router, which terminates TLS, swaps in the host-side credential for the selected provider, and forwards to the real backend. The existing policy-additions.yaml had no entry for inference.local, so every chat completion was rejected with HTTP 403 'connection not allowed by policy' before reaching OpenShell at all.

Direct verification against api.anthropic.com / inference-api.nvidia.com worked for those specific hosts, but the inference.local short-circuit (which is the actual runtime path for every Hermes-to-provider call under NemoClaw) was unreachable.

The new entry mirrors the equivalent block in nemoclaw-blueprint/policies/openclaw-sandbox.yaml but with Hermes-side binaries (/usr/local/bin/hermes, /usr/bin/python3, /usr/bin/python3.11, plus curl for diagnostics).

Verified via end-to-end Anthropic chat completion: a real Claude response comes back through inference.local once this entry is in place.
…PI build args

The onboard pipeline (src/lib/onboard.ts -> resolveSandboxBuildArgs) sets these on every sandbox build:
NEMOCLAW_PROVIDER_KEY=anthropic (or openai/gemini/inference/...)
NEMOCLAW_PRIMARY_MODEL_REF=<provider>/<model>
NEMOCLAW_INFERENCE_API=<api flavour> (anthropic-messages, openai-completions, ...)

NEMOCLAW_PROVIDER_KEY was already declared as ARG and ENV. The other two were missing, so the runtime image saw blank values regardless of what the onboard wizard configured. NEMOCLAW_INFERENCE_API in particular is how generate-config.ts decides whether the OpenShell proxy should expect Anthropic Messages format vs OpenAI Chat Completions; without it, every non-default sandbox spoke the wrong protocol shape upstream.

This commit only declares the ARGs and promotes them to ENV. The generate-config.ts changes that consume them are in the next commit.
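A minimal sketch of why the undeclared build args mattered, assuming the config generator reads the flavour from its environment and falls back to an OpenAI-style default when the value is blank. The function name `resolveInferenceApi`, the `InferenceApi` type, and the fallback choice are hypothetical; only the variable name and the two flavour strings come from the commit message.

```typescript
// Hypothetical illustration; not the actual generate-config.ts code.
type InferenceApi = "anthropic-messages" | "openai-completions";

// The environment is passed in explicitly so the sketch has no Node.js dependency.
type Env = Record<string, string | undefined>;

// Assumption: a blank or unrecognised value falls back to the OpenAI-compatible flavour.
function resolveInferenceApi(env: Env): InferenceApi {
  const raw = (env["NEMOCLAW_INFERENCE_API"] ?? "").trim();
  return raw === "anthropic-messages" ? "anthropic-messages" : "openai-completions";
}

// Build arg never declared in the Dockerfile: the image sees a blank value.
const beforeFix = resolveInferenceApi({ NEMOCLAW_INFERENCE_API: "" });

// ARG declared and promoted to ENV: the onboard choice survives into the image.
const afterFix = resolveInferenceApi({ NEMOCLAW_INFERENCE_API: "anthropic-messages" });

console.log(beforeFix, afterFix);
```

Under these assumptions, an Anthropic sandbox built without the declaration would silently take the default branch, which is the "wrong protocol shape" failure described above.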
…tial placeholder
generate-config.ts previously hardcoded `provider: "custom"` in the
generated Hermes config.yaml regardless of which provider the user
selected during onboard. This works for OpenAI-compatible providers
because Hermes's "custom" path speaks OpenAI Chat Completions, which
matches what OpenShell's openai-prod / nvidia-prod / gemini-api
adapters accept. It is wrong for Anthropic — the OpenShell
anthropic-prod provider only accepts Anthropic Messages format
(POST /v1/messages), which Hermes only emits when its config has
`provider: anthropic`.
Concretely, an Anthropic-onboard sandbox produced HTTP 400/403 errors
from the OpenShell proxy on every chat completion attempt. The proxy
saw an OpenAI-shaped POST /chat/completions request hitting
inference.local and rejected it because "no compatible inference route
available". This was the second blocker after the missing
managed_inference policy.
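To make the mismatch concrete, here is a rough sketch of the two request shapes. The paths are the ones quoted in this commit message, and the body fields follow the public OpenAI Chat Completions and Anthropic Messages formats. `buildRequest`, `OutboundRequest`, the model name, and the `max_tokens` value are illustrative assumptions, not Hermes or OpenShell code.

```typescript
// Illustrative only: shows the two wire shapes, not Hermes or OpenShell code.
type ApiFlavour = "anthropic-messages" | "openai-completions";

interface OutboundRequest {
  method: "POST";
  path: string;
  body: Record<string, unknown>;
}

function buildRequest(flavour: ApiFlavour, model: string, prompt: string): OutboundRequest {
  const messages = [{ role: "user", content: prompt }];
  if (flavour === "anthropic-messages") {
    // Anthropic Messages requires max_tokens; the value here is arbitrary.
    return { method: "POST", path: "/v1/messages", body: { model, max_tokens: 256, messages } };
  }
  // Path as quoted in the commit message for the OpenAI-shaped request.
  return { method: "POST", path: "/chat/completions", body: { model, messages } };
}

// What a `provider: "custom"` config would send, versus what anthropic-prod accepts.
const wrong = buildRequest("openai-completions", "example-model", "hello");
const right = buildRequest("anthropic-messages", "example-model", "hello");

console.log(wrong.path, right.path);
```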
Changes:
1. Add a mapProvider() function that turns NEMOCLAW_PROVIDER_KEY (set by
src/lib/onboard-providers.ts -> getSandboxInferenceConfig) into the
correct Hermes-side provider value:
anthropic -> provider: anthropic
openai -> provider: openai
inference (gemini/nvidia/custom-OpenAI) -> provider: custom
custom (legacy/default) -> provider: custom
2. Add PROVIDER_API_KEY_ENV to emit a per-provider credential placeholder
line in the generated .env. Hermes's provider adapters read the env
var (e.g. anthropic_adapter.py reads ANTHROPIC_API_KEY) before
sending the request and short-circuit with an in-process
"credentials missing" error if the env var is empty — even when the
actual outbound call goes through OpenShell's L7 proxy with a
substituted credential.
We satisfy the in-process check by writing
ANTHROPIC_API_KEY=openshell:resolve:env:ANTHROPIC_API_KEY (etc).
The proxy rewrites the real header at egress, identical to the
pattern already used for messaging tokens. Verified empirically:
any non-empty x-api-key header is overridden by the proxy before
leaving the cluster, so the placeholder string itself never reaches
the upstream provider.
Custom / inference providers do not get an emitted line because
they do not have a fixed credential env var name in Hermes; that
path expects either no key (proxy-handled) or a per-user override
passed through onboard.
3. Add four parameterised test cases covering anthropic / openai /
inference / legacy-custom mappings. Each verifies both the YAML
provider value and the .env credential placeholder (or its
absence). The existing Telegram DM test still asserts
provider: "custom" for the no-NEMOCLAW_PROVIDER_KEY default, so
onboard flows that don't set the var keep their current
behaviour.
Verified end-to-end: a freshly onboarded Anthropic sandbox now
returns real Claude responses through
POST http://127.0.0.1:8642/v1/chat/completions.
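The mapping and placeholder logic described in changes 1 and 2 could look roughly like the sketch below. The names `mapProvider` and `PROVIDER_API_KEY_ENV` come from the commit message; the function bodies, the `HermesProvider` type, the helper `credentialPlaceholderLine`, the `OPENAI_API_KEY` entry, and the fallback of unknown keys to `custom` are assumptions made for illustration.

```typescript
// Sketch only: names mirror the commit message, bodies are assumed.
type HermesProvider = "anthropic" | "openai" | "custom";

// Mapping as listed in the commit message. Treating unknown or missing keys
// as "custom" is an assumption based on the stated legacy default.
function mapProvider(providerKey: string | undefined): HermesProvider {
  switch (providerKey) {
    case "anthropic":
      return "anthropic";
    case "openai":
      return "openai";
    default:
      return "custom"; // covers "inference", "custom", and unset
  }
}

// Env var each Hermes adapter checks before sending. ANTHROPIC_API_KEY is from
// the commit message; OPENAI_API_KEY is an assumption for the "(etc)" case.
const PROVIDER_API_KEY_ENV: Partial<Record<HermesProvider, string>> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
};

// Hypothetical helper: returns the .env placeholder line, or null when no
// line should be emitted (the custom / inference path).
function credentialPlaceholderLine(provider: HermesProvider): string | null {
  const envName = PROVIDER_API_KEY_ENV[provider];
  return envName ? `${envName}=openshell:resolve:env:${envName}` : null;
}

const provider = mapProvider("anthropic");
console.log(provider, credentialPlaceholderLine(provider));
```

Keeping the env var names in a lookup table rather than in the switch means the custom path emits nothing simply by having no entry, which matches the behaviour described for providers without a fixed credential variable.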
Summary
Three surgical fixes that make the Hermes Agent integration in NemoClaw actually serve a chat completion. Before these fixes it returned `HTTP 403 connection not allowed by policy` on every request; this is the first time it has worked end-to-end since the v0.11.0 bump.

Linear: NS-324
Commits (read in order)
1. `managed_inference` network policy

`agents/hermes/policy-additions.yaml` had no entry for `inference.local`, the OpenShell gateway's virtual provider router, so every Hermes-to-provider call was rejected at the L7 proxy before it could reach the upstream provider. Added a `managed_inference` block mirroring the OpenClaw baseline but with Hermes-side binaries (`/usr/local/bin/hermes`, `/usr/bin/python3{,.11}`, `curl`).

2. Wire missing Dockerfile build args

`NEMOCLAW_PROVIDER_KEY` was already declared as `ARG`/`ENV`. `NEMOCLAW_PRIMARY_MODEL_REF` and `NEMOCLAW_INFERENCE_API` were set by the onboard pipeline (`src/lib/onboard.ts → resolveSandboxBuildArgs`) but never declared in the Hermes Dockerfile, so the runtime image saw blank values regardless of what onboard configured.

3. Map `NEMOCLAW_PROVIDER_KEY` to Hermes provider with credential placeholder

`agents/hermes/generate-config.ts` previously hardcoded `provider: "custom"` regardless of the chosen provider. This works for OpenAI-compatible flows because Hermes's `custom` path speaks OpenAI Chat Completions, but it is broken for Anthropic, because OpenShell's `anthropic-prod` provider only accepts Anthropic Messages format (`POST /v1/messages`).

The fix has two parts:
a. Provider mapping
b. Credential placeholder
Hermes's provider adapters (e.g. `anthropic_adapter.py`) read the per-provider env var (e.g. `ANTHROPIC_API_KEY`) before sending and short-circuit with "credentials missing" if it is empty, even when the actual outbound call goes through OpenShell's L7 proxy with a substituted credential. Fix: emit an `ANTHROPIC_API_KEY=openshell:resolve:env:ANTHROPIC_API_KEY` placeholder in `.env`. The proxy rewrites the real header at egress (verified empirically: any non-empty `x-api-key` works, because the proxy overrides it). Same pattern as messaging tokens.

4 new test cases in `test/hermes-generate-config.test.ts` cover anthropic / openai / inference / legacy-custom mappings (5 tests pass).

Verification
End-to-end verified against Anthropic: `POST http://127.0.0.1:8642/v1/chat/completions` to a freshly-onboarded sandbox returns real Claude Opus 4.7 responses through `inference.local`.

Out of scope (deliberately)
Nous Portal as a NemoClaw provider: separate ticket NS-322. That's the fast-follow PR. It will add the `nous` provider option in `src/lib/onboard-providers.ts`, the `nous` mapping in `mapProvider()`, the OAuth handoff, and the corresponding policy hosts (`portal.nousresearch.com`, `inference-api.nousresearch.com`, and the 5× `<vendor>-gateway.nousresearch.com` for the Tool Gateway).

socat workaround removal: Hermes v2026.4.23 supports an `API_SERVER_HOST=0.0.0.0` env override (the comment in `start.sh` saying "Hermes binds to 127.0.0.1 regardless of config (upstream bug)" is stale; the env-var override has worked since at least v2026.4.8). Removing the workaround cleanly requires auto-generating an `API_SERVER_KEY` (Hermes refuses 0.0.0.0 binding without one), wiring `gateway-token` for Hermes (currently OpenClaw-only; it reads `openclaw.json`), and coordinating dashboard-port detection between nemoclaw and the start script (currently `PUBLIC_PORT=8642` is hardcoded). Filing a follow-up ticket.

Other observations (filing follow-ups, not in scope here)
`nemoclaw rebuild` doesn't detect `Dockerfile.base` content drift: `agent-onboard.ts` skips rebuilding the base image if it exists locally. When the HA pin in `Dockerfile.base` was bumped to v2026.4.23, users with cached older bases stayed on the old version even after `rebuild`. Workaround: `docker rmi … && nemoclaw rebuild`. CI rebuilds and pushes `:latest` on Dockerfile.base changes, so production users get it via `docker pull` automatically; only local devs hit this.

Port forward dies during rebuild: `openshell forward list` shows the gateway port `dead` after every rebuild. Manual `forward stop` + `forward start` recovers. Probably a known race in nemoclaw's lifecycle.

`_skill_commands.clear()` monkey-patch in `agents/hermes/plugin/__init__.py`: depends on hermes-agent#17670 landing for a public API replacement. Will reland once that PR is merged.