fix(hermes): unblock Hermes Agent inference end-to-end (NS-324) by shannonsands · Pull Request #1 · NousResearch/NemoClaw

shannonsands · 2026-04-30T04:56:22Z

Summary

Three surgical fixes that make the Hermes Agent integration in NemoClaw actually serve a chat completion. Before these fixes, every request returned HTTP 403 "connection not allowed by policy". This is the first time the integration has worked end-to-end since the v0.11.0 bump.

Linear: NS-324

Commits (read in order)

1. `managed_inference` network policy

agents/hermes/policy-additions.yaml had no entry for inference.local — the OpenShell gateway's virtual provider router. So every Hermes-to-provider call was rejected at the L7 proxy before it could reach the upstream provider. Added a managed_inference block mirroring the OpenClaw baseline but with Hermes-side binaries (/usr/local/bin/hermes, /usr/bin/python3{,.11}, curl).
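To illustrate why a missing entry yields a 403 on every call, here is a toy model of a host-plus-binary allowlist. It is a sketch only: `PolicyEntry`, `checkEgress`, and the `/usr/bin/curl` path are hypothetical names chosen for illustration, and the real OpenShell policy schema and matching rules may differ.

```typescript
// Toy model of an egress allowlist. NOT OpenShell's actual schema or matcher.
interface PolicyEntry {
  host: string;       // destination host the entry permits
  binaries: string[]; // absolute paths of executables allowed to reach it
}

type Verdict =
  | { allowed: true }
  | { allowed: false; status: 403; reason: string };

// Deny unless some entry names both the host and the calling binary.
function checkEgress(policy: PolicyEntry[], host: string, binary: string): Verdict {
  const entry = policy.find((e) => e.host === host);
  if (entry === undefined || !entry.binaries.includes(binary)) {
    return { allowed: false, status: 403, reason: "connection not allowed by policy" };
  }
  return { allowed: true };
}

// Before the fix: no entry for inference.local at all.
const policyBefore: PolicyEntry[] = [];

// After the fix: an entry listing the Hermes-side binaries named in this PR.
// The curl path is an assumption; the PR only says "curl".
const policyAfter: PolicyEntry[] = [
  {
    host: "inference.local",
    binaries: [
      "/usr/local/bin/hermes",
      "/usr/bin/python3",
      "/usr/bin/python3.11",
      "/usr/bin/curl",
    ],
  },
];

const verdictBefore = checkEgress(policyBefore, "inference.local", "/usr/local/bin/hermes");
const verdictAfter = checkEgress(policyAfter, "inference.local", "/usr/local/bin/hermes");
```

In this model `verdictBefore` is a 403 denial and `verdictAfter` is an allow, which matches the behaviour change described for this commit.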

2. Wire missing Dockerfile build args

NEMOCLAW_PROVIDER_KEY was already declared as ARG/ENV. NEMOCLAW_PRIMARY_MODEL_REF and NEMOCLAW_INFERENCE_API were set by the onboard pipeline (src/lib/onboard.ts → resolveSandboxBuildArgs) but never declared in the Hermes Dockerfile, so the runtime image saw blank values regardless of what onboard configured.
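The effect of the undeclared build args can be sketched as a small check over an environment map. This is a minimal sketch, not code from the repository: `blankBuildVars` is a hypothetical helper, and the example model ref is only an illustrative instance of the `<provider>/<model>` shape.

```typescript
// The three build-time values discussed in this commit.
const INFERENCE_BUILD_VARS = [
  "NEMOCLAW_PROVIDER_KEY",
  "NEMOCLAW_PRIMARY_MODEL_REF",
  "NEMOCLAW_INFERENCE_API",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: list the variables that are unset or blank in `env`.
function blankBuildVars(env: Env): string[] {
  return INFERENCE_BUILD_VARS.filter((name) => (env[name] ?? "").trim() === "");
}

// Pre-fix runtime image: only NEMOCLAW_PROVIDER_KEY was declared as ARG/ENV,
// so the other two never reached the container.
const preFixEnv: Env = { NEMOCLAW_PROVIDER_KEY: "anthropic" };

// Post-fix: all three are declared and promoted to ENV.
const postFixEnv: Env = {
  NEMOCLAW_PROVIDER_KEY: "anthropic",
  NEMOCLAW_PRIMARY_MODEL_REF: "anthropic/claude-opus-4-7", // illustrative value
  NEMOCLAW_INFERENCE_API: "anthropic-messages",
};

const missingBefore = blankBuildVars(preFixEnv);
// → ["NEMOCLAW_PRIMARY_MODEL_REF", "NEMOCLAW_INFERENCE_API"]
const missingAfter = blankBuildVars(postFixEnv);
// → []
```

A check like this could run at container start to fail loudly instead of silently using blank values, though the PR itself does not add one.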

3. Map `NEMOCLAW_PROVIDER_KEY` to Hermes provider with credential placeholder

agents/hermes/generate-config.ts previously hardcoded provider: "custom" regardless of the chosen provider. That works for OpenAI-compatible flows, because Hermes's custom path speaks OpenAI Chat Completions, but it is broken for Anthropic: OpenShell's anthropic-prod provider only accepts Anthropic Messages format (POST /v1/messages).

The fix has two parts:

a. Provider mapping

function mapProvider(providerKey: string): string {
  switch (providerKey) {
    case "anthropic": return "anthropic";
    case "openai":    return "openai";
    case "inference":           // gemini/nvidia/custom-OpenAI
    case "custom":              // legacy/default
    default:          return "custom";
  }
}
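The mapping above can be exercised with a small table-driven check. The function is repeated here so the snippet runs on its own; the empty-string case is an added assumption standing in for an unset NEMOCLAW_PROVIDER_KEY.

```typescript
function mapProvider(providerKey: string): string {
  switch (providerKey) {
    case "anthropic": return "anthropic";
    case "openai":    return "openai";
    case "inference":           // gemini/nvidia/custom-OpenAI
    case "custom":              // legacy/default
    default:          return "custom";
  }
}

// [NEMOCLAW_PROVIDER_KEY value, expected Hermes provider]
const mappingCases: Array<[string, string]> = [
  ["anthropic", "anthropic"],
  ["openai", "openai"],
  ["inference", "custom"],
  ["custom", "custom"],
  ["", "custom"], // assumption: an unset key arrives as an empty string
];

// Collect mismatches instead of throwing, so all cases are reported together.
const mappingFailures = mappingCases.filter(
  ([key, expected]) => mapProvider(key) !== expected,
);
```

`mappingFailures` is empty when every key maps as listed. Unknown keys also fall through to "custom", so a new provider has to be added to the switch explicitly before Hermes will use its native adapter.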

b. Credential placeholder

Hermes's provider adapters (e.g. anthropic_adapter.py) read the per-provider env var (e.g. ANTHROPIC_API_KEY) before sending and short-circuit with "credentials missing" if empty — even when the actual outbound call goes through OpenShell's L7 proxy with a substituted credential. Fix: emit ANTHROPIC_API_KEY=openshell:resolve:env:ANTHROPIC_API_KEY placeholder in .env. The proxy rewrites the real header at egress (verified empirically — any non-empty x-api-key works, the proxy overrides it). Same pattern as messaging tokens.
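A minimal sketch of the placeholder emission, assuming a lookup from Hermes provider to env var name. The PR names `PROVIDER_API_KEY_ENV` and the `ANTHROPIC_API_KEY` line; the `OPENAI_API_KEY` entry, the use of a `Map`, and the `credentialPlaceholderLine` helper are assumptions made for illustration, not the code in generate-config.ts.

```typescript
// Hermes provider -> env var its adapter checks before sending.
// Only the anthropic entry is stated in the PR; the openai name is assumed.
const PROVIDER_API_KEY_ENV = new Map<string, string>([
  ["anthropic", "ANTHROPIC_API_KEY"],
  ["openai", "OPENAI_API_KEY"],
]);

// Hypothetical helper: return the .env line for a provider, or undefined when
// the provider has no fixed credential env var (custom / inference).
function credentialPlaceholderLine(hermesProvider: string): string | undefined {
  const envVar = PROVIDER_API_KEY_ENV.get(hermesProvider);
  if (envVar === undefined) {
    return undefined;
  }
  // The value is a placeholder, never a real secret: per the PR, the proxy
  // overrides the credential header at egress.
  return `${envVar}=openshell:resolve:env:${envVar}`;
}

const anthropicLine = credentialPlaceholderLine("anthropic");
// → "ANTHROPIC_API_KEY=openshell:resolve:env:ANTHROPIC_API_KEY"
const customLine = credentialPlaceholderLine("custom");
// → undefined
```

Returning `undefined` rather than an empty line keeps the generated .env free of blank assignments for providers that need no placeholder.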

4 new test cases in test/hermes-generate-config.test.ts cover anthropic / openai / inference / legacy-custom mappings (5 tests pass).

Verification

End-to-end verified against Anthropic — POST http://127.0.0.1:8642/v1/chat/completions to a freshly-onboarded sandbox returns real Claude Opus 4.7 responses through inference.local.

{"id": "chatcmpl-...", "object": "chat.completion", "model": "claude-opus-4-7",
 "choices": [{"message": {"role": "assistant", "content": "pong"}}],
 "usage": {"prompt_tokens": 14842, "completion_tokens": 6}}
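The verification request can be reproduced roughly as follows. This is a sketch under assumptions: the body uses the standard OpenAI Chat Completions shape, `buildChatRequest` is a hypothetical helper, and whether the Hermes API server also requires an Authorization header is not stated in this PR.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string; // JSON-encoded Chat Completions payload
}

// Build (but do not send) a chat completion request for the sandbox endpoint.
function buildChatRequest(
  prompt: string,
  model: string = "claude-opus-4-7",
  baseUrl: string = "http://127.0.0.1:8642",
): ChatRequest {
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return {
    url: `${baseUrl}/v1/chat/completions`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, messages }),
  };
}

const pingRequest = buildChatRequest("ping");
```

On Node 18 or later the request can be sent with `fetch(pingRequest.url, { method: pingRequest.method, headers: pingRequest.headers, body: pingRequest.body })`, and a working sandbox should answer with a `chat.completion` object like the one above.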

Out of scope (deliberately)

- Nous Portal as a NemoClaw provider: separate ticket NS-322, the fast-follow PR. It will add the nous provider option in src/lib/onboard-providers.ts, the nous mapping in mapProvider(), the OAuth handoff, and the corresponding policy hosts (portal.nousresearch.com, inference-api.nousresearch.com, and the 5× <vendor>-gateway.nousresearch.com for the Tool Gateway).
- socat workaround removal: Hermes v2026.4.23 supports an API_SERVER_HOST=0.0.0.0 env override. The comment in start.sh saying "Hermes binds to 127.0.0.1 regardless of config (upstream bug)" is stale; the env-var override has worked since at least v2026.4.8. Removing the workaround cleanly requires auto-generating an API_SERVER_KEY (Hermes refuses 0.0.0.0 binding without one), wiring gateway-token for Hermes (currently OpenClaw-only; it reads openclaw.json), and coordinating dashboard-port detection between nemoclaw and the start script (PUBLIC_PORT=8642 is currently hardcoded). Filing a follow-up ticket.

Other observations (filing follow-ups, not in scope here)

- nemoclaw rebuild doesn't detect Dockerfile.base content drift: agent-onboard.ts skips rebuilding the base image if it exists locally. When the Hermes Agent (HA) pin in Dockerfile.base was bumped to v2026.4.23, users with cached older bases stayed on the old version even after a rebuild. Workaround: docker rmi … && nemoclaw rebuild. CI rebuilds and pushes :latest on Dockerfile.base changes, so production users get the new base via docker pull automatically; only local devs hit this.
- Port forward dies during rebuild: openshell forward list shows the gateway port dead after every rebuild. A manual forward stop + forward start recovers it. Probably a known race in nemoclaw's lifecycle.
- _skill_commands.clear() monkey-patch in agents/hermes/plugin/__init__.py: depends on hermes-agent#17670 landing for a public API replacement. Will reland once that PR is merged.

… routing

All Hermes inference traffic goes through https://inference.local, the OpenShell gateway's virtual provider router that terminates TLS, swaps in the host-side credential for the selected provider, and forwards to the real backend.

The existing policy-additions.yaml had no entry for inference.local, so every chat completion was rejected with HTTP 403 'connection not allowed by policy' before reaching OpenShell at all. Direct verification against api.anthropic.com / inference-api.nvidia.com worked for those specific hosts, but the inference.local short-circuit (which is the actual runtime path for every Hermes-to-provider call under NemoClaw) was unreachable.

Mirrors the equivalent block in nemoclaw-blueprint/policies/openclaw-sandbox.yaml but with Hermes-side binaries (/usr/local/bin/hermes, /usr/bin/python3, /usr/bin/python3.11, plus curl for diagnostics).

Verified via end-to-end Anthropic chat completion: real Claude response through inference.local after this entry is in place.

…PI build args

The onboard pipeline (src/lib/onboard.ts -> resolveSandboxBuildArgs) sets these on every sandbox build:

  NEMOCLAW_PROVIDER_KEY=anthropic (or openai/gemini/inference/...)
  NEMOCLAW_PRIMARY_MODEL_REF=<provider>/<model>
  NEMOCLAW_INFERENCE_API=<api flavour> (anthropic-messages, openai-completions, ...)

NEMOCLAW_PROVIDER_KEY was already declared as ARG and ENV. The other two were missing, so the runtime image saw blank values regardless of what the onboard wizard configured.

NEMOCLAW_INFERENCE_API in particular is how generate-config.ts decides whether the OpenShell proxy should expect Anthropic Messages format vs OpenAI Chat Completions; without it, every non-default sandbox spoke the wrong protocol shape upstream.

This commit only declares the ARGs and promotes them to ENV. The generate-config.ts changes that consume them are in the next commit.

…tial placeholder

generate-config.ts previously hardcoded `provider: "custom"` in the generated Hermes config.yaml regardless of which provider the user selected during onboard. This works for OpenAI-compatible providers because Hermes's "custom" path speaks OpenAI Chat Completions, which matches what OpenShell's openai-prod / nvidia-prod / gemini-api adapters accept.

It is wrong for Anthropic: the OpenShell anthropic-prod provider only accepts Anthropic Messages format (POST /v1/messages), which Hermes only emits when its config has `provider: anthropic`.

Concretely, an Anthropic-onboard sandbox produced HTTP 400/403 errors from the OpenShell proxy on every chat completion attempt. The proxy saw an OpenAI-shaped POST /chat/completions request hitting inference.local and rejected it because "no compatible inference route available". This was the second blocker after the missing managed_inference policy.

Changes:

1. Add a mapProvider() function that turns NEMOCLAW_PROVIDER_KEY (set by src/lib/onboard-providers.ts -> getSandboxInferenceConfig) into the correct Hermes-side provider value:

     anthropic                               -> provider: anthropic
     openai                                  -> provider: openai
     inference (gemini/nvidia/custom-OpenAI) -> provider: custom
     custom (legacy/default)                 -> provider: custom

2. Add PROVIDER_API_KEY_ENV to emit a per-provider credential placeholder line in the generated .env. Hermes's provider adapters read the env var (e.g. anthropic_adapter.py reads ANTHROPIC_API_KEY) before sending the request and short-circuit with an in-process "credentials missing" error if the env var is empty, even when the actual outbound call goes through OpenShell's L7 proxy with a substituted credential. We satisfy the in-process check by writing ANTHROPIC_API_KEY=openshell:resolve:env:ANTHROPIC_API_KEY (etc). The proxy rewrites the real header at egress, identical to the pattern already used for messaging tokens.

   Verified empirically: any non-empty x-api-key header is overridden by the proxy before leaving the cluster, so the placeholder string itself never reaches the upstream provider.

   Custom / inference providers do not get an emitted line because they do not have a fixed credential env var name in Hermes; that path expects either no key (proxy-handled) or a per-user override passed through onboard.

3. Add four parameterised test cases covering anthropic / openai / inference / legacy-custom mappings. Each verifies both the YAML provider value and the .env credential placeholder (or its absence). The existing Telegram DM test still asserts provider: "custom" for the no-NEMOCLAW_PROVIDER_KEY default, so onboard flows that don't set the var keep their current behaviour.

Verified end-to-end: a freshly onboarded Anthropic sandbox now returns real Claude responses through POST http://127.0.0.1:8642/v1/chat/completions.

shannonsands added 3 commits April 30, 2026 14:37

shannonsands changed the base branch from main to feat/nemohermes-reference-blueprint April 30, 2026 04:58

shannonsands wants to merge 3 commits into feat/nemohermes-reference-blueprint from ns324/hermes-inference-fixes
