bug: gateway startup fails when triggered via SSH → docker exec → kubectl exec chain (setsid does not survive) #1175

@mahu888

Description

When the NemoClaw gateway is started through a nested execution chain (SSH → docker exec into the OpenShell k3s container → kubectl exec into the agent pod), the gateway process exits immediately after the kubectl exec session closes.

Expected: Gateway stays running as a background daemon after the startup command returns.

Actual: Gateway process receives SIGHUP and exits when the kubectl exec PTY closes. Port 18789 is not bound after the command returns.

Reproduction Steps

  1. Set up NemoClaw on a remote EC2 instance (OpenShell 0.0.16, k3s cluster inside Docker container openshell-cluster-nemoclaw)
  2. SSH into the EC2 host
  3. Attempt to start the gateway via the nested exec chain:
ssh ubuntu@<host> "docker exec openshell-cluster-nemoclaw \
  kubectl exec -n openshell my-assistant -c agent -- \
  sh -c 'setsid gosu gateway openclaw gateway run &'"
  4. Wait for the SSH session to return
  5. Check if port 18789 is bound:
ssh ubuntu@<host> "docker exec openshell-cluster-nemoclaw \
  kubectl exec -n openshell my-assistant -c agent -- \
  ss -tlnp | grep 18789"

→ No output. Gateway has exited.
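
A plain `grep 18789` can also match unrelated fields in the `ss` output, so a stricter check may help when confirming the failure. The sketch below assumes the usual `ss -tln` column layout (state first, local address and port fourth); the function name `port_is_listening` is hypothetical and not part of NemoClaw or OpenShell.

```shell
#!/bin/sh
# Sketch: succeed only if a TCP listener on the given port appears in
# `ss -tln` output supplied on stdin. The function name is hypothetical.
port_is_listening() {
  awk -v port="$1" '
    # Data rows start with the socket state; column 4 is "Local Address:Port".
    $1 == "LISTEN" {
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage against the chain from the steps above (not run here):
#   ssh ubuntu@<host> "docker exec openshell-cluster-nemoclaw \
#     kubectl exec -n openshell my-assistant -c agent -- ss -tln" \
#     | port_is_listening 18789 && echo "gateway is listening"
```

The function exits non-zero when no listener is found, so it can be used directly in an `if` or with `&&`.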

Environment

  • Host OS: Ubuntu 22.04 (AWS EC2)
  • Docker Engine: 27.x
  • OpenShell: 0.0.16
  • NemoClaw: v0.1.0
  • k3s running inside Docker container openshell-cluster-nemoclaw
  • Gateway triggered remotely via SSH → docker exec → kubectl exec

Debug Output

Logs

**Workaround:** Use `docker exec -d` (detached) at the host level instead. The Docker daemon then owns the process, which prevents SIGHUP propagation:


ssh ubuntu@<host> "docker exec -d openshell-cluster-nemoclaw \
  kubectl exec -n openshell my-assistant -c agent -- \
  sh -c 'gosu gateway openclaw gateway run >> /tmp/openclaw-gateway.log 2>&1'"


This reliably keeps the gateway alive because detachment happens at the Docker layer, outside the kubectl exec PTY lifecycle.
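
Because `docker exec -d` returns immediately, the gateway may not have bound its port yet when the next command runs. A small polling helper can make verification less racy. This is a minimal sketch, not part of NemoClaw: `retry` and the `gateway_up` check in the comments are hypothetical names, and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Sketch: run a check command up to ATTEMPTS times, sleeping DELAY seconds
# between tries. Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical check built from the port test in the reproduction steps
# (not run here):
#   gateway_up() {
#     ssh ubuntu@<host> "docker exec openshell-cluster-nemoclaw \
#       kubectl exec -n openshell my-assistant -c agent -- \
#       ss -tln" | grep -q ':18789 '
#   }
#   retry 10 2 gateway_up || echo "gateway did not bind port 18789" >&2
```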

**Root cause:** `setsid` creates a new session inside the container, but in some k3s configurations the process still inherits the controlling terminal of the `kubectl exec` PTY, so SIGHUP reaches the new session when the exec session closes.
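
One way to test this explanation is to look at the session ID and TTY column of the gateway process while the exec session is still open. The sketch below assumes a procps-style `ps` is available in the agent container, which this report does not confirm; `tty_of_matching` is a hypothetical helper name. A process without a controlling terminal normally shows `?` in the TTY column, so a `pts/N` value there would support the explanation above, while `?` would suggest it needs revisiting.

```shell
#!/bin/sh
# Sketch: read `ps -e -o pid= -o sid= -o tty= -o args=` output on stdin and
# print "PID SID TTY" for each process whose command line contains PATTERN.
tty_of_matching() {
  awk -v pat="$1" '{
    # Rebuild the command line from field 4 onward.
    args = ""
    for (i = 4; i <= NF; i++) args = args (i > 4 ? " " : "") $i
    if (index(args, pat) > 0) print $1, $2, $3
  }'
}

# Usage inside the agent pod (not run here; assumes procps ps):
#   ps -e -o pid= -o sid= -o tty= -o args= | tty_of_matching "openclaw gateway"
```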

Checklist

  • I confirmed this bug is reproducible
  • I searched existing issues and this is not a duplicate

Metadata

    Labels

    NemoClaw CLI, bug
