fix(security): harden sandbox SSH with mandatory HMAC secret, NetworkPolicy, and nonce replay detection (#127)
* fix(security): harden sandbox SSH with mandatory HMAC secret, NetworkPolicy, and nonce replay detection
Closes NVIDIA#25
- Make NEMOCLAW_SSH_HANDSHAKE_SECRET mandatory: server and sandbox both
refuse to start if the secret is empty/unset. Cluster deployments
auto-generate it via openssl rand in the entrypoint script.
- Add Kubernetes NetworkPolicy restricting sandbox port 2222 ingress to
the gateway pod only, preventing lateral movement from other cluster
workloads.
- Add NSSH1 nonce replay detection with a TTL-bounded cache, rejecting
replayed handshakes within the timestamp validity window.
- Add unit tests for verify_preface (valid, replay, expired, bad HMAC,
malformed) and env injection.
* fix(deploy): pass sshHandshakeSecret in fast deploy helm upgrade
---------
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
3. Reject if skew exceeds `ssh_handshake_skew_secs` (default: 300 seconds)
4. Recompute HMAC-SHA256 over `token|timestamp|nonce` with the shared secret
5. Compare computed signature against the received signature (constant-time via `hmac` crate)
- 6. Respond with `OK\n` on success or `ERR\n` on failure
+ 6. Check nonce against the replay cache; reject if the nonce has been seen before within the skew window
+ 7. Insert the nonce into the replay cache on success
+ 8. Respond with `OK\n` on success or `ERR\n` on failure
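
For illustration only, the skew check (step 3) and the final response line (step 8) might look like the sketch below. The names `within_skew` and `response_line` are hypothetical and do not come from this commit. The sketch assumes timestamps are Unix seconds and that a skew exactly equal to the limit is still accepted, since the doc only says to reject when the skew "exceeds" the limit.

```rust
/// Step 3 (sketch): accept a preface timestamp only if it is within
/// `max_skew_secs` of the verifier's clock, in either direction.
/// Assumption: both timestamps are Unix seconds.
pub fn within_skew(now_secs: i64, preface_ts_secs: i64, max_skew_secs: u64) -> bool {
    // `abs_diff` returns the absolute difference as u64 without overflow.
    now_secs.abs_diff(preface_ts_secs) <= max_skew_secs
}

/// Step 8 (sketch): the line written back to the peer.
pub fn response_line(accepted: bool) -> &'static str {
    if accepted {
        "OK\n"
    } else {
        "ERR\n"
    }
}
```

With the default of 300 seconds, a preface stamped 300 seconds ahead of or behind the verifier's clock passes this check, and one stamped 301 seconds away does not.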
+
+ ### Nonce replay detection
+
+ The SSH server maintains a per-process `NonceCache` (`HashMap<String, Instant>` behind `Arc<Mutex<...>>`) that tracks nonces seen within the handshake skew window. A background tokio task reaps expired entries every 60 seconds. If a valid preface is presented with a previously-seen nonce, the handshake is rejected. This prevents replay attacks within the timestamp validity window.
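
A minimal, std-only sketch of such a cache is shown below. It is not the code from this commit: the method names `check_and_insert` and `reap` are hypothetical, the check and insert are folded into one locked operation as a simplification, and the tokio reaper is left out (a periodic task would simply call `reap`). Time is passed in explicitly so the behaviour can be exercised without sleeping.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the `NonceCache` described above.
#[derive(Clone)]
pub struct NonceCache {
    seen: Arc<Mutex<HashMap<String, Instant>>>,
    ttl: Duration, // assumed equal to the handshake skew window
}

impl NonceCache {
    pub fn new(ttl: Duration) -> Self {
        Self {
            seen: Arc::new(Mutex::new(HashMap::new())),
            ttl,
        }
    }

    /// Returns `true` and records the nonce if it is fresh; returns `false`
    /// if the same nonce was already recorded within the TTL (a replay).
    pub fn check_and_insert(&self, nonce: &str, now: Instant) -> bool {
        let mut seen = self.seen.lock().unwrap();
        if let Some(&first_seen) = seen.get(nonce) {
            if now.duration_since(first_seen) < self.ttl {
                return false;
            }
        }
        seen.insert(nonce.to_owned(), now);
        true
    }

    /// Drops entries older than the TTL and returns how many were removed.
    /// A background task could call this on a fixed interval.
    pub fn reap(&self, now: Instant) -> usize {
        let mut seen = self.seen.lock().unwrap();
        let before = seen.len();
        let ttl = self.ttl;
        seen.retain(|_, first_seen| now.duration_since(*first_seen) < ttl);
        before - seen.len()
    }
}
```

Holding the lock across both the lookup and the insert means two concurrent handshakes carrying the same nonce cannot both pass the check. Whether the real implementation does this in one step or two is not stated in the diff.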
### HMAC computation
@@ -517,7 +523,12 @@ This function is shared between the CLI and TUI via the `navigator-core::forward
1. **mTLS (transport layer)** -- when TLS is configured, the CLI authenticates to the gateway using client certificates. The `ssh-proxy` subprocess inherits TLS options from the parent CLI process.
2. **Session token (application layer)** -- the gateway validates the session token against the persistence layer. Tokens are scoped to a specific sandbox and can be revoked.
- 3. **NSSH1 handshake (gateway-to-sandbox)** -- the shared handshake secret proves the connection originated from an authorized gateway. The timestamp + nonce prevent replay attacks within the skew window.
+ 3. **NSSH1 handshake (gateway-to-sandbox)** -- the shared handshake secret proves the connection originated from an authorized gateway. The timestamp + nonce prevent replay attacks within the skew window. The nonce replay cache rejects duplicates.
+ 4. **Kubernetes NetworkPolicy** -- a Helm-managed `NetworkPolicy` restricts ingress to sandbox pods on port 2222 to only the gateway pod, preventing lateral movement from other in-cluster workloads. Controlled by `networkPolicy.enabled` in the Helm values (default: `true`).
+
+ ### Mandatory handshake secret
+
+ The NSSH1 handshake secret (`NEMOCLAW_SSH_HANDSHAKE_SECRET`) is required. Both the server and sandbox will refuse to start if the secret is empty or unset. For cluster deployments the secret is auto-generated by the entrypoint script (`deploy/docker/cluster-entrypoint.sh`) via `openssl rand -hex 32` and injected into the Helm values.
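
As a rough sketch of the fail-closed startup check described here, assuming the secret is read from the process environment: the function names and the `SecretError` type below are hypothetical, and the real server and sandbox may read the value through their own config layers instead.

```rust
use std::env;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum SecretError {
    Unset,
    Empty,
}

/// Rejects a missing or empty secret; otherwise passes the value through.
pub fn validate_secret(raw: Option<String>) -> Result<String, SecretError> {
    match raw {
        None => Err(SecretError::Unset),
        Some(s) if s.is_empty() => Err(SecretError::Empty),
        Some(s) => Ok(s),
    }
}

/// Reads `NEMOCLAW_SSH_HANDSHAKE_SECRET` from the environment. A caller
/// would abort startup on `Err` rather than fall back to a default.
pub fn load_handshake_secret() -> Result<String, SecretError> {
    validate_secret(env::var("NEMOCLAW_SSH_HANDSHAKE_SECRET").ok())
}
```

Keeping the validation separate from the environment lookup makes the empty and unset cases easy to unit-test without mutating process-wide state.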
### What SSH auth does NOT enforce
@@ -542,7 +553,7 @@ The sandbox generates a fresh Ed25519 host key on every startup. The CLI disable
| `ssh_gateway_port` | `8080` | Public port for gateway connections (0 = use bind port) |
| `ssh_connect_path` | `/connect/ssh` | HTTP path for CONNECT requests |
| `sandbox_ssh_port` | `2222` | SSH listen port inside sandbox pods |
- | `ssh_handshake_secret` | (empty) | Shared HMAC key for NSSH1 handshake |
+ | `ssh_handshake_secret` | (required) | Shared HMAC key for NSSH1 handshake (server fails to start if empty) |
| `ssh_handshake_skew_secs` | `300` | Maximum allowed clock skew (seconds) |