feat(sandbox): add callable python exec API and refresh e2e coverage (!19)
Closes NVIDIA#13
## Summary
- add Python sandbox execution APIs for command and callable workflows
- consolidate sandbox policy fixtures and expand e2e test coverage for policy and Python exec paths
- update CI/build config and images for sandbox e2e execution dependencies
## Test Plan
- mise run pre-commit
| Port conflict | Another service on 6443/80/443/30051 | Stop conflicting service or change port mapping |
| gRPC connect refused to `127.0.0.1:443` in CI | Docker daemon is remote (`DOCKER_HOST=tcp://...`) but metadata still points to loopback | Verify metadata endpoint host matches `DOCKER_HOST` and includes non-loopback host |
| DNS failures inside container | Entrypoint DNS detection failed | Check `/etc/rancher/k3s/resolv.conf` and container startup logs |
| `metrics-server` errors in logs | Normal k3s noise, not the root cause | These errors are benign; look for the actual failing health check component |
| Stale NotReady nodes from previous deploys | Volume reused across container recreations | The deploy flow now auto-cleans stale nodes; if it still fails, manually delete NotReady nodes (see Step 3) or choose "Recreate" when prompted |
| gRPC `UNIMPLEMENTED` for newer RPCs in push mode | Helm values still point at older pulled images instead of the pushed refs | Verify rendered `navigator-helmchart.yaml` uses the expected push refs (`server`, `sandbox`, `pki-job`) and not `:latest` |
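
The remote-Docker row in the table above (gRPC connect refused to `127.0.0.1:443`) can be checked mechanically. The following is a minimal sketch, not the project's implementation: the function names `is_loopback` and `endpoint_mismatch` are hypothetical, and it only covers the loopback part of the check, not whether the endpoint host equals the `DOCKER_HOST` host.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback(host: str) -> bool:
    """Return True for 'localhost' or any loopback IP literal (127.0.0.0/8, ::1)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name); treat as non-loopback.
        return False


def endpoint_mismatch(docker_host: str, endpoint: str) -> bool:
    """Return True when DOCKER_HOST is a remote tcp:// daemon but the
    endpoint still points at a loopback address (hypothetical helper)."""
    remote = urlparse(docker_host)
    if remote.scheme != "tcp" or not remote.hostname or is_loopback(remote.hostname):
        # Local daemon (unix socket or loopback TCP): loopback endpoint is fine.
        return False
    # Accept both "https://host:port" and bare "host:port" endpoints.
    target = urlparse(endpoint if "://" in endpoint else f"//{endpoint}")
    return target.hostname is not None and is_loopback(target.hostname)
```

For example, `endpoint_mismatch("tcp://10.0.0.5:2376", "https://127.0.0.1:443")` flags the situation described in that row, while a `unix://` `DOCKER_HOST` never does.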
When `NAVIGATOR_PUSH_IMAGES` is enabled, the entrypoint rewrites HelmChart image tags from `latest` to `IMAGE_TAG` and now handles both quoted and unquoted `tag: latest` formats.
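
To illustrate what handling both quoted and unquoted `tag: latest` involves, here is a rough Python sketch of such a rewrite. The real entrypoint is a shell script, so this is only an analogy: the function name `rewrite_latest_tags` is hypothetical, and preserving the original quoting style is an assumption rather than documented behavior.

```python
import re

# Matches `tag: latest`, `tag: "latest"`, and `tag: 'latest'` on their own line.
# Group 1 keeps the indentation and key; group 2 keeps the quote character, if any.
_TAG_LATEST = re.compile(r"""^([ \t]*tag:[ \t]*)(["']?)latest\2[ \t]*$""", re.MULTILINE)


def rewrite_latest_tags(manifest: str, image_tag: str) -> str:
    """Replace every `tag: latest` value with image_tag, keeping the quoting (assumed)."""
    return _TAG_LATEST.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{image_tag}{m.group(2)}",
        manifest,
    )
```

Anchoring the pattern to the `tag:` key and the end of the line keeps it from touching other values that happen to contain `latest`, such as `tag: latest-debug`.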
In push mode, bootstrap also passes exact imported image refs (`server`, `sandbox`, `pki-job`) to the entrypoint, which rewrites Helm values to those refs directly before the tag/pull-policy overrides. Image import uses the `k8s.io` containerd namespace so kubelet resolves the pushed refs without falling back to pulled registry tags. After import, bootstrap restarts `deployment/navigator` and waits for rollout completion so running pods pick up the imported image references.
architecture/containers.md: 2 additions, 0 deletions
@@ -57,6 +57,7 @@ The server container runs the Navigator orchestration service.
- Exposes gRPC/HTTP on port 8080
- Health checks at `/healthz`
- SQLx migrations copied from source
- Uses an embedded Rust SSH client (`russh`) for sandbox exec
### navigator-cluster
@@ -79,6 +80,7 @@ An airgapped k3s image with all components pre-loaded for single-container deplo
When running k3s in Docker, the container's `/etc/resolv.conf` contains Docker's internal DNS (127.0.0.11), which is not reachable from k3s pods. While k3s auto-detects this and falls back to 8.8.8.8, external UDP traffic doesn't work reliably on Docker Desktop.
The `cluster-entrypoint.sh` script solves this by:
1. Detecting the Docker host gateway IP from `/etc/hosts` (requires `--add-host=host.docker.internal:host-gateway`)
2. Writing a custom resolv.conf with the host gateway as the nameserver
3. Passing `--resolv-conf` to k3s to use this configuration
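
The three steps above can be sketched in Python as follows. This is an illustrative approximation only: the actual logic lives in the `cluster-entrypoint.sh` shell script, and the helper names (`find_host_gateway`, `render_resolv_conf`, `k3s_dns_args`) are hypothetical.

```python
from typing import List, Optional


def find_host_gateway(hosts_text: str, name: str = "host.docker.internal") -> Optional[str]:
    """Step 1: return the IP mapped to `name` in /etc/hosts content, or None."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


def render_resolv_conf(gateway_ip: str) -> str:
    """Step 2: build a resolv.conf that uses the host gateway as the nameserver."""
    return f"nameserver {gateway_ip}\n"


def k3s_dns_args(resolv_conf_path: str) -> List[str]:
    """Step 3: extra k3s arguments that point it at the custom resolv.conf."""
    return ["--resolv-conf", resolv_conf_path]
```

If `find_host_gateway` returns `None`, the container was most likely started without `--add-host=host.docker.internal:host-gateway`, which is the precondition noted in step 1.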