fix(shields): exec direct sandbox container on VM driver (#4245)#4290
fix(shields): exec direct sandbox container on VM driver (#4245)#4290yimoj wants to merge 2 commits into
Conversation
`nemoclaw <sandbox> shields up/down` on macOS Docker Desktop with the OpenShell VM driver routed privileged execs through `docker exec openshell-cluster-nemoclaw kubectl exec ...`. VM-driver gateways have no k3s cluster container, so shields could neither lock nor unlock config. The host-side config writer in `sandbox/config.ts` had the same `openshellDriver === "docker"` gate and was broken on the same path. Extract a shared `privileged-container` resolver and argv dispatcher that treats both `docker` and `vm` as direct-container drivers and keeps the legacy `openshell-cluster-nemoclaw` kubectl path only for gateways that actually use it. Preserve longest-known-name disambiguation so sandbox names that are prefixes of other sandbox names (`my` vs `my-assistant`) cannot steal each other's containers. Add unit coverage: - shields and config `privilegedSandboxExecArgv` on `vm` and `docker` drivers target `docker exec --user root <container> ...` - legacy `kubernetes` driver still threads `-i` to both docker exec and kubectl exec - prefix-collision sandbox names fall back to kubectl when no exact container exists Closes NVIDIA#4245 Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughExtracts privileged sandbox container resolution and exec-argv construction into a shared helper that detects direct-container drivers (docker, vm), resolves target containers, builds docker exec or legacy kubectl exec argv, and updates shields and config to delegate to the new utilities. ChangesPrivileged Container Execution Refactoring
Sequence DiagramsequenceDiagram
participant Shields as shields.privilegedSandboxExecArgv
participant Shared as privileged-container
participant Registry as Registry
participant Docker as Docker
participant DirectExec as docker exec<br/>--user root
participant GatewayExec as docker exec + kubectl<br/>legacy gateway
Shields->>Shared: buildPrivilegedExecArgv(sandbox, cmd, directContainer)
Shared->>Shared: isDirectContainerDriver(driver)?
alt Direct Driver (docker/vm)
Shared->>Registry: Query sandbox driver
Shared->>Docker: docker ps (running containers)
Shared->>Shared: selectPrivilegedSandboxContainer<br/>(resolve target container)
Shared->>DirectExec: buildDirectContainerExecArgv<br/>(docker exec --user root)
else Legacy Driver (k3s or unknown)
Shared->>GatewayExec: buildKubectlExecArgv<br/>(docker exec + kubectl)
end
Shared-->>Shields: argv: string[]
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/shields-vm-driver-argv.test.ts`:
- Around line 47-65: stubRegistry currently mocks registry.getSandbox to only
return a sandbox when the exact name equals "my-assistant", which makes prefix
lookups (e.g., "my") fail and produces false-positive fallback behavior; update
the mock implementation in stubRegistry so registry.getSandbox(name) returns the
sandbox when the name equals the sandboxName OR when the provided name is a
prefix of sandboxName (also ensure listSandboxes returns the provided
opts.sandboxNames mapped to objects with openshellDriver), and keep the
dockerRun.dockerCapture mock returning opts.containerNames; update references:
stubRegistry, registry.getSandbox, registry.listSandboxes,
dockerRun.dockerCapture.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8668daec-f148-4d3c-9ab7-f2dff174222a
📒 Files selected for processing (6)
src/lib/sandbox/config.tssrc/lib/sandbox/privileged-container.tssrc/lib/shields/index.tstest/config-set.test.tstest/privileged-container.test.tstest/shields-vm-driver-argv.test.ts
…test CodeRabbit on NVIDIA#4290 pointed out that `stubRegistry` only registered a sandbox entry for `"my-assistant"`. In the prefix-collision case the test looks up `"my"`, so `registry.getSandbox("my")` returned null and `resolvePrivilegedSandboxContainer` short-circuited before `selectPrivilegedSandboxContainer`'s longest-known-name heuristic ever ran. The test was passing for the wrong reason. Mock `getSandbox` for every name in `sandboxNames` so the prefix test actually hits the disambiguation logic and would catch a regression in the longest-known-name selector. Also drop the unused `beforeEach` import flagged by CodeQL. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
Closing — #4287 just landed on For the maintainers' awareness, the follow-up I called out in the PR description still applies after #4287: Signed-off-by: Yimo Jiang yimoj@nvidia.com |
Summary
nemoclaw <sandbox> shields up/downon macOS Docker Desktop with the OpenShell VM driver was routing privileged execs throughdocker exec openshell-cluster-nemoclaw kubectl exec .... VM-driver gateways have no k3s cluster container, so shields could neither lock nor unlock config (and host-initiated config writes hit the same path).Extract a shared resolver/dispatcher so
shields/index.tsandsandbox/config.tsboth treatopenshellDriver === "vm"as a direct-container driver and only fall back to the legacyopenshell-cluster-nemoclawkubectl path for gateways that actually use it.Related Issue
Closes #4245
Changes
src/lib/sandbox/privileged-container.tsexportsselectPrivilegedSandboxContainer,resolvePrivilegedSandboxContainer,buildDirectContainerExecArgv,buildKubectlExecArgv, andbuildPrivilegedExecArgv.dockerandvmare the direct-container drivers; everything else falls back to kubectl through the legacy gateway container.src/lib/shields/index.tsandsrc/lib/sandbox/config.tsdelegate privileged-exec resolution to the shared module so they cannot drift on driver gating or argv shape.myvsmy-assistant) cannot steal the longer sandbox's container; the exactopenshell-<name>always beats a suffixed sibling.test/privileged-container.test.ts(20 tests) — pure selector, argv builders, driver-class detection, prefix-collision and exact-name precedence.test/shields-vm-driver-argv.test.ts(6 tests) — integration coverage provingshields.privilegedSandboxExecArgvandsandbox/config.privilegedSandboxExecArgvroute todocker exec --user root <openshell-sandbox-container> ...onvm/dockerdrivers, fall back to kubectl forkubernetes, and refuse to steal a prefix-sibling's container.test/config-set.test.ts— two additional VM-driver cases on the existingselectDockerDriverSandboxContainerhelper.Type of Change
Verification
npm run typecheck:cliclean.test/privileged-container.test.ts,test/shields-vm-driver-argv.test.ts,test/config-set.test.ts).src/lib/shields/,src/lib/sandbox/,test/rebuild-shields-*,test/config-*,test/privileged-container,test/shields-vm-driver-argv).npx prek run --all-filesnot run in this worktree (hooks-managed; happens at commit).Follow-up out of scope
src/lib/actions/sandbox/host-aliases.tshardcodes the sameK3S_CONTAINER = "openshell-cluster-nemoclaw"and unconditionally routeskubectlthrough it. That command is going to fail the same way on a VM-driver gateway. Filing a separate follow-up rather than expanding the scope of this fix.🤖 Generated with Claude Code
Signed-off-by: Yimo Jiang yimoj@nvidia.com
Summary by CodeRabbit