feat(box): --cap-add for fine-grained capability control #597

Draft
G4614 wants to merge 2 commits into boxlite-ai:main from G4614:feat/cap-add

@G4614 G4614 commented May 26, 2026

Adds a repeatable --cap-add <CAP> flag for fine-grained Linux capabilities.
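
As a rough illustration of what "repeatable" means here, the sketch below collects every `--cap-add <CAP>` occurrence into a list. It is not the PR's code: the real CLI presumably uses an argument-parsing library, `parse_cap_add` is a hypothetical helper, and `BoxOptions` is a stand-in that models only the `added_caps` field named in this PR.

```rust
/// Hypothetical stand-in for the PR's BoxOptions; only `added_caps` is modeled.
#[derive(Debug, Default, PartialEq)]
struct BoxOptions {
    added_caps: Vec<String>,
}

/// Hypothetical hand-rolled parser: each `--cap-add <CAP>` pair appends one
/// entry, so the flag can be repeated. All other arguments are ignored.
fn parse_cap_add<I, S>(args: I) -> Result<BoxOptions, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut opts = BoxOptions::default();
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        if arg.as_ref() == "--cap-add" {
            // The flag must be followed by a capability name.
            let value = iter
                .next()
                .ok_or_else(|| "--cap-add requires a value".to_string())?;
            opts.added_caps.push(value.as_ref().to_string());
        }
    }
    Ok(opts)
}
```

Passing `--cap-add NET_ADMIN --cap-add SYS_ADMIN` would leave both names in `added_caps` in order, while omitting the flag leaves the list empty.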

ALL grants the full 41-cap OCI set. Beyond the capability list, granting SYS_ADMIN/ALL flips the container into the privileged shape that workloads like dockerd need:

  • a rw cgroup2 mount at /sys/fs/cgroup (so dockerd can manage cgroups), and
  • a writable /proc/sys: build_linux_spec clears the OCI default readonlyPaths/maskedPaths so dockerd can write net.ipv4.ip_forward to bring up its bridge (matching docker --privileged).
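
The privileged shape described in the two bullets above can be sketched roughly as follows. This is a simplified model, not the PR's implementation: `LinuxSpec`, `is_privileged`, and `build_linux_spec_sketch` are hypothetical names, the path lists are abbreviated examples of the usual defaults, and the real code builds an OCI spec rather than a plain struct.

```rust
/// Hypothetical, simplified view of the parts of the spec this PR touches.
#[derive(Debug, Clone, PartialEq)]
struct LinuxSpec {
    readonly_paths: Vec<String>,
    masked_paths: Vec<String>,
    /// Assumption: models whether /sys/fs/cgroup is mounted read-write.
    cgroup_mount_writable: bool,
}

// Abbreviated, illustrative defaults; the real lists are longer.
const DEFAULT_READONLY_PATHS: &[&str] = &["/proc/bus", "/proc/fs", "/proc/irq", "/proc/sys"];
const DEFAULT_MASKED_PATHS: &[&str] = &["/proc/kcore", "/proc/keys", "/sys/firmware"];

fn to_owned_paths(paths: &[&str]) -> Vec<String> {
    paths.iter().map(|p| p.to_string()).collect()
}

/// Granting SYS_ADMIN or ALL selects the privileged shape.
/// Assumption: names are compared exactly, without a CAP_ prefix.
fn is_privileged(added_caps: &[&str]) -> bool {
    added_caps.iter().any(|cap| matches!(*cap, "ALL" | "SYS_ADMIN"))
}

fn build_linux_spec_sketch(privileged: bool) -> LinuxSpec {
    if privileged {
        // Empty sets override the defaults, so /proc/sys stays writable.
        LinuxSpec {
            readonly_paths: Vec::new(),
            masked_paths: Vec::new(),
            cgroup_mount_writable: true,
        }
    } else {
        // Non-privileged boxes keep the protective defaults.
        LinuxSpec {
            readonly_paths: to_owned_paths(DEFAULT_READONLY_PATHS),
            masked_paths: to_owned_paths(DEFAULT_MASKED_PATHS),
            cgroup_mount_writable: false,
        }
    }
}
```

The key point is that the privileged branch must set the path lists to empty explicitly; leaving them unset would fall back to the defaults and keep /proc/sys read-only.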

This PR is capability control only; it does no kernel selection (that lives in #596). Running docker-in-docker therefore needs #596 stacked: #596 supplies the --kernel net blob with netfilter/iptables; #597 supplies the caps, the privileged mounts, and the writable /proc/sys. With both, boxlite run --kernel net --cap-add ALL docker:dind starts dockerd end-to-end.

Test plan

Unit tests on the cap → spec boundary:

  • cli.rs: cap_add_propagates_to_options, cap_add_all_propagates, no_cap_add_leaves_empty. These verify that --cap-add reaches BoxOptions.added_caps.
  • spec.rs: build_capabilities_all_grants_dangerous_caps — the ALL effective set includes SYS_ADMIN/NET_ADMIN. Regression guard: the original ALL branch double-prefixed cap names (CAP_CAP_*), so every insert errored and ALL silently degraded to the 14-cap default — missing exactly the caps dockerd needs.
  • spec.rs: privileged_linux_spec_clears_readonly_and_masked_paths — privileged clears the /proc/sys hardening; non-privileged keeps the OCI defaults.
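
The double-prefix regression guarded by build_capabilities_all_grants_dangerous_caps can be reproduced in miniature. The sketch below is a self-contained model, not the guest code: `Cap` is a hypothetical three-variant enum standing in for the real 41-capability type, and `expand_all_buggy` is an invented name for the original ALL branch. `capability_names` and `all_capabilities` borrow the names used in the commit message, but their bodies here are assumptions.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the real capability enum (3 of 41 caps).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cap {
    SysAdmin,
    NetAdmin,
    Chown,
}

impl FromStr for Cap {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "CAP_SYS_ADMIN" => Ok(Cap::SysAdmin),
            "CAP_NET_ADMIN" => Ok(Cap::NetAdmin),
            "CAP_CHOWN" => Ok(Cap::Chown),
            other => Err(format!("unknown capability: {other}")),
        }
    }
}

/// Names are already prefixed with "CAP_", as in the commit message.
fn capability_names() -> &'static [&'static str] {
    &["CAP_SYS_ADMIN", "CAP_NET_ADMIN", "CAP_CHOWN"]
}

/// Buggy shape: prefixing again yields "CAP_CAP_*", every parse fails,
/// and the errors are silently dropped, so ALL adds nothing.
fn expand_all_buggy() -> Vec<Cap> {
    capability_names()
        .iter()
        .filter_map(|name| Cap::from_str(&format!("CAP_{name}")).ok())
        .collect()
}

/// Fixed shape: enumerate the capabilities directly, with no string round-trip.
fn all_capabilities() -> Vec<Cap> {
    vec![Cap::SysAdmin, Cap::NetAdmin, Cap::Chown]
}
```

Because the buggy loop discards parse errors, the failure is invisible at runtime; a test asserting that the ALL set contains SYS_ADMIN and NET_ADMIN is what catches it.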

Verified on this host:

  • cargo nextest run -p boxlite-guest (cap/spec tests) + -p boxlite-cli (cap tests) — pass.
  • make fmt:check:rust + cargo clippy --all-targets -- -D warnings — clean.
  • e2e (stacked with #596, "feat(kernel): build-time lean/net kernel selection", plus a make libkrunfw-net blob): boxlite run --kernel net --cap-add ALL docker:dind → dockerd reaches "Daemon has completed initialization" and "API listen on /var/run/docker.sock"; docker version shows client+server, docker info shows overlayfs storage + cgroup v2.

Before, --cap-add ALL was a no-op (the double-CAP_ prefix dropped every cap) and /proc/sys stayed read-only, so dockerd couldn't start. After, full caps + writable /proc/sys + cgroup-rw let dind run — once #596's net kernel is stacked underneath.

🤖 Generated with Claude Code

@G4614 G4614 marked this pull request as draft May 27, 2026 06:48
@G4614 G4614 force-pushed the feat/cap-add branch 5 times, most recently from e3f9692 to d2cc206 Compare May 28, 2026 13:51
gamnaansong and others added 2 commits May 29, 2026 04:02
Repeatable `--cap-add CAP` flag adds individual Linux capabilities
to the container. "ALL" grants every cap. CAP_SYS_ADMIN triggers
cgroup2 rw mount automatically.

Wired end-to-end: CLI → BoxOptions.added_caps → proto → guest OCI spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cap-add's ALL expansion was a no-op: build_capabilities double-prefixed
cap names (format!("CAP_{name}") over capability_names(), which already
returns "CAP_*"), so every Capability::from_str("CAP_CAP_*") errored and
the container silently kept only the 14-cap default set — missing
SYS_ADMIN/NET_ADMIN that privileged workloads need. Replace the broken
loop with all_capabilities() enumerating all 41 OCI capabilities.

Privileged containers (cap-add ALL/SYS_ADMIN) also need a writable
/proc/sys: dockerd writes net.ipv4.ip_forward to bring up its bridge.
LinuxBuilder otherwise defaults readonlyPaths/maskedPaths to the OCI
lists, which remount /proc/sys read-only — override them with empty sets
when privileged (matching `docker --privileged`). Non-privileged boxes
keep the protective defaults.

Together these let `boxlite run --kernel net --cap-add ALL docker:dind`
start dockerd end-to-end: Daemon initialized, API listening on
/var/run/docker.sock, `docker version` client+server, overlayfs storage,
cgroup v2. Adds guest unit tests for both (all_capabilities superset,
ALL effective set includes SYS_ADMIN/NET_ADMIN, privileged spec clears
readonly/masked paths while non-privileged keeps /proc/sys read-only).

Also rustfmt the kernel-net build.rs constants/Fetcher calls, committed
unformatted in the kernel-selection commit, so the branch passes fmt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>