Commit df3f0bc

fix(sandbox): refresh docker/podman/vm tokens in gateway
Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
1 parent 5bcc462 commit df3f0bc

42 files changed

Lines changed: 2203 additions & 487 deletions

Cargo.lock

Lines changed: 1 addition & 0 deletions

architecture/compute-runtimes.md

Lines changed: 23 additions & 1 deletion
@@ -29,7 +29,7 @@ reason strings.
 | Docker | Local development with Docker available. | Container plus nested sandbox namespace. | Uses host networking so loopback gateway endpoints work from the supervisor. |
 | Podman | Rootless or single-machine deployments. | Container plus nested sandbox namespace. | Uses the Podman REST API, OCI image volumes, and CDI GPU devices when available. |
 | Kubernetes | Cluster deployment through Helm. | Pod plus nested sandbox namespace. | Uses Kubernetes API objects, service accounts, secrets, PVC-backed workspace storage, and GPU resources. |
-| VM | Experimental microVM isolation. | Per-sandbox libkrun VM. | Gateway spawns `openshell-driver-vm` as a subprocess over a private, state-local Unix socket. The VM driver boots a cached bootstrap `rootfs.ext4`, prepares requested OCI images inside a bootstrap VM with `umoci`, attaches the prepared image disk read-only, and gives each sandbox a writable `overlay.ext4` for merged-root changes and runtime material. The driver persists each accepted launch request beside the overlay and restarts those VMs on driver startup without recreating the overlay. |
+| VM | Experimental microVM isolation. | Per-sandbox libkrun VM. | Gateway spawns `openshell-driver-vm` as a subprocess over a private, state-local Unix socket. The VM driver boots a cached bootstrap `rootfs.ext4`, prepares requested OCI images inside a bootstrap VM with `umoci`, attaches the prepared image disk read-only, and gives each sandbox a writable `overlay.ext4` for merged-root changes and runtime material. The driver persists each accepted launch request beside the overlay; the gateway explicitly calls the driver's resume RPC on startup so it can supply a fresh sandbox token before the VM is relaunched. |
 
 Per-sandbox CPU and memory values currently enter the driver layer through
 template resource limits. Docker and Podman apply them as runtime limits.
@@ -64,6 +64,28 @@ Driver-controlled environment variables must override sandbox image or template
 values for sandbox ID, sandbox name, gateway endpoint, relay socket path, TLS
 paths, and command metadata.
 
+## Sandbox Tokens
+
+When gateway-minted sandbox JWTs are enabled, each runtime declares its token
+contract with `OPENSHELL_SANDBOX_AUTH_MODE`:
+
+- Docker and Podman use `gateway-managed-file`. The gateway writes host token
+  files that are mounted read-only into the container, and the supervisor
+  re-reads `OPENSHELL_SANDBOX_TOKEN_FILE` on outbound gateway calls.
+- VM uses `gateway-managed-supervisor-push`. The gateway supplies a fresh token
+  through the driver's resume/write RPCs and sends live token updates over
+  `ConnectSupervisor` so the guest can rewrite
+  `/opt/openshell/auth/sandbox.jwt`.
+- Kubernetes uses `kubernetes-service-account-exchange`. The supervisor reads
+  the projected ServiceAccount token from `OPENSHELL_K8S_SA_TOKEN_FILE` and
+  exchanges it for a gateway JWT with `IssueSandboxToken`.
+
+During startup, local-driver resume hooks receive a freshly minted token before
+starting or re-adopting each persisted sandbox. The gateway also runs a refresh
+sweep after startup resume and then rotates local-runtime tokens before expiry.
+This lets a local sandbox recover after the gateway, container, or VM was
+stopped long enough for the previous token to expire.
+
 ## Images
 
 The gateway image and Helm chart are built from this repository. Sandbox images
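
The Sandbox Tokens text in this hunk says the gateway rotates local-runtime tokens before expiry but does not show the decision itself. A minimal sketch of such a check follows. The function name `needs_refresh` and the `margin` parameter are assumptions made for illustration and do not appear in the commit; the real gateway may use a different policy.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical helper (not from the commit): returns `true` when a token
/// should be rotated because it has already expired or because no more than
/// `margin` of its lifetime remains.
pub fn needs_refresh(expires_at: SystemTime, now: SystemTime, margin: Duration) -> bool {
    match expires_at.duration_since(now) {
        // Token is still valid: refresh only inside the safety margin.
        Ok(remaining) => remaining <= margin,
        // `expires_at` is earlier than `now`: the token has expired.
        Err(_) => true,
    }
}
```

A sweep over persisted sandboxes could call a check like this once per sandbox and mint a new token whenever it returns `true`, which also covers the case where the gateway was stopped past the previous expiry.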

crates/openshell-core/Cargo.toml

Lines changed: 1 addition & 3 deletions
@@ -20,6 +20,7 @@ serde = { workspace = true }
 serde_json = { workspace = true }
 url = { workspace = true }
 ipnet = "2"
+tempfile = "3"
 
 [features]
 ## Include test-only settings (dummy_bool, dummy_int) in the registry.
@@ -31,8 +32,5 @@ dev-settings = []
 tonic-build = { workspace = true }
 protobuf-src = { workspace = true }
 
-[dev-dependencies]
-tempfile = "3"
-
 [lints]
 workspace = true

crates/openshell-core/src/paths.rs

Lines changed: 48 additions & 0 deletions
@@ -116,6 +116,38 @@ pub fn ensure_parent_dir_restricted(path: &Path) -> Result<()> {
     Ok(())
 }
 
+/// Atomically write a sensitive file with owner-only read/write permissions.
+///
+/// The parent directory is created with [`create_dir_restricted`]. The content
+/// is written to a sibling temporary file, synced, chmodded to `0o600` on Unix,
+/// and then renamed into place.
+pub fn write_file_owner_only_atomic(path: &Path, contents: &[u8]) -> Result<()> {
+    let parent = path
+        .parent()
+        .ok_or_else(|| miette::miette!("path has no parent: {}", path.display()))?;
+    create_dir_restricted(parent)?;
+    let mut temp = tempfile::Builder::new()
+        .prefix(".openshell-")
+        .tempfile_in(parent)
+        .into_diagnostic()
+        .wrap_err_with(|| format!("failed to create temp file in {}", parent.display()))?;
+
+    std::io::Write::write_all(&mut temp, contents)
+        .into_diagnostic()
+        .wrap_err_with(|| format!("failed to write temp file for {}", path.display()))?;
+    temp.as_file()
+        .sync_all()
+        .into_diagnostic()
+        .wrap_err_with(|| format!("failed to sync temp file for {}", path.display()))?;
+    set_file_owner_only(temp.path())?;
+    temp.persist(path)
+        .map_err(|err| err.error)
+        .into_diagnostic()
+        .wrap_err_with(|| format!("failed to rename temp file into {}", path.display()))?;
+    set_file_owner_only(path)?;
+    Ok(())
+}
+
 /// Check whether a file has permissions that are too open (group/other readable).
 ///
 /// Returns `true` if the file has group or other read/write/execute bits set.
@@ -180,6 +212,22 @@ mod tests {
         assert_eq!(mode, 0o600, "expected 0600, got {mode:04o}");
     }
 
+    #[test]
+    fn write_file_owner_only_atomic_replaces_contents() {
+        let tmp = tempfile::tempdir().unwrap();
+        let file = tmp.path().join("nested").join("secret");
+        write_file_owner_only_atomic(&file, b"first\n").unwrap();
+        write_file_owner_only_atomic(&file, b"second\n").unwrap();
+
+        assert_eq!(std::fs::read_to_string(&file).unwrap(), "second\n");
+        #[cfg(unix)]
+        {
+            use std::os::unix::fs::PermissionsExt;
+            let mode = std::fs::metadata(&file).unwrap().permissions().mode() & 0o777;
+            assert_eq!(mode, 0o600, "expected 0600, got {mode:04o}");
+        }
+    }
+
     #[cfg(unix)]
     #[test]
     fn is_file_permissions_too_open_detects_world_readable() {
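
The new `write_file_owner_only_atomic` relies on the `tempfile` and `miette` crates plus helpers defined elsewhere in `paths.rs`. To show the underlying write-temp, sync, chmod, rename pattern in isolation, here is a standard-library-only sketch. The name `write_owner_only_atomic_sketch` and the fixed temporary file name are assumptions for illustration, not the crate's API. Unlike the commit, it restricts permissions before writing, because `File::create` does not create owner-only files by default, and it creates the parent with default permissions rather than through `create_dir_restricted`.

```rust
use std::ffi::OsString;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Std-only sketch of an atomic owner-only write (hypothetical helper).
pub fn write_owner_only_atomic_sketch(path: &Path, contents: &[u8]) -> io::Result<()> {
    let parent = path
        .parent()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no parent"))?;
    let file_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?;
    // Simplification: default directory permissions, not the restricted helper.
    fs::create_dir_all(parent)?;

    // Sibling temp file so the final rename stays on one filesystem.
    // Simplification: a fixed name; the real code uses a random one.
    let mut tmp_name = OsString::from(".openshell-");
    tmp_name.push(file_name);
    tmp_name.push(".tmp");
    let tmp_path = parent.join(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    #[cfg(unix)]
    {
        use std::os::unix::fs::PermissionsExt;
        // Restrict to 0600 before any secret bytes reach the disk.
        tmp.set_permissions(fs::Permissions::from_mode(0o600))?;
    }
    tmp.write_all(contents)?;
    tmp.sync_all()?;
    drop(tmp);

    // Rename replaces the destination in one step on POSIX filesystems,
    // so readers see either the old or the new contents, never a partial file.
    fs::rename(&tmp_path, path)?;
    Ok(())
}
```

The rename step is what matters for token files that a supervisor re-reads while the gateway rewrites them: a reader that opens the path mid-update still gets a complete token.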

crates/openshell-core/src/sandbox_env.rs

Lines changed: 75 additions & 9 deletions
@@ -35,21 +35,87 @@ pub const TLS_CERT: &str = "OPENSHELL_TLS_CERT";
 /// Path to the private key for mTLS communication with the gateway.
 pub const TLS_KEY: &str = "OPENSHELL_TLS_KEY";
 
-/// Raw gateway-minted JWT identifying this sandbox. Mutually exclusive with
-/// [`SANDBOX_TOKEN_FILE`] / [`K8S_SA_TOKEN_FILE`]; used only by test harnesses
-/// that bypass the file-mount path.
+/// Selects how the supervisor bootstraps sandbox authentication and who owns
+/// token refresh.
+pub const SANDBOX_AUTH_MODE: &str = "OPENSHELL_SANDBOX_AUTH_MODE";
+
+/// Explicit sandbox authentication modes.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum SandboxAuthMode {
+    /// Use [`SANDBOX_TOKEN`] as a static gateway JWT.
+    ///
+    /// This is intended for direct test/debug harnesses. The supervisor does
+    /// not refresh the token.
+    StaticToken,
+
+    /// Use [`SANDBOX_TOKEN_FILE`] as a gateway-managed token file.
+    ///
+    /// Docker and Podman use this mode. The gateway refreshes the host-side
+    /// file and the supervisor re-reads it on outbound calls.
+    GatewayManagedFile,
+
+    /// Use [`SANDBOX_TOKEN_FILE`] as a supervisor-writable token file.
+    ///
+    /// The VM driver uses this mode. The gateway injects a fresh token into
+    /// persisted VM state on resume and pushes live token updates over the
+    /// supervisor control stream.
+    GatewayManagedSupervisorPush,
+
+    /// Use [`K8S_SA_TOKEN_FILE`] to exchange Kubernetes workload identity for
+    /// a gateway JWT.
+    ///
+    /// The supervisor re-exchanges the projected `ServiceAccount` token when
+    /// the gateway JWT needs rotation.
+    KubernetesServiceAccountExchange,
+}
+
+impl SandboxAuthMode {
+    #[must_use]
+    pub const fn as_str(self) -> &'static str {
+        match self {
+            Self::StaticToken => "static-token",
+            Self::GatewayManagedFile => "gateway-managed-file",
+            Self::GatewayManagedSupervisorPush => "gateway-managed-supervisor-push",
+            Self::KubernetesServiceAccountExchange => "kubernetes-service-account-exchange",
+        }
+    }
+
+    #[must_use]
+    pub fn allowed_values() -> &'static str {
+        "static-token, gateway-managed-file, gateway-managed-supervisor-push, kubernetes-service-account-exchange"
+    }
+}
+
+impl std::str::FromStr for SandboxAuthMode {
+    type Err = String;
+
+    fn from_str(value: &str) -> Result<Self, Self::Err> {
+        match value {
+            "static-token" => Ok(Self::StaticToken),
+            "gateway-managed-file" => Ok(Self::GatewayManagedFile),
+            "gateway-managed-supervisor-push" => Ok(Self::GatewayManagedSupervisorPush),
+            "kubernetes-service-account-exchange" => Ok(Self::KubernetesServiceAccountExchange),
+            other => Err(format!(
+                "invalid sandbox auth mode '{other}' (expected one of: {})",
+                Self::allowed_values()
+            )),
+        }
+    }
+}
+
+/// Raw gateway-minted JWT identifying this sandbox. Used only when
+/// [`SANDBOX_AUTH_MODE`] is [`SandboxAuthMode::StaticToken`].
 pub const SANDBOX_TOKEN: &str = "OPENSHELL_SANDBOX_TOKEN";
 
 /// Path to the file holding a gateway-minted sandbox JWT.
 ///
-/// Set by the Docker, Podman, and VM drivers, which write the token to a
-/// bundle file at sandbox-create time. Read once at supervisor startup;
-/// the token is held in process memory thereafter.
+/// Set by Docker, Podman, and VM when [`SANDBOX_AUTH_MODE`] is
+/// [`SandboxAuthMode::GatewayManagedFile`] or
+/// [`SandboxAuthMode::GatewayManagedSupervisorPush`].
 pub const SANDBOX_TOKEN_FILE: &str = "OPENSHELL_SANDBOX_TOKEN_FILE";
 
 /// Path to the projected `ServiceAccount` JWT (Kubernetes driver).
 ///
-/// Used to bootstrap a gateway-minted JWT via `IssueSandboxToken`. Kubelet
-/// writes and rotates this file; the supervisor exchanges its contents
-/// for a gateway JWT at startup and on refresh.
+/// Used when [`SANDBOX_AUTH_MODE`] is
+/// [`SandboxAuthMode::KubernetesServiceAccountExchange`].
 pub const K8S_SA_TOKEN_FILE: &str = "OPENSHELL_K8S_SA_TOKEN_FILE";
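
The doc comments in this hunk tie each auth mode to the environment variable that carries its credential. The sketch below spells that mapping out as code. It redeclares a trimmed copy of `SandboxAuthMode` so that it compiles on its own; `token_env_var` and `resolve_token_env_var` are hypothetical helpers that do not appear in the commit, and the supervisor's actual lookup logic may differ.

```rust
use std::str::FromStr;

/// Trimmed copy of the enum from the diff, repeated so the sketch stands alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxAuthMode {
    StaticToken,
    GatewayManagedFile,
    GatewayManagedSupervisorPush,
    KubernetesServiceAccountExchange,
}

impl FromStr for SandboxAuthMode {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "static-token" => Ok(Self::StaticToken),
            "gateway-managed-file" => Ok(Self::GatewayManagedFile),
            "gateway-managed-supervisor-push" => Ok(Self::GatewayManagedSupervisorPush),
            "kubernetes-service-account-exchange" => Ok(Self::KubernetesServiceAccountExchange),
            other => Err(format!("invalid sandbox auth mode '{other}'")),
        }
    }
}

/// Hypothetical helper: which environment variable holds the credential
/// (or the path to it) for a given mode, per the doc comments in the diff.
pub fn token_env_var(mode: SandboxAuthMode) -> &'static str {
    match mode {
        SandboxAuthMode::StaticToken => "OPENSHELL_SANDBOX_TOKEN",
        SandboxAuthMode::GatewayManagedFile
        | SandboxAuthMode::GatewayManagedSupervisorPush => "OPENSHELL_SANDBOX_TOKEN_FILE",
        SandboxAuthMode::KubernetesServiceAccountExchange => "OPENSHELL_K8S_SA_TOKEN_FILE",
    }
}

/// Hypothetical helper: parse the raw `OPENSHELL_SANDBOX_AUTH_MODE` value
/// and report which variable the supervisor would consult next.
pub fn resolve_token_env_var(raw_mode: &str) -> Result<&'static str, String> {
    raw_mode.parse::<SandboxAuthMode>().map(token_env_var)
}
```

Parsing into an enum before dispatching means an unknown mode string fails early with a readable error instead of silently falling through to a default credential source.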

crates/openshell-driver-docker/README.md

Lines changed: 2 additions & 0 deletions
@@ -80,6 +80,8 @@ overwrites security-critical keys:
 - `OPENSHELL_SANDBOX`
 - `OPENSHELL_SSH_SOCKET_PATH`
 - `OPENSHELL_SANDBOX_COMMAND`
+- `OPENSHELL_SANDBOX_AUTH_MODE=gateway-managed-file` and
+  `OPENSHELL_SANDBOX_TOKEN_FILE` when gateway JWT auth is enabled
 - TLS path variables when HTTPS is enabled
 
 Do not allow sandbox images or templates to override these values.
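
The README hunk above requires that driver-controlled keys win over anything a sandbox image or template sets. One way to picture that rule is a merge in which driver values are applied last. The function `merge_sandbox_env` is a hypothetical illustration and is not taken from the Docker driver, and any values passed to it in examples are placeholders.

```rust
use std::collections::BTreeMap;

/// Hypothetical sketch: combine template-provided variables with
/// driver-controlled ones so that the driver's value always wins.
pub fn merge_sandbox_env(
    template_env: &BTreeMap<String, String>,
    driver_env: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    // Start from the untrusted template values.
    let mut merged = template_env.clone();
    // Apply driver values last; `insert` overwrites any existing entry,
    // so a template cannot override a security-critical key.
    for (key, value) in driver_env {
        merged.insert(key.clone(), value.clone());
    }
    merged
}
```

With this ordering, a template that sets `OPENSHELL_SANDBOX_AUTH_MODE=static-token` still ends up with the driver's `gateway-managed-file`, while unrelated template variables pass through unchanged.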
