Commit d1ce9bb

chore: merge main into approval loop
Signed-off-by: Alexander Watson <zredlined@gmail.com>
2 parents abe84ef + f1ed347 commit d1ce9bb

55 files changed

Lines changed: 3789 additions & 477 deletions

.agents/skills/build-from-issue/SKILL.md

Lines changed: 9 additions & 1 deletion
@@ -148,7 +148,8 @@ In the prompt, instruct the reviewer to:
 - **Medium**: Multiple files/components, some design decisions, but well-scoped
 - **High**: Cross-cutting changes, architectural decisions needed, significant unknowns
 8. Call out risks, unknowns, and decisions that need stakeholder input.
-9. Assess **LSM compatibility** — if the change touches process identity, `/proc` filesystem access, binary execution, or inter-process visibility, flag whether it will behave differently on hosts running SELinux (enforcing) or AppArmor. In particular, tests that fork+exec into system binaries will fail on SELinux-enforcing hosts due to cross-label `/proc/<pid>/exe` access restrictions.
+9. Assess **gateway config documentation impact** — if the change adds, removes, renames, or changes defaults for gateway TOML keys or driver-specific config options, the plan must include an update to `docs/reference/gateway-config.mdx`. If the change is surfaced through Helm or a compute-driver overview, also include `docs/reference/sandbox-compute-drivers.mdx` or the relevant deployment docs.
+10. Assess **LSM compatibility** — if the change touches process identity, `/proc` filesystem access, binary execution, or inter-process visibility, flag whether it will behave differently on hosts running SELinux (enforcing) or AppArmor. In particular, tests that fork+exec into system binaries will fail on SELinux-enforcing hosts due to cross-label `/proc/<pid>/exe` access restrictions.
 
 ### A2: Post the Plan Comment
 
@@ -436,6 +437,13 @@ Review the documentation requirements in `AGENTS.md` and update any affected
 docs as part of the implementation. Keep documentation changes scoped to the
 behavior or subsystem that changed.
 
+If the implementation changes gateway TOML parsing, `[openshell.gateway]`
+fields, `[openshell.drivers.<name>]` fields, driver config defaults, or Helm
+rendering of `gateway.toml`, update `docs/reference/gateway-config.mdx` in the
+same branch. If the change affects user-facing compute-driver setup, also
+update `docs/reference/sandbox-compute-drivers.mdx` or the relevant deployment
+page.
+
 ### Step 12: Commit and Push
 
 Commit all changes using conventional commit format. The `<type>` comes from the issue type in the plan:

.agents/skills/create-github-pr/SKILL.md

Lines changed: 9 additions & 0 deletions
@@ -15,6 +15,15 @@ Create pull requests on GitHub using the `gh` CLI.
 
 ## Before Creating a PR
 
+### Check Config Documentation
+
+If the branch changes gateway TOML parsing, `[openshell.gateway]` fields,
+`[openshell.drivers.<name>]` fields, driver config defaults, or Helm rendering
+of `gateway.toml`, verify that `docs/reference/gateway-config.mdx` is updated
+in the same branch. If the change affects user-facing compute-driver setup,
+also update `docs/reference/sandbox-compute-drivers.mdx` or the relevant
+deployment docs.
+
 ### Run Pre-commit Checks
 
 Run the local pre-commit task before opening a PR:

.agents/skills/create-spike/SKILL.md

Lines changed: 4 additions & 2 deletions
@@ -91,9 +91,11 @@ The prompt to the reviewer **must** instruct it to:
 
 9. **Check architecture docs** in the `architecture/` directory for relevant documentation about the affected subsystems.
 
-10. **Assess Linux Security Module (LSM) impact.** If the change involves process identity, `/proc` filesystem access, file labeling, binary execution, or inter-process visibility, call out whether it will behave differently on hosts running SELinux (enforcing) or AppArmor. For example: reading `/proc/<pid>/exe` across an SELinux domain boundary returns ENOENT, not EACCES. Tests that fork+exec into system binaries (different SELinux label) will fail on enforcing hosts. Flag any LSM-sensitive code paths and recommend mitigations.
+10. **Assess gateway config documentation impact.** If the change would add, remove, rename, or change defaults for gateway TOML keys or driver-specific config options, call out that `docs/reference/gateway-config.mdx` must be updated. If the change is surfaced through Helm or compute-driver setup docs, call out the relevant deployment or compute-driver docs too.
 
-11. **Determine the issue type:** `feat`, `fix`, `refactor`, `chore`, `perf`, or `docs`.
+11. **Assess Linux Security Module (LSM) impact.** If the change involves process identity, `/proc` filesystem access, file labeling, binary execution, or inter-process visibility, call out whether it will behave differently on hosts running SELinux (enforcing) or AppArmor. For example: reading `/proc/<pid>/exe` across an SELinux domain boundary returns ENOENT, not EACCES. Tests that fork+exec into system binaries (different SELinux label) will fail on enforcing hosts. Flag any LSM-sensitive code paths and recommend mitigations.
+
+12. **Determine the issue type:** `feat`, `fix`, `refactor`, `chore`, `perf`, or `docs`.
 
 ### What makes a good investigation prompt
 

.agents/skills/update-docs/SKILL.md

Lines changed: 5 additions & 1 deletion
@@ -34,7 +34,7 @@ git log -50 --oneline --no-merges
 Filter to commits that are likely to affect docs. Look for these signals:
 
 1. **Commit type**: `feat`, `fix`, `refactor`, `perf` commits often change behavior. `docs` commits are already doc changes. `chore`, `ci`, `test` commits rarely need doc updates.
-2. **Files changed**: Changes to `crates/openshell-cli/`, `python/`, `proto/`, `deploy/`, or policy-related code are high-signal.
+2. **Files changed**: Changes to `crates/openshell-cli/`, `python/`, `proto/`, `deploy/`, gateway config parsing, driver config structs, or policy-related code are high-signal.
 3. **Ignore**: Changes limited to `tests/`, `e2e/`, `.github/`, `tasks/`, or internal-only modules.
 
 ```bash
@@ -52,6 +52,10 @@ For each relevant commit, determine which doc page(s) it affects. Use this mappi
 | `crates/openshell-cli/` (sandbox commands) | `docs/sandboxes/manage-sandboxes.mdx` |
 | `crates/openshell-cli/` (provider commands) | `docs/sandboxes/manage-providers.mdx` |
 | `crates/openshell-cli/` (new top-level command) | May need a new page or `docs/reference/` entry |
+| `crates/openshell-server/src/config_file.rs` or gateway TOML parsing | `docs/reference/gateway-config.mdx` |
+| `crates/openshell-server/src/cli.rs` gateway config merge/default behavior | `docs/reference/gateway-config.mdx` |
+| `crates/openshell-driver-*/` config structs or driver defaults | `docs/reference/gateway-config.mdx`, `docs/reference/sandbox-compute-drivers.mdx` |
+| `deploy/helm/openshell/templates/gateway-config.yaml` | `docs/reference/gateway-config.mdx`, `docs/reference/sandbox-compute-drivers.mdx`, Helm docs if values change |
 | Proxy or policy code | `docs/sandboxes/policies.mdx`, `docs/reference/policy-schema.mdx` |
 | Inference code | `docs/inference/configure.mdx` |
 | `python/` (SDK changes) | `docs/reference/` or `docs/get-started/quickstart.mdx` |
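
Reviewer note: the four rows added to the mapping table can be read as a path-to-docs lookup. The following is a minimal sketch of that lookup, not code from this commit. The function name `docs_for_path` is hypothetical, and matching on the path alone is an assumption that over-approximates rows such as "config structs or driver defaults", which depend on what changed inside the file.

```rust
/// Hypothetical helper (not part of the repository): returns the docs pages
/// that the newly added mapping rows associate with a changed file path.
fn docs_for_path(path: &str) -> Vec<&'static str> {
    const GATEWAY: &str = "docs/reference/gateway-config.mdx";
    const DRIVERS: &str = "docs/reference/sandbox-compute-drivers.mdx";

    if path == "crates/openshell-server/src/config_file.rs"
        || path == "crates/openshell-server/src/cli.rs"
    {
        // Gateway TOML parsing and config merge/default behavior.
        vec![GATEWAY]
    } else if path.starts_with("crates/openshell-driver-")
        || path == "deploy/helm/openshell/templates/gateway-config.yaml"
    {
        // Driver config and Helm rendering affect both reference pages.
        vec![GATEWAY, DRIVERS]
    } else {
        // Rows that existed before this commit are out of scope for the sketch.
        Vec::new()
    }
}

fn main() {
    // The driver crate name below is a made-up example path.
    for path in [
        "crates/openshell-server/src/config_file.rs",
        "crates/openshell-driver-example/src/lib.rs",
        "tests/smoke.rs",
    ] {
        println!("{} -> {:?}", path, docs_for_path(path));
    }
}
```

The Helm row in the table also mentions "Helm docs if values change"; that condition needs a look at the diff content and is left out of the sketch.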

.github/actions/release-helm-oci/action.yml

Lines changed: 8 additions & 0 deletions
@@ -71,6 +71,14 @@ runs:
 exit 1
 fi
 
+- name: Build chart dependencies
+env:
+CHART_DIR: ${{ steps.prep.outputs.chart_dir }}
+shell: bash
+run: |
+set -euo pipefail
+helm dependency build "${CHART_DIR}"
+
 - name: Package Helm chart
 env:
 CHART_DIR: ${{ steps.prep.outputs.chart_dir }}

.github/workflows/e2e-test.yml

Lines changed: 40 additions & 2 deletions
@@ -40,6 +40,10 @@ jobs:
 - suite: rust-podman
 cmd: "mise run --no-deps --skip-deps e2e:podman"
 apt_packages: "openssh-client podman"
+- suite: rust-podman-rootless
+cmd: "mise run --no-deps --skip-deps e2e:podman:rootless"
+apt_packages: "openssh-client podman uidmap"
+rootless: true
 container:
 image: ghcr.io/nvidia/openshell/ci:latest
 credentials:
@@ -72,14 +76,48 @@ jobs:
 run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin
 
 - name: Log in to GHCR with Podman
-if: matrix.suite == 'rust-podman'
+if: startsWith(matrix.suite, 'rust-podman')
 run: echo "${{ secrets.GITHUB_TOKEN }}" | podman login ghcr.io -u "${{ github.actor }}" --password-stdin
 
+- name: Set up rootless Podman user
+if: matrix.rootless
+run: |
+useradd -m openshell-test
+echo "openshell-test:100000:65536" >> /etc/subuid
+echo "openshell-test:100000:65536" >> /etc/subgid
+mkdir -p "/run/user/$(id -u openshell-test)"
+chown openshell-test: "/run/user/$(id -u openshell-test)"
+chmod 700 "/run/user/$(id -u openshell-test)"
+chown -R openshell-test: .
+for dir in /root/.cargo /root/.rustup /root/.local/share/mise /opt/mise; do
+[ -d "$dir" ] && chmod -R a+rX "$dir"
+done
+
 - name: Install Python dependencies and generate protobuf stubs
 if: matrix.suite == 'python'
 run: uv sync --frozen && mise run --no-deps python:proto
 
 - name: Run tests
 env:
 OPENSHELL_SUPERVISOR_IMAGE: ${{ format('ghcr.io/nvidia/openshell/supervisor:{0}', inputs.image-tag) }}
-run: ${{ matrix.cmd }}
+E2E_CMD: ${{ matrix.cmd }}
+run: |
+if [ "${{ matrix.rootless }}" = "true" ]; then
+TESTUID="$(id -u openshell-test)"
+runuser -u openshell-test -- env \
+XDG_RUNTIME_DIR="/run/user/${TESTUID}" \
+HOME="/home/openshell-test" \
+PATH="/root/.cargo/bin:/opt/mise/shims:/opt/mise/bin:${PATH}" \
+CARGO_HOME="/root/.cargo" \
+RUSTUP_HOME="/root/.rustup" \
+OPENSHELL_SUPERVISOR_IMAGE="${OPENSHELL_SUPERVISOR_IMAGE}" \
+OPENSHELL_REGISTRY="${OPENSHELL_REGISTRY}" \
+OPENSHELL_REGISTRY_HOST="${OPENSHELL_REGISTRY_HOST}" \
+OPENSHELL_REGISTRY_USERNAME="${OPENSHELL_REGISTRY_USERNAME}" \
+OPENSHELL_REGISTRY_PASSWORD="${OPENSHELL_REGISTRY_PASSWORD}" \
+IMAGE_TAG="${IMAGE_TAG}" \
+MISE_GITHUB_TOKEN="${MISE_GITHUB_TOKEN}" \
+bash -c "${E2E_CMD}"
+else
+${E2E_CMD}
+fi

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -187,6 +187,9 @@ rootfs/
 # Docker build artifacts (image tarballs, packaged helm charts)
 deploy/docker/.build/
 
+# Helm subchart tarballs (regenerated by `helm dependency build`)
+deploy/helm/openshell/charts/
+
 # SBOM generated output (JSON, CSV) — release artifacts, not committed
 deploy/sbom/output/
 

.python-version

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-3.13.12
+3.14.5

AGENTS.md

Lines changed: 1 addition & 0 deletions
@@ -190,6 +190,7 @@ ocsf_emit!(event);
 
 - When making changes, update the relevant documentation in the `architecture/` directory.
 - When changes affect user-facing behavior, update the relevant published docs pages under `docs/` and navigation in `docs/index.yml`.
+- When changing gateway TOML fields, driver-specific config options, config defaults, or Helm rendering of `gateway.toml`, update `docs/reference/gateway-config.mdx` in the same branch.
 - `fern/` contains the Fern site config, components, preview workflow inputs, and publish settings.
 - Follow the docs style guide in [docs/CONTRIBUTING.mdx](docs/CONTRIBUTING.mdx): active voice, minimal formatting, no filler introductions, `shell` fences for copyable commands, and no duplicate body H1.
 - Fern PR previews run through `.github/workflows/branch-docs.yml`, and production publish runs through the `publish-fern-docs` job in `.github/workflows/release-tag.yml`.

crates/openshell-cli/src/run.rs

Lines changed: 15 additions & 17 deletions
@@ -1888,11 +1888,11 @@ pub async fn sandbox_create(
 .into_diagnostic()?
 .into_inner();
 
-let mut last_phase = sandbox.phase;
+let mut last_phase = sandbox.phase();
 let mut last_error_reason = String::new();
 let mut last_condition_message = ready_false_condition_message(sandbox.status.as_ref());
 // Track whether we have seen a non-Ready phase during the watch.
-let mut saw_non_ready = SandboxPhase::try_from(sandbox.phase) != Ok(SandboxPhase::Ready);
+let mut saw_non_ready = SandboxPhase::try_from(sandbox.phase()) != Ok(SandboxPhase::Ready);
 let provision_timeout = Duration::from_secs(
 std::env::var("OPENSHELL_PROVISION_TIMEOUT")
 .ok()
@@ -1945,8 +1945,8 @@ pub async fn sandbox_create(
 let evt = item.into_diagnostic()?;
 match evt.payload {
 Some(openshell_core::proto::sandbox_stream_event::Payload::Sandbox(s)) => {
-let phase = SandboxPhase::try_from(s.phase).unwrap_or(SandboxPhase::Unknown);
-last_phase = s.phase;
+let phase = SandboxPhase::try_from(s.phase()).unwrap_or(SandboxPhase::Unknown);
+last_phase = s.phase();
 if let Some(message) = ready_false_condition_message(s.status.as_ref()) {
 last_condition_message = Some(message);
 }
@@ -2507,7 +2507,7 @@ pub async fn sandbox_get(
 };
 println!(" {} {}", "Id:".dimmed(), id);
 println!(" {} {}", "Name:".dimmed(), name);
-println!(" {} {}", "Phase:".dimmed(), phase_name(sandbox.phase));
+println!(" {} {}", "Phase:".dimmed(), phase_name(sandbox.phase()));
 println!(
 " {} {}",
 "Resource version:".dimmed(),
@@ -2591,11 +2591,11 @@ pub async fn sandbox_exec_grpc(
 .ok_or_else(|| miette::miette!("sandbox not found"))?;
 
 // Verify the sandbox is ready before issuing the exec.
-if SandboxPhase::try_from(sandbox.phase) != Ok(SandboxPhase::Ready) {
+if SandboxPhase::try_from(sandbox.phase()) != Ok(SandboxPhase::Ready) {
 return Err(miette::miette!(
 "sandbox '{}' is not ready (phase: {}); wait for it to reach Ready state",
 name,
-phase_name(sandbox.phase)
+phase_name(sandbox.phase())
 ));
 }
 
@@ -2803,11 +2803,11 @@ async fn fetch_ready_sandbox_for_forward(
 .sandbox
 .ok_or_else(|| miette::miette!("sandbox '{name}' not found"))?;
 
-if SandboxPhase::try_from(sandbox.phase) != Ok(SandboxPhase::Ready) {
+if SandboxPhase::try_from(sandbox.phase()) != Ok(SandboxPhase::Ready) {
 return Err(miette::miette!(
 "sandbox '{}' is no longer ready (phase: {}); stopping service forward",
 name,
-phase_name(sandbox.phase)
+phase_name(sandbox.phase())
 ));
 }
 
@@ -3251,8 +3251,8 @@ pub async fn sandbox_list(
 
 // Print rows
 for sandbox in sandboxes {
-let phase = phase_name(sandbox.phase);
-let phase_colored = match SandboxPhase::try_from(sandbox.phase) {
+let phase = phase_name(sandbox.phase());
+let phase_colored = match SandboxPhase::try_from(sandbox.phase()) {
 Ok(SandboxPhase::Ready) => phase.green().to_string(),
 Ok(SandboxPhase::Error) => phase.red().to_string(),
 Ok(SandboxPhase::Provisioning) => phase.yellow().to_string(),
@@ -3280,8 +3280,8 @@ fn sandbox_to_json(sandbox: &Sandbox) -> serde_json::Value {
 "labels": labels,
 "resource_version": meta.map_or(0, |m| m.resource_version),
 "created_at": format_epoch_ms(meta.map_or(0, |m| m.created_at_ms)),
-"phase": phase_name(sandbox.phase),
-"current_policy_version": sandbox.current_policy_version,
+"phase": phase_name(sandbox.phase()),
+"current_policy_version": sandbox.current_policy_version(),
 })
 }
 
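
Reviewer note: the hunks above replace direct reads of `sandbox.phase` and `sandbox.current_policy_version` with accessor calls. The diff does not show the type change behind this. One common reason is that a generated message field became optional and is now read through a getter that falls back to a default. The sketch below illustrates that pattern under that assumption only. The `Sandbox` and `SandboxPhase` definitions, the numeric enum values, and `is_ready` are hypothetical stand-ins, not the project's generated code.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for the generated phase enum; the variant set and
// the numeric values are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxPhase {
    Unknown = 0,
    Provisioning = 1,
    Ready = 2,
}

impl TryFrom<i32> for SandboxPhase {
    type Error = i32;

    // Converts a raw wire value, handing back the raw value when unknown.
    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(SandboxPhase::Unknown),
            1 => Ok(SandboxPhase::Provisioning),
            2 => Ok(SandboxPhase::Ready),
            other => Err(other),
        }
    }
}

// Hypothetical message type: the field is optional, so callers go through
// an accessor instead of reading the field directly.
#[derive(Debug, Default)]
struct Sandbox {
    phase: Option<i32>,
}

impl Sandbox {
    // Returns the raw phase value, or 0 when the field is unset.
    fn phase(&self) -> i32 {
        self.phase.unwrap_or(0)
    }
}

// Mirrors the readiness check used at the call sites in the diff.
fn is_ready(sandbox: &Sandbox) -> bool {
    SandboxPhase::try_from(sandbox.phase()) == Ok(SandboxPhase::Ready)
}

fn main() {
    let unset = Sandbox::default();
    let provisioning = Sandbox { phase: Some(1) };
    let ready = Sandbox { phase: Some(2) };
    println!("unset ready? {}", is_ready(&unset));
    println!("provisioning ready? {}", is_ready(&provisioning));
    println!("ready ready? {}", is_ready(&ready));
}
```

With an accessor in place, call sites keep the same `SandboxPhase::try_from(...)` comparison and only the read of the field changes, which matches the shape of the edits in this file.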
@@ -7727,15 +7727,14 @@ mod tests {
 let status = SandboxStatus {
 sandbox_name: "gpu".to_string(),
 agent_pod: "gpu-pod".to_string(),
-agent_fd: String::new(),
-sandbox_fd: String::new(),
 conditions: vec![SandboxCondition {
 r#type: "Ready".to_string(),
 status: "False".to_string(),
 reason: "Unschedulable".to_string(),
 message: "Another GPU sandbox may already be using the available GPU.".to_string(),
 last_transition_time: String::new(),
 }],
+..Default::default()
 };
 
 assert_eq!(
@@ -7749,15 +7748,14 @@ mod tests {
 let status = SandboxStatus {
 sandbox_name: "gpu".to_string(),
 agent_pod: "gpu-pod".to_string(),
-agent_fd: String::new(),
-sandbox_fd: String::new(),
 conditions: vec![SandboxCondition {
 r#type: "Scheduled".to_string(),
 status: "True".to_string(),
 reason: "Scheduled".to_string(),
 message: "Sandbox scheduled".to_string(),
 last_transition_time: String::new(),
 }],
+..Default::default()
 };
 
 assert!(ready_false_condition_message(Some(&status)).is_none());
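
Reviewer note: the two test hunks above drop the explicit `agent_fd` and `sandbox_fd` initializers and end the struct literal with `..Default::default()`. This is Rust's struct update syntax: any field not named in the literal takes its value from the `Default` implementation, so the tests no longer need to list every field. A minimal sketch follows, using simplified hypothetical versions of the two structs. The `extra` field is invented to stand in for whichever fields the real type has beyond those named in the tests.

```rust
// Simplified, hypothetical versions of the types used in the tests.
#[derive(Debug, Default, Clone, PartialEq)]
struct SandboxCondition {
    r#type: String,
    status: String,
    reason: String,
}

#[derive(Debug, Default, Clone, PartialEq)]
struct SandboxStatus {
    sandbox_name: String,
    agent_pod: String,
    conditions: Vec<SandboxCondition>,
    // Invented field: stands in for any field the literal does not name.
    extra: String,
}

// Builds a status the way the updated tests do: name the fields the test
// cares about and let Default fill in the rest.
fn gpu_status() -> SandboxStatus {
    SandboxStatus {
        sandbox_name: "gpu".to_string(),
        agent_pod: "gpu-pod".to_string(),
        conditions: vec![SandboxCondition {
            r#type: "Ready".to_string(),
            status: "False".to_string(),
            reason: "Unschedulable".to_string(),
        }],
        ..Default::default()
    }
}

fn main() {
    let status = gpu_status();
    println!("{:?}", status);
}
```

The design benefit is that adding or removing an unrelated field on the struct does not force an edit to every test literal, as long as the type implements `Default`.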
