fix(sandbox): restore GPU filesystem baseline by elezar · Pull Request #1522 · NVIDIA/OpenShell

elezar · 2026-05-22T13:47:40Z

Summary

Restore CUDA GPU filesystem access for Docker-backed GPU sandboxes without promoting all of /proc to full read-write policy access.

This keeps the GPU device-node baseline from #1524, but handles CUDA procfs thread-name writes with a GPU-only Landlock WriteFile exception on procfs. Non-GPU sandboxes do not receive this exception.

Related Issue

Fixes #1486

Builds on the GPU no-network enrichment fix merged in #1524.
The no-network enrichment regression, introduced by #158, is handled in #1524. This PR addresses the follow-up GPU procfs baseline regression introduced by #910, where explicitly listed default read-only paths prevented the GPU-required baseline from being applied.
The GPU workload test images used for validation come from #1484.

Changes

Keep injected NVIDIA/WSL GPU device nodes in the GPU read-write baseline.
Stop promoting /proc into filesystem_policy.read_write; /proc can remain read-only in the policy.
Add a Linux Landlock runtime exception that grants only AccessFs::WriteFile under /proc, and only when GPU devices are present in the sandbox.
Cover descendant CUDA processes, such as a shell workload script that later starts deviceQuery.
Preserve custom-policy conflicts for injected GPU device nodes that are incorrectly kept read-only.
Update GPU sandbox policy documentation to describe the narrower procfs behavior.
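
The baseline decisions listed above can be sketched as a small pure function. This is a minimal illustration, not the code in this PR: `FilesystemPolicy`, `GpuBaseline`, and `apply_gpu_baseline` are hypothetical names, the policy is reduced to two path sets, and reporting a read-only GPU device node as an `Err` is an assumption about how the conflict surfaces. The device paths used in examples (`/dev/nvidia0`, `/dev/nvidiactl`) are only sample values.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for a sandbox filesystem policy.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct FilesystemPolicy {
    pub read_only: BTreeSet<String>,
    pub read_write: BTreeSet<String>,
}

/// Hypothetical result of applying the GPU baseline to a policy.
#[derive(Debug, PartialEq)]
pub struct GpuBaseline {
    pub policy: FilesystemPolicy,
    /// True when the runtime should add a WriteFile-only Landlock rule under /proc.
    pub procfs_write_file_exception: bool,
}

/// Applies the GPU baseline described in the Changes list (illustrative only).
pub fn apply_gpu_baseline(
    mut policy: FilesystemPolicy,
    gpu_device_nodes: &[&str],
) -> Result<GpuBaseline, String> {
    for node in gpu_device_nodes {
        // Assumption: a custom policy that keeps an injected GPU device node
        // read-only is reported as a conflict instead of being overridden.
        if policy.read_only.contains(*node) {
            return Err(format!("GPU device node {} must not be read-only", node));
        }
        // Injected GPU device nodes belong to the read-write baseline.
        policy.read_write.insert((*node).to_string());
    }

    // /proc is deliberately left untouched: it is never promoted to read_write.
    // The narrower procfs write access is expressed as a separate runtime flag
    // that is set only when GPU devices are present.
    Ok(GpuBaseline {
        policy,
        procfs_write_file_exception: !gpu_device_nodes.is_empty(),
    })
}
```

Calling `apply_gpu_baseline` with an empty device list leaves the policy unchanged and the exception flag unset, which mirrors the statement that non-GPU sandboxes do not receive the procfs exception.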

Testing

/home/elezar/.local/bin/mise run pre-commit
/home/elezar/.cargo/bin/cargo test -p openshell-sandbox --lib baseline_tests -- --nocapture
/home/elezar/.cargo/bin/cargo test -p openshell-sandbox --lib landlock::tests -- --nocapture
/home/elezar/.cargo/bin/cargo clippy -p openshell-sandbox --lib --tests -- -D warnings
Plain Docker control: docker run --rm --device nvidia.com/gpu=all localhost/openshell/gpu-workload-cuda-basic:bdaa08fb-dirty passed with OPENSHELL_GPU_WORKLOAD_SUCCESS cuda-basic
Docker-backed OpenShell sandbox: openshell sandbox create --no-keep --from localhost/openshell/gpu-workload-cuda-basic:bdaa08fb-dirty --gpu -- /usr/local/bin/openshell-gpu-workload passed with OPENSHELL_GPU_WORKLOAD_SUCCESS cuda-basic
A narrower /proc/self/task prototype failed the same sandbox workload with cudaGetDeviceCount returned 304, confirming the need to cover descendant CUDA processes.

Checklist

Follows Conventional Commits
Commits are signed off (DCO)
Architecture/docs updated (if applicable)

github-actions · 2026-05-22T13:48:05Z

🌿 Preview your docs: https://nvidia-preview-pr-1522.docs.buildwithfern.com/openshell


Keep /proc out of the GPU filesystem baseline and allow only Landlock WriteFile access on procfs for GPU sandboxes. This lets CUDA update /proc/<pid>/task/<tid>/comm without promoting procfs to read-write in the policy.

A more restrictive rule rooted at /proc/self/task is insufficient because CUDA workloads can spawn descendant processes after Landlock is enforced, and those descendants resolve /proc/self to their own process-specific subtree.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
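
The reasoning about /proc/self/task can be illustrated with a simplified path model. This sketch is an assumption-laden approximation: real Landlock rules are bound to the directory opened when the ruleset is built, not to string prefixes, and `comm_path`, `resolved_self_task`, and `rule_covers` are hypothetical helpers that do not appear in this PR. The PIDs are arbitrary sample values.

```rust
/// Path of the per-thread name file that CUDA writes to, per the commit message.
pub fn comm_path(pid: u32, tid: u32) -> String {
    format!("/proc/{}/task/{}/comm", pid, tid)
}

/// What /proc/self/task refers to for the process that builds the ruleset.
pub fn resolved_self_task(pid: u32) -> String {
    format!("/proc/{}/task", pid)
}

/// Simplified model of a "path beneath" rule: the rule covers a path when the
/// path is the rule root itself or lies inside the root directory.
pub fn rule_covers(rule_root: &str, path: &str) -> bool {
    let root = rule_root.trim_end_matches('/');
    path == root
        || path
            .strip_prefix(root)
            .map_or(false, |rest| rest.starts_with('/'))
}
```

In this model, a rule built from `resolved_self_task(100)` covers `comm_path(100, 100)` but not `comm_path(101, 101)`, the path a descendant with PID 101 would write to. A rule rooted at `/proc` covers both, which is why the exception is scoped to all of procfs while still being limited to the WriteFile access right.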

elezar requested review from a team, derekwaynecarr, maxamillion and mrunalp as code owners May 22, 2026 13:47

elezar mentioned this pull request May 22, 2026

fix(sandbox): decouple GPU baseline from network policy #1524

Merged

6 tasks

elezar changed the base branch from main to fix/1486-gpu-enrichment-no-network/elezar May 22, 2026 14:06

Base automatically changed from fix/1486-gpu-enrichment-no-network/elezar to main May 27, 2026 08:20

fix(sandbox): restore GPU proc baseline

59e399a

Signed-off-by: Evan Lezar <elezar@nvidia.com>

elezar force-pushed the fix/1486-gpu-sandbox-filesystem-policy/elezar branch from 96a1caa to 59e399a Compare May 27, 2026 09:02

elezar mentioned this pull request May 28, 2026

feat(gpu): derive sandbox access requirements from CDI specs #1606

Open

17 tasks

elezar force-pushed the fix/1486-gpu-sandbox-filesystem-policy/elezar branch from 12bde4d to d73e6de Compare May 28, 2026 19:22

elezar wants to merge 2 commits into main from fix/1486-gpu-sandbox-filesystem-policy/elezar
