fix(sandbox): restore GPU filesystem baseline #1522
Open
elezar wants to merge 2 commits into
Base automatically changed from fix/1486-gpu-enrichment-no-network/elezar to main May 27, 2026 08:20
Signed-off-by: Evan Lezar <elezar@nvidia.com>
96a1caa to 59e399a
Keep /proc out of the GPU filesystem baseline and allow only Landlock WriteFile access on procfs for GPU sandboxes. This lets CUDA update /proc/<pid>/task/<tid>/comm without promoting procfs to read-write in the policy.

A more restrictive rule rooted at /proc/self/task is insufficient because CUDA workloads can spawn descendant processes after Landlock is enforced, and those descendants resolve /proc/self to their own process-specific subtree.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
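The descendant-process argument in the commit message above can be illustrated with a small model. This is only a sketch, not the code in this PR: `PathBeneathRule`, `comm_path`, and the PIDs used are hypothetical, and the model assumes that a rule's root is resolved once, when the rule is created, so a `/proc/self` prefix is pinned to the creating process.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical model of a path-beneath rule: it grants access to everything
/// under `root`, where `root` was resolved once, when the rule was created.
struct PathBeneathRule {
    root: PathBuf,
}

impl PathBeneathRule {
    /// Model a rule created by the process `creator_pid` for `path`.
    /// A leading `/proc/self` is resolved to `/proc/<creator_pid>` here,
    /// standing in for the symlink being followed at rule-creation time.
    fn new(path: &str, creator_pid: u32) -> Self {
        let root = match path.strip_prefix("/proc/self") {
            Some(rest) if rest.is_empty() || rest.starts_with('/') => {
                PathBuf::from(format!("/proc/{creator_pid}{rest}"))
            }
            _ => PathBuf::from(path),
        };
        Self { root }
    }

    /// True if `target` lies at or beneath the rule's resolved root.
    fn covers(&self, target: &Path) -> bool {
        target.starts_with(&self.root)
    }
}

/// Path of the per-thread `comm` file for the given process and thread.
fn comm_path(pid: u32, tid: u32) -> PathBuf {
    PathBuf::from(format!("/proc/{pid}/task/{tid}/comm"))
}
```

In this model, a rule built from `/proc/self/task` by process 100 covers `comm_path(100, 100)` but not `comm_path(200, 201)`, the file a descendant with PID 200 would write to. A rule rooted at `/proc` covers both, which is the trade-off the commit message describes.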
12bde4d to d73e6de
Summary
Restore CUDA GPU filesystem access for Docker-backed GPU sandboxes without promoting all of `/proc` to full read-write policy access.

This keeps the GPU device-node baseline from #1524, but handles CUDA procfs thread-name writes with a GPU-only Landlock `WriteFile` exception on procfs. Non-GPU sandboxes do not receive this exception.

Related Issue
Fixes #1486
Builds on the GPU no-network enrichment fix merged in #1524.
The no-network enrichment regression is handled in #1524 and was introduced by #158. This PR addresses the follow-up GPU procfs baseline regression introduced by #910, where explicit default read-only paths prevented GPU-required baseline handling.
The GPU workload test images used for validation come from #1484.
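As a rough illustration of the approach described in the Summary, the sketch below models how a GPU-only write exception can sit alongside a policy in which `/proc` stays read-only. All names here (`FilesystemPolicy`, `Access`, `enforcement_rules`) are hypothetical and simplified; they are not the types used in openshell-sandbox or in the Landlock bindings.

```rust
/// Hypothetical access levels, loosely modelled on the rights discussed in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Access {
    Read,
    ReadWrite,
    /// Write to existing files only; no create, remove, or rename.
    WriteFile,
}

/// Hypothetical stand-in for the sandbox filesystem policy.
struct FilesystemPolicy {
    read_only: Vec<String>,
    read_write: Vec<String>,
}

/// Build the rules applied at enforcement time.
/// The policy lists are copied through unchanged; the procfs write
/// exception is added on top, and only when GPU devices are present.
fn enforcement_rules(policy: &FilesystemPolicy, gpu_present: bool) -> Vec<(String, Access)> {
    let mut rules: Vec<(String, Access)> = Vec::new();
    for path in &policy.read_only {
        rules.push((path.clone(), Access::Read));
    }
    for path in &policy.read_write {
        rules.push((path.clone(), Access::ReadWrite));
    }
    if gpu_present {
        rules.push(("/proc".to_string(), Access::WriteFile));
    }
    rules
}
```

With `/proc` listed under `read_only`, calling `enforcement_rules(&policy, true)` adds a single `("/proc", Access::WriteFile)` entry, while `enforcement_rules(&policy, false)` adds none, and `policy.read_write` is never modified in either case.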
Changes
- [...] `/proc` into `filesystem_policy.read_write`; `/proc` can remain read-only in the policy.
- [...] `AccessFs::WriteFile` under `/proc`, and only when GPU devices are present in the sandbox.
- [...] `deviceQuery`.

Testing
- `/home/elezar/.local/bin/mise run pre-commit`
- `/home/elezar/.cargo/bin/cargo test -p openshell-sandbox --lib baseline_tests -- --nocapture`
- `/home/elezar/.cargo/bin/cargo test -p openshell-sandbox --lib landlock::tests -- --nocapture`
- `/home/elezar/.cargo/bin/cargo clippy -p openshell-sandbox --lib --tests -- -D warnings`
- `docker run --rm --device nvidia.com/gpu=all localhost/openshell/gpu-workload-cuda-basic:bdaa08fb-dirty` passed with `OPENSHELL_GPU_WORKLOAD_SUCCESS cuda-basic`
- `openshell sandbox create --no-keep --from localhost/openshell/gpu-workload-cuda-basic:bdaa08fb-dirty --gpu -- /usr/local/bin/openshell-gpu-workload` passed with `OPENSHELL_GPU_WORKLOAD_SUCCESS cuda-basic`
- [...] `/proc/self/task` prototype failed the same sandbox workload with `cudaGetDeviceCount returned 304`, confirming the need to cover descendant CUDA processes.

Checklist