fix: add device rules for AllowDevice.conf when running in legacy mode #1553
Description
This PR fixes an issue where containers using the NVIDIA runtime in legacy mode fail after a `systemctl daemon-reload` is performed on the host.
The root cause is that the `devices.allow` cgroup file is not being updated with NVIDIA character device rules when containers are created in legacy mode. As a result, `nvidia-smi` and other tools fail inside the container.
Changes
- Device rules are added according to the `NVIDIA_VISIBLE_DEVICES` environment variable:
  - Control devices: `nvidiactl`, `nvidia-uvm`, `nvidia-uvm-tools`, `nvidia-modeset`
  - GPU devices: `/dev/nvidia0`, `/dev/nvidia1`, etc. (based on requested GPUs)
- Handles `NVIDIA_VISIBLE_DEVICES=all`, specific indices (`0,1,2`), and UUIDs
Related Issue(s)
Fixes opencontainers/runc#4859
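
The device selection listed under Changes can be illustrated with a minimal Go sketch. It is not the code in this PR: `devicePaths` and its `gpuCount` parameter are hypothetical names introduced here, and the sketch assumes GPU indices map directly to `/dev/nvidiaN`. UUID values are rejected because resolving them requires querying the driver, which is outside the scope of this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// controlDevices are the control device nodes named in the PR description.
var controlDevices = []string{
	"/dev/nvidiactl",
	"/dev/nvidia-uvm",
	"/dev/nvidia-uvm-tools",
	"/dev/nvidia-modeset",
}

// devicePaths is a hypothetical helper (not part of the toolkit). It turns a
// NVIDIA_VISIBLE_DEVICES value into the device nodes that would need allow
// rules. gpuCount is an assumed input giving the number of GPUs on the host.
func devicePaths(visible string, gpuCount int) ([]string, error) {
	visible = strings.TrimSpace(visible)
	if visible == "" {
		// No GPUs requested: no device rules needed.
		return nil, nil
	}

	// Copy the control devices so the package-level slice is never modified.
	paths := append([]string{}, controlDevices...)

	if visible == "all" {
		for i := 0; i < gpuCount; i++ {
			paths = append(paths, fmt.Sprintf("/dev/nvidia%d", i))
		}
		return paths, nil
	}

	for _, field := range strings.Split(visible, ",") {
		field = strings.TrimSpace(field)
		idx, err := strconv.Atoi(field)
		if err != nil {
			// UUIDs and other non-index values are not resolved in this sketch.
			return nil, fmt.Errorf("unsupported device identifier %q", field)
		}
		if idx < 0 || idx >= gpuCount {
			return nil, fmt.Errorf("GPU index %d out of range (host has %d GPUs)", idx, gpuCount)
		}
		paths = append(paths, fmt.Sprintf("/dev/nvidia%d", idx))
	}
	return paths, nil
}

func main() {
	paths, err := devicePaths("0,1", 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Each returned path would then be translated into a character device allow rule for the container; how that rule is written to the cgroup is left to the runtime and is not shown here.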
How To Test
Reproducing the Issue (Before Fix)
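1. Install the latest `nvidia-container-toolkit` version.
2. Update `/etc/nvidia-container-runtime/config.toml` to use legacy mode.
3. Start a GPU container (use either `-e NVIDIA_VISIBLE_DEVICES=0` or `--gpus=all`):
       docker run -d --runtime=nvidia \
         -e NVIDIA_VISIBLE_DEVICES=0 \
         --name gpu-test \
         nvidia/cuda:12.2.0-base-ubuntu22.04 sleep 30000
4. Verify that `nvidia-smi` works in the container.
5. Run `systemctl daemon-reload` on the host.
6. Run `nvidia-smi` again in the container. Before the fix, it fails.
Testing the Fix
1. Start a GPU container (use either `-e NVIDIA_VISIBLE_DEVICES=0` or `--gpus=all`):
       docker run -d --runtime=nvidia \
         -e NVIDIA_VISIBLE_DEVICES=0 \
         --name gpu-test-fixed \
         nvidia/cuda:12.2.0-base-ubuntu22.04 sleep 30000
2. Verify that `nvidia-smi` works in the container.
3. Run `systemctl daemon-reload` in another terminal.
4. Run `nvidia-smi` again in the container. With the fix, it should still work.
Rollback (If Needed)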