nvidia-container-toolkit: reduce missing-resource warnings #842

Open
arnaldo2792 wants to merge 1 commit into develop from silence-nvidia-ctk/core-kit

@arnaldo2792
Contributor

Issue number:

Closes #827

Description of changes:

The CDI spec generator logs warnings for missing host libraries/binaries. These warnings are misleading to downstreams, as they suggest initialization failures when none occurred.

The tool offers a --quiet flag, but it suppresses everything below error level, which also hides useful info-level logs needed for debugging.

Instead of using --quiet, downgrade the noisy missing-library warnings to debug level. This preserves info logs for diagnostics while eliminating the misleading warnings.
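
The difference between the two approaches comes down to how a level threshold filters messages. The following is a minimal Python sketch of that behaviour using the standard `logging` module. It is not the toolkit's code (nvidia-ctk itself is written in Go); the function name `emit`, the logger name, and both message strings are made up for illustration, and it assumes the default threshold is info.

```python
import io
import logging


def emit(threshold: int, missing_level: int) -> list[str]:
    """Return the lines that pass a logger with the given threshold.

    One illustrative info message and one illustrative "missing resource"
    message (logged at `missing_level`) are sent through the logger.
    """
    stream = io.StringIO()
    logger = logging.Logger("cdi-sketch")  # hypothetical standalone logger
    logger.setLevel(threshold)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("level=%(levelname)s msg=%(message)s"))
    logger.addHandler(handler)

    logger.info("illustrative info message")
    logger.log(missing_level, "illustrative missing-resource message")
    return stream.getvalue().splitlines()


# Current behaviour: info threshold, missing resources logged as warnings.
current = emit(logging.INFO, logging.WARNING)
# --quiet-like behaviour: an error threshold hides the warning and the info line.
quiet = emit(logging.ERROR, logging.WARNING)
# Approach in this PR: keep the info threshold, log missing resources at debug.
downgraded = emit(logging.INFO, logging.DEBUG)

for name, lines in [("current", current), ("quiet", quiet), ("downgraded", downgraded)]:
    print(name, lines)
```

Only the third configuration keeps the info line while dropping the missing-resource message, which is the trade-off the description argues for.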

Testing done:

Before:

bash-5.2# journalctl -u generate-cdi-specs.service | grep warn
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Failed to evaluate symlink /dev/dri/by-path/pci-0000:00:1e.0-card; ignoring"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Failed to evaluate symlink /dev/dri/by-path/pci-0000:00:1e.0-render; ignoring"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate libnvidia-egl-gbm.so.*.*: pattern libnvidia-egl-gbm.so.*.* not found\nlibnvidia-egl-gbm.so.*.*: not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate libnvidia-egl-wayland.so.*.*: pattern libnvidia-egl-wayland.so.*.* not found\nlibnvidia-egl-wayland.so.*.*: not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate libnvidia-vulkan-producer.so.580.126.09: pattern libnvidia-vulkan-producer.so.580.126.09 not found\nlibnvidia-vulkan-producer.so.580.126.09: not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate nvidia_drv.so: pattern nvidia_drv.so not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate libglxserver_nvidia.so.580.126.09: pattern libglxserver_nvidia.so.580.126.09 not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate X11/xorg.conf.d/10-nvidia.conf: pattern X11/xorg.conf.d/10-nvidia.conf not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate X11/xorg.conf.d/nvidia-drm-outputclass.conf: pattern X11/xorg.conf.d/nvidia-drm-outputclass.conf not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate /nvidia-fabricmanager/socket: pattern /nvidia-fabricmanager/socket not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate /tmp/nvidia-mps: pattern /tmp/nvidia-mps not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate nvidia-imex: pattern nvidia-imex not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate nvidia-imex-ctl: pattern nvidia-imex-ctl not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate nvidia_drv.so: pattern nvidia_drv.so not found"
Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate libglxserver_nvidia.so.580.126.09: pattern libglxserver_nvidia.so.580.126.09 not found"
bash-5.2#

After:

bash-5.2# journalctl -u generate-cdi-specs.service | grep warning
Feb 25 02:11:28 ip-192-168-25-16.us-west-2.compute.internal nvidia-ctk[2797]: time="2026-02-25T02:11:28Z" level=warning msg="Failed to evaluate symlink /dev/dri/by-path/pci-0000:31:00.0-card; ignoring"
Feb 25 02:11:28 ip-192-168-25-16.us-west-2.compute.internal nvidia-ctk[2797]: time="2026-02-25T02:11:28Z" level=warning msg="Failed to evaluate symlink /dev/dri/by-path/pci-0000:31:00.0-render; ignoring"
bash-5.2#
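
To compare runs like the ones above without reading every line, the warnings can be tallied by kind. The sketch below is one way to do that in Python. The function name `summarize_warnings`, the category labels, and the regular expression are assumptions for illustration; it relies only on the `level=... msg="..."` layout and the "Could not locate" prefix visible in the output above.

```python
import re
from collections import Counter

# Matches the level and the quoted message of one nvidia-ctk log line.
LEVEL_RE = re.compile(r'level=(\w+) msg="((?:[^"\\]|\\.)*)"')


def summarize_warnings(lines):
    """Count warning-level lines, split into missing-resource and other."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match is None or match.group(1) != "warning":
            continue  # not a warning-level line
        message = match.group(2)
        if message.startswith("Could not locate"):
            counts["missing-resource"] += 1
        else:
            counts["other"] += 1
    return counts


# Two lines copied from the journal output above.
sample = [
    r'Feb 25 02:22:21 ip-172-31-12-237.us-west-2.compute.internal nvidia-ctk[1774]: time="2026-02-25T02:22:21Z" level=warning msg="Could not locate nvidia-imex: pattern nvidia-imex not found"',
    r'Feb 25 02:11:28 ip-192-168-25-16.us-west-2.compute.internal nvidia-ctk[2797]: time="2026-02-25T02:11:28Z" level=warning msg="Failed to evaluate symlink /dev/dri/by-path/pci-0000:31:00.0-card; ignoring"',
]
counts = summarize_warnings(sample)
print(dict(counts))
```

On a real host, the input could be the lines of `journalctl -u generate-cdi-specs.service` read from standard input.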

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

Signed-off-by: Arnaldo Garcia Rincon <agarrcia@amazon.com>

@piyush-jena left a comment
Contributor

qq: Is there a way to change the logging level? Do I modify the service calling nvidia-ctk and pass log-level?

@arnaldo2792
Contributor Author

nvidia-ctk provides the --quiet flag, which changes the level from info to error. That doesn't help us: at error level we would miss important info logs.

Development

Successfully merging this pull request may close these issues.

Silence nvidia-ctk warnings