This page outlines the recommended operating system optimizations for the GB200 Compute Tray to maximize performance and efficiency when running containerized workloads.
Note:
Once the OS optimizations are applied at the system level, the standard deployment of Kubernetes and NVIDIA Operators can proceed without any changes to their configuration.
Useful Links:
NVIDIA Grace Performance Tuning Guide - Operating System Settings:
https://docs.nvidia.com/grace-perf-tuning-guide/os-settings.html#operating-system-settings
The GB200 Compute Tray must run one of the following supported operating system configurations:
- Ubuntu 22.04 with the NVIDIA 64k kernel
- Ubuntu 24.04 with the NVIDIA 64k kernel
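As a quick sanity check of the kernel requirement above, the following is a minimal sketch that tests whether a kernel release string names an NVIDIA 64k kernel. The function name `is_nvidia_64k_kernel` is hypothetical, and the `-nvidia-64k` suffix is an assumption taken from the example kernel image name shown on this page (`vmlinuz-6.8.0-1039-nvidia-64k`); other kernel builds may be named differently.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel release string ends in
# "-nvidia-64k". The suffix is an assumption based on the example kernel
# name on this page, not an official naming guarantee.
is_nvidia_64k_kernel() {
    case "$1" in
        *-nvidia-64k) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running system it could be called as `is_nvidia_64k_kernel "$(uname -r)" && echo "64k kernel detected"`. Running `getconf PAGESIZE` is a complementary check, since a kernel built with 64 KiB pages reports a page size of 65536 bytes.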
When the CNS Ansible Playbook detects that it is running on the GB200 platform, the following OS optimizations are automatically applied (see the reference provided in the Useful Links section):
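- Init on Alloc: disabled (init_on_alloc=0)
- Input-Output Memory Management Unit (IOMMU) Passthrough: enabled (iommu.passthrough=1)
- Automatic NUMA Scheduling and Balancing: disabled (numa_balancing=disable)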
These optimizations are applied through kernel boot parameters, which appear on the kernel command line of a running system:
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-6.8.0-1039-nvidia-64k root=/dev/mapper/ubuntu--vg-ubuntu--lv ro init_on_alloc=0 numa_balancing=disable iommu.passthrough=1
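The manual inspection above can also be scripted. The following is a minimal sketch, not part of the CNS Ansible Playbook, that checks a kernel command line string for the three parameters shown in the example output. The function name `check_gb200_cmdline` and its output format are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify that a kernel command line contains the three
# tuning parameters listed on this page. Prints "ok" and returns 0 when all
# are present; otherwise prints the missing parameters and returns 1.
check_gb200_cmdline() {
    cmdline="$1"
    missing=""
    for param in init_on_alloc=0 numa_balancing=disable iommu.passthrough=1; do
        # Pad both sides with spaces so only whole, space-separated
        # tokens match (init_on_alloc=01 would not count).
        case " $cmdline " in
            *" $param "*) ;;
            *) missing="$missing $param" ;;
        esac
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

On a deployed node it could be run as `check_gb200_cmdline "$(cat /proc/cmdline)"`. Taking the command line as an argument rather than reading `/proc/cmdline` inside the function keeps the check usable against saved output from other hosts.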
The OS optimizations are applied in all types of CNS Ansible Playbook deployments (CNS for developers and standard CNS).