Rebase NVIDIA QEMU Virtualization Features onto v11.0.0 #16
Open
JiandiAnNVIDIA wants to merge 58 commits into
QEMU smmuv3 accel does not support live migration yet, so dirty tracking for the nesting parent HWPT is not useful. Also, nested vIOMMU use cases can break on some platforms. For example, SMMUv3 with HTTU may advertise dirty tracking capability, but the kernel supports it only for stage-1. Requesting dirty tracking for a nesting parent HWPT (stage-2) can fail. Add a vIOMMU flag to explicitly request dirty tracking for the nesting parent HWPT. For nested cases, dirty tracking is enabled only when requested by the vIOMMU. Non-nested cases and Intel vIOMMU keep the existing behavior. Fixes: fc6dafb ("hw/arm/smmuv3: Implement get_viommu_cap() callback") Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20260401084133.56266-1-skolothumtho@nvidia.com Signed-off-by: Cédric Le Goater <clg@redhat.com> (cherry picked from commit 659275f84694e7b06d67d877137905d371a3fde4) Signed-off-by: Jiandi An <jan@nvidia.com>
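The policy in the commit above can be sketched as a small predicate. This is an illustration only: `want_dirty_tracking` and its parameters are hypothetical names, not QEMU's, and it assumes the existing non-nested behavior is to request dirty tracking whenever the host advertises it.

```c
#include <stdbool.h>

/*
 * Sketch of the decision described above (hypothetical helper, not QEMU code).
 *   host_cap        - host IOMMU advertises dirty tracking
 *   nesting_parent  - the HWPT being allocated is a nesting parent (stage-2)
 *   viommu_requests - the vIOMMU set its "request dirty tracking" flag
 */
static bool want_dirty_tracking(bool host_cap, bool nesting_parent,
                                bool viommu_requests)
{
    if (!host_cap) {
        return false;               /* nothing to request */
    }
    if (nesting_parent) {
        return viommu_requests;     /* nested: only on explicit request */
    }
    return true;                    /* non-nested: assumed existing behavior */
}
```

With this shape, an accelerated SMMUv3 that leaves the flag unset never asks for stage-2 dirty tracking, while a vIOMMU that sets it keeps requesting it.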
By giving smmuv3_accel_init() the ability to populate an error, we can fail early in smmu_realize() when CONFIG_ARM_SMMUV3_ACCEL is not available, simplifying smmu_validate_property(). Suggested-by: Shameer Kolothum Thodi <skolothumtho@nvidia.com> Co-developed-by: Shameer Kolothum Thodi <skolothumtho@nvidia.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com> Message-Id: <20260410200031.18572-2-philmd@linaro.org> (cherry picked from commit 4c7fefc2d043a66f799cf2c2e34ed680b1b44b5c)
…ameters Introduce smmuv3_accel_auto_finalise() to resolve properties that are set to 'auto' for accelerated SMMUv3. This helper function allows properties such as ats, ril, ssidsize, and oas support to be resolved from host IOMMU capabilities via IOMMU_GET_HW_INFO. The later commits in this series set the auto_mode flag to true when an accel SMMUv3 property value is explicitly set to 'auto', or if the property value is not set and defaults to auto mode. Setting these property values to 'auto' requires at least one cold-plugged device to retrieve and finalise these properties. If the auto_mode flag is true, register a machine_init_done notifier to verify this requirement and fail boot if it is not met. Hot-plugged devices into an accel SMMUv3-associated bus will re-use the resolved host values from the initial cold-plug. Subsequent patches will make use of this helper to resolve 'auto' to what is reported by host IOMMU capabilities. Suggested-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Nathan Chen <nathanc@nvidia.com> (backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Allow accelerated SMMUv3 Address Translation Services support property to be derived from host IOMMU capabilities. Derive host values using IOMMU_GET_HW_INFO, retrieving ATS capability from IDR0. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Nathan Chen <nathanc@nvidia.com> (backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Allow accelerated SMMUv3 Range Invalidation support property to be derived from host IOMMU capabilities. Derive host values using IOMMU_GET_HW_INFO, retrieving RIL capability from IDR3. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Nathan Chen <nathanc@nvidia.com> (backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…ize" Allow accelerated SMMUv3 SSID size property to be derived from host IOMMU capabilities. Derive host values using IOMMU_GET_HW_INFO, retrieving SSID size from IDR1. When the auto SSID size is resolved to a non-zero value, PASID capability is advertised to the vIOMMU and accelerated use cases such as Shared Virtual Addressing (SVA) are supported. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Nathan Chen <nathanc@nvidia.com> (backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Allow accelerated SMMUv3 OAS property to be derived from host IOMMU capabilities. Derive host values using IOMMU_GET_HW_INFO, retrieving OAS from IDR5. This keeps the OAS value advertised by the virtual SMMU compatible with the capabilities of the host SMMUv3, so that the intermediate physical addresses (IPA) consumed by host SMMU for stage-2 translation do not exceed the host's max supported IPA size. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Nathan Chen <nathanc@nvidia.com> (backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…auto Set the default value of ATS, RIL, SSIDSIZE, and OAS to auto, in order to match the host IOMMU properties when accel=on. If accel=off and these property values are set to auto, the default property values defined in smmuv3_init_id_regs() for OAS and RIL will remain unchanged, while SSIDSIZE and ATS values will remain initialized at 0. Introduce a new compat for the changed defaults. Signed-off-by: Nathan Chen <nathanc@nvidia.com> (backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/) [jan: manually apply machine.c changes — hw_compat_11_0 array does not exist on v11.0.0 base; added array, include, and boards.h declaration] Signed-off-by: Jiandi An <jan@nvidia.com>
…rties Update documentation now that "auto" is supported for accelerated SMMUv3 properties. Signed-off-by: Nathan Chen <nathanc@nvidia.com> (backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
The updated IOMMUFD uAPI introduces the ability for userspace to request a specific hardware info data type via IOMMU_GET_HW_INFO. Update iommufd_backend_get_device_info() to set IOMMU_HW_INFO_FLAG_INPUT_TYPE when a non-zero type is supplied, and adjust all callers to pass a type value explicitly initialised to zero (IOMMU_HW_INFO_TYPE_DEFAULT) when no specific type is requested. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) [jan: resolve conflict due to 659275f846 (Control dirty tracking for nesting parent HWPT) adding viommu_nesting vars shifting context in iommufd_cdev_autodomains_get()] Signed-off-by: Jiandi An <jan@nvidia.com>
…to allow user ptr The updated IOMMUFD VIOMMU_ALLOC uAPI allows userspace to provide a data buffer when creating a vIOMMU (e.g. for Tegra241 CMDQV). Extend iommufd_backend_alloc_viommu() to pass a user pointer and size to the kernel. Update the caller accordingly. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…ueue Add a helper to allocate an iommufd backed HW queue for a vIOMMU. While at it, define a struct IOMMUFDHWqueue for use by vendor implementations. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Add a backend helper to mmap hardware MMIO regions exposed via iommufd for a vIOMMU instance. This allows user space to access HW-accelerated MMIO pages provided by the vIOMMU. The caller is responsible for unmapping the returned region. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…UFDVeventq The viommu field is assigned but never used. Callers freeing the veventq already have access to the IOMMUFDViommu object through other references, so this field is redundant. Removing it also simplifies upcoming changes where veventq is allocated based on the viommu id before the IOMMUFDViommu object is created (e.g. vendor CMDQV-based veventq allocation). No functional change. Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Command Queue Virtualization (CMDQV) is a hardware extension available on certain platforms that allows the SMMUv3 command queue to be virtualized and passed through to a VM, improving performance. For example, NVIDIA Tegra241 implements CMDQV to support virtualization of multiple command queues (VCMDQs). The term CMDQV is used here generically to refer to any platform that provides hardware support to virtualize the SMMUv3 command queue. CMDQV support is a specialization of the IOMMUFD-backed accelerated SMMUv3 path. Introduce an ops interface to factor out CMDQV-specific probe, initialization, and vIOMMU allocation logic from the base implementation. The ops pointer and associated state are stored in the accelerated SMMUv3 state. This provides an extensible design to support future vendor-specific CMDQV implementations. No functional change. Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) [jan: resolve conflict due to 79fcbec (Introduce smmuv3 accel device) adding CONFIG_DEVICES include shifting context in smmuv3-accel.h] Signed-off-by: Jiandi An <jan@nvidia.com>
…stub Introduce a Tegra241 CMDQV backend that plugs into the SMMUv3 accelerated CMDQV ops interface. This patch wires up the Tegra241 CMDQV backend and provides a stub implementation for CMDQV probe, initialization, vIOMMU allocation and reset handling. Functional CMDQV support is added in follow-up patches. Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) [jan: resolve conflict due to 79fcbec (Introduce smmuv3 accel device) changing arm_common_ss to arm_ss for smmuv3 entries in hw/arm/meson.build] Signed-off-by: Jiandi An <jan@nvidia.com>
Add support for selecting and initializing a CMDQV backend based on the cmdqv OnOffAuto property.
If set to OFF, CMDQV is not used and the default IOMMUFD-backed allocation path is taken.
If set to AUTO, QEMU attempts to probe a CMDQV backend during device setup. If probing succeeds, the selected ops are stored in the accelerated SMMUv3 state and used. If probing fails, QEMU silently falls back to the default path.
If set to ON, QEMU requires CMDQV support. Probing is performed during setup and failure results in an error.
When a CMDQV backend is active, its callbacks are used for vIOMMU allocation, free, and reset handling. Otherwise, the base implementation is used.
The current implementation wires up the Tegra241 CMDQV backend through the generic ops interface. Functional CMDQV behaviour is added in subsequent patches.
No functional change.
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
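The OFF/AUTO/ON handling described in the commit above can be sketched as follows. The enum, the `CmdqvOps` struct and `cmdqv_select()` are hypothetical stand-ins for illustration; the probe result is passed in as a parameter here, whereas the real code would not probe at all when the property is OFF.

```c
#include <stddef.h>

/* Stand-in for QEMU's OnOffAuto; the enumerator names are illustrative. */
typedef enum { CMDQV_MODE_AUTO, CMDQV_MODE_ON, CMDQV_MODE_OFF } CmdqvMode;

/* Hypothetical ops table; the real interface carries probe/init/alloc hooks. */
typedef struct CmdqvOps {
    const char *name;
} CmdqvOps;

/*
 * Pick the ops to use. 'probed' is the result of probing the host
 * (NULL when probing failed). Returns 0 on success, -1 on error.
 * On success *ops is the backend to use, or NULL for the base path.
 */
static int cmdqv_select(CmdqvMode mode, const CmdqvOps *probed,
                        const CmdqvOps **ops)
{
    *ops = NULL;
    if (mode == CMDQV_MODE_OFF) {
        return 0;                   /* never use CMDQV */
    }
    if (probed != NULL) {
        *ops = probed;              /* AUTO or ON with a working backend */
        return 0;
    }
    /* Probe failed: ON is a hard error, AUTO silently falls back. */
    return mode == CMDQV_MODE_ON ? -1 : 0;
}
```

A caller would treat a `-1` return as a realize-time error and a `NULL` `*ops` as "use the base implementation".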
Introduce a GPtrArray in VirtMachineState to track all SMMUv3 devices created on the virt machine, and use it when building the IORT table instead of relying on object_child_foreach_recursive() walks of the object tree. This avoids recursive object traversal and provides a foundation for subsequent patches that need direct access to SMMUv3 instances for CMDQV-related handling. No functional change. No bios-tables qtest failures observed. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Use IOMMU_GET_HW_INFO to query host support for Tegra241 CMDQV. Validate the returned data type, version, and minimum number of vCMDQs and SIDs per Tegra241 CMDQ Virtual Interface (VI). Fail the probe if the host does not meet these requirements.
The QEMU model supports one Virtual Interface (VI) per VM with 2 vCMDQs and 16 SIDs per VI, so the probe ensures the host implementation is compatible with these limits.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
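A minimal sketch of the compatibility check described in the commit above. The struct is a hypothetical summary of what the host reported, not the uAPI layout, and the expected type and version are passed in by the caller because their actual values are not given in the commit message. Requiring an exact version match is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits of the QEMU model, taken from the commit message above. */
#define MODEL_VCMDQS_PER_VI 2u
#define MODEL_SIDS_PER_VI   16u

/* Hypothetical summary of the host's answer; field names are illustrative. */
struct host_cmdqv_info {
    uint32_t data_type;       /* data type the host actually returned */
    uint32_t version;
    uint32_t vcmdqs_per_vi;
    uint32_t sids_per_vi;
};

/* True if the host can back one VI with the model's vCMDQ and SID counts. */
static bool cmdqv_host_compatible(const struct host_cmdqv_info *info,
                                  uint32_t want_type, uint32_t want_version)
{
    return info->data_type == want_type &&
           info->version == want_version &&        /* assumed exact match */
           info->vcmdqs_per_vi >= MODEL_VCMDQS_PER_VI &&
           info->sids_per_vi >= MODEL_SIDS_PER_VI;
}
```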
Tegra241 CMDQV extends SMMUv3 with support for virtual command queues (VCMDQs) exposed via a CMDQV MMIO region. The CMDQV MMIO space is split into 64KB pages:
0x00000 (CMDQ-V Config page)
0x10000 (CMDQ-V CMDQ Page0)
0x20000 (CMDQ-V CMDQ Page1)
0x30000 (Virtual Interface Page0)
0x40000 (Virtual Interface Page1)
This patch wires up the Tegra241 CMDQV init callback and allocates vendor-specific CMDQV state. The state pointer is stored in SMMUv3AccelState for use by subsequent CMDQV operations.
The CMDQV MMIO region and a dedicated IRQ line are registered with the SMMUv3 device. The MMIO read/write handlers are currently stubs and will be implemented in later patches.
The CMDQV interrupt is edge-triggered and indicates VCMDQ or VINTF error conditions. This patch only registers the IRQ line. Interrupt generation and propagation to the guest will be added in a subsequent patch.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
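The 64KB page layout listed in the commit above lends itself to a simple offset decoder. The enum and function names below are hypothetical; only the offsets and page size come from the commit message.

```c
#include <stdint.h>

#define CMDQV_PAGE_SIZE 0x10000u    /* 64KB pages, per the layout above */

/* One enumerator per page, in address order; names are illustrative. */
enum cmdqv_page {
    CMDQV_PAGE_CONFIG,      /* 0x00000 */
    CMDQV_PAGE_CMDQ0,       /* 0x10000 */
    CMDQV_PAGE_CMDQ1,       /* 0x20000 */
    CMDQV_PAGE_VI0,         /* 0x30000 */
    CMDQV_PAGE_VI1,         /* 0x40000 */
    CMDQV_PAGE_INVALID      /* anything at or beyond 0x50000 */
};

/* Map an offset within the CMDQV MMIO region to the page it falls in. */
static enum cmdqv_page cmdqv_page_of(uint64_t offset)
{
    uint64_t idx = offset / CMDQV_PAGE_SIZE;

    return idx < CMDQV_PAGE_INVALID ? (enum cmdqv_page)idx
                                    : CMDQV_PAGE_INVALID;
}
```

An MMIO handler could switch on the returned page and then use `offset % CMDQV_PAGE_SIZE` as the register offset within it.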
SMMUv3 devices with acceleration may enable CMDQV extensions after device realize. In that case, additional MMIO regions and IRQ lines may be registered but not yet mapped to the platform bus. Ensure SMMUv3 device resources are linked to the platform bus during machine_done(). This is safe to do unconditionally since the platform bus helpers skip resources that are already mapped. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Replace the stub implementation with real vIOMMU allocation for Tegra241 CMDQV. Allocate a matching vEVENTQ together with the vIOMMU, since it is specific to the Tegra241 CMDQV vIOMMU and used to receive CMDQV events. Free both objects on teardown. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Tegra241 CMDQV exposes control and status registers in the CMDQ-V Config page (offset [0x0, 0x10000)) used to configure virtual command queue allocation and interrupt behavior.
Add read/write emulation for the CMDQ-V Config region ([CMDQV_BASE, CMDQV_CMDQ_BASE)), backed by a simple register cache. This includes CONFIG, PARAM, STATUS, VI error and interrupt maps, CMDQ allocation map and the VINTF0 related registers defined in the CMDQ-V Config space. Only VINTF0 is supported; VINTF1-63 are not.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Tegra241 CMDQV exposes per-VCMDQ register windows through two MMIO apertures:
CMDQV_CMDQ_BASE (0x10000/0x20000): VCMDQ Page0/Page1
CMDQV_VI_CMDQ_BASE (0x30000/0x40000): VINTF VCMDQ Page0/Page1
VINTF Page0 (0x30000) and VCMDQ Page0 (0x10000) are hardware aliases addressing the same underlying registers.
Add read emulation for both apertures, backed by a register cache. VINTF Page0 reads are translated to their VCMDQ Page0 equivalent and served from the same cached state.
Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are wired up in a subsequent patch, Page0 register reads will be served directly from the hardware backed mmap'd page instead of the cache. Page1 registers are always served from cache.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
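The Page0 aliasing described in the commit above can be sketched as an offset translation followed by a lookup in one shared cache. All names are hypothetical, and only the Page0 alias is modelled because that is the only one the commit message spells out.

```c
#include <assert.h>
#include <stdint.h>

#define CMDQV_PAGE_SIZE   0x10000u
#define CMDQV_CMDQ_PAGE0  0x10000u  /* direct VCMDQ Page0 */
#define CMDQV_VI_PAGE0    0x30000u  /* VINTF Page0, alias of the above */

/* Register cache for Page0 only; the layout is illustrative. */
struct cmdqv_cache {
    uint32_t page0[CMDQV_PAGE_SIZE / sizeof(uint32_t)];
};

/* Translate a VINTF Page0 offset to its VCMDQ Page0 equivalent. */
static uint64_t cmdqv_to_vcmdq_page0(uint64_t offset)
{
    if (offset >= CMDQV_VI_PAGE0 &&
        offset < CMDQV_VI_PAGE0 + CMDQV_PAGE_SIZE) {
        return offset - CMDQV_VI_PAGE0 + CMDQV_CMDQ_PAGE0;
    }
    return offset;
}

/* Read a 32-bit Page0 register through either aperture. */
static uint32_t cmdqv_page0_read(const struct cmdqv_cache *c, uint64_t offset)
{
    uint64_t canon = cmdqv_to_vcmdq_page0(offset);

    assert(canon >= CMDQV_CMDQ_PAGE0 &&
           canon < CMDQV_CMDQ_PAGE0 + CMDQV_PAGE_SIZE);
    return c->page0[(canon - CMDQV_CMDQ_PAGE0) / sizeof(uint32_t)];
}
```

Because both apertures resolve to the same cache slot, a value visible at 0x10008 is also visible at 0x30008.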
This is the write side counterpart of the VCMDQ read emulation. Add write handling for both CMDQV_CMDQ_BASE and CMDQV_VI_CMDQ_BASE apertures using the same index decoding and VINTF-to-VCMDQ translation logic as the read path. VINTF aperture writes are translated to their CMDQV_CMDQ_BASE equivalent and update the same cached state. Page1 registers (BASE, CONS_INDX_BASE) always update the cache. Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are wired up in a subsequent patch, Page0 register writes will be forwarded to the hardware-backed mmap'd page. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
The CMDQ-V CMDQ pages provide a VM wide view of all VCMDQs, while the VINTF pages expose a logical view local to a given VINTF. Although real hardware may support multiple VINTFs, the kernel currently exposes a single VINTF per VM. The kernel provides an mmap offset for the VINTF Page0 region during vIOMMU allocation. However, the logical-to-physical association between VCMDQs and a VINTF is only established after HW_QUEUE allocation. Prior to that, the mapped Page0 does not back any real VCMDQ state. When VINTF is enabled, mmap the kernel provided Page0 region and set ENABLE_OK only if the mmap succeeds. Unmap it when VINTF is disabled. This prepares the VINTF mapping in advance of subsequent patches that add VCMDQ allocation support. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce address_space_is_ram(), a helper to determine whether a guest physical address resolves to a RAM-backed MemoryRegion within an AddressSpace. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…ster programming Add support for allocating IOMMUFD hardware queues when the guest programs the VCMDQ BASE registers. VCMDQ_EN is part of the VCMDQ_CONFIG register, which is accessed through the VINTF Page0 region. A subsequent patch maps this region directly into the guest address space, so QEMU does not trap writes to VCMDQ_CONFIG. Since VCMDQ_EN writes are not trapped, QEMU cannot allocate the hardware queue based on that bit. Instead, allocate the IOMMUFD hardware queue when the guest writes a VCMDQ BASE register with a valid RAM-backed address and when CMDQV and VINTF are enabled. If a hardware queue was previously allocated for the same VCMDQ, free it before reallocation. Writes with invalid addresses are ignored. All allocated VCMDQs are freed when CMDQV or VINTF is disabled. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
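The allocation trigger described in the commit above can be sketched as a small state update on each BASE write. Everything here is hypothetical: `addr_is_ram` stands in for a check such as address_space_is_ram(), and the counters replace the real IOMMUFD allocate and free calls so the behavior can be observed. Updating the cached BASE value on every write follows the earlier write-emulation commit's statement that Page1 registers always update the cache.

```c
#include <stdbool.h>
#include <stdint.h>

struct vcmdq {
    bool     hw_queue;          /* a hardware queue is currently allocated */
    uint64_t base;              /* cached BASE register value */
    unsigned allocs, frees;     /* counters, only to make the sketch visible */
};

/* Free the hardware queue, if any (stand-in for the real free path). */
static void vcmdq_free_hw_queue(struct vcmdq *q)
{
    if (q->hw_queue) {
        q->frees++;
        q->hw_queue = false;
    }
}

/*
 * Handle a guest write to a VCMDQ BASE register.
 * Returns true if a hardware queue was (re)allocated.
 */
static bool vcmdq_base_write(struct vcmdq *q, uint64_t base,
                             bool cmdqv_en, bool vintf_en, bool addr_is_ram)
{
    q->base = base;                         /* cache is always updated */
    if (!cmdqv_en || !vintf_en || !addr_is_ram) {
        return false;                       /* no allocation attempted */
    }
    vcmdq_free_hw_queue(q);                 /* drop any previous queue first */
    q->allocs++;                            /* stand-in for the HW queue alloc */
    q->hw_queue = true;
    return true;
}
```

Disabling CMDQV or VINTF would call `vcmdq_free_hw_queue()` for every VCMDQ.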
… backing
Introduce tegra241_cmdqv_vintf_ptr() to route VCMDQ register accesses through the mmap'd VINTF page0 backing once a hardware queue has been allocated.
There are two QEMU trapped MMIO apertures for VCMDQ registers:
- Direct VCMDQ aperture (offset 0x10000)
- VINTF Page0 (offset 0x30000)
These are hardware aliases: they address the same underlying registers. A subsequent patch maps the VINTF aperture as a guest-direct RAM region; in this patch both remain QEMU-trapped.
VCMDQ register accesses operate in one of two mutually exclusive modes, depending on whether a hardware queue (IOMMU_HW_QUEUE_ALLOC) has been allocated for the VCMDQ:
- Pre-alloc: vintf_ptr is NULL. Both apertures use QEMU's register cache. Hardware is not yet engaged.
- Post-alloc: vintf_ptr is valid. Both QEMU trapped apertures access registers directly via the mmap'd vintf_page0 pointer, bypassing the cache. Hardware is the single source of truth.
The pre-to-post-alloc transition is triggered by the BASE register write that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware synchronisation is needed at transition time. The hardware mandated init sequence requires BASE to be written first; PROD_INDX, CONS_INDX and CONFIG.CMDQ_EN are programmed only after BASE and are therefore always post-alloc. Any pre-alloc writes to those registers update only the register cache, which is discarded at the transition.
CMDQV acceleration only becomes active once the guest enables VINTF and programs the VCMDQ BASE register. Until then, all VCMDQ accesses are served from the emulated register cache with no real hardware command processing. This matches the CMDQV hardware specification: if the logical CMDQ index does not map to any allocated Virtual CMDQ, "the access is dropped with no Fault/Interrupt".
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
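The two access modes in the commit above reduce to one pointer check per register access. The sketch below uses a hypothetical `vcmdq_regs` struct with an illustrative register count; it is not the layout used by tegra241_cmdqv_vintf_ptr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VCMDQ_NREGS 4   /* illustrative register count */

struct vcmdq_regs {
    uint32_t cache[VCMDQ_NREGS];    /* pre-alloc backing (emulated) */
    volatile uint32_t *vintf_ptr;   /* post-alloc backing, NULL before alloc */
};

/* Read from the mmap'd page when present, otherwise from the cache. */
static uint32_t vcmdq_reg_read(const struct vcmdq_regs *r, size_t idx)
{
    assert(idx < VCMDQ_NREGS);
    return r->vintf_ptr ? r->vintf_ptr[idx] : r->cache[idx];
}

/* Write to exactly one backing; the cache is never synced to hardware. */
static void vcmdq_reg_write(struct vcmdq_regs *r, size_t idx, uint32_t val)
{
    assert(idx < VCMDQ_NREGS);
    if (r->vintf_ptr) {
        r->vintf_ptr[idx] = val;
    } else {
        r->cache[idx] = val;
    }
}
```

Once `vintf_ptr` is set, stale cache contents are simply ignored, which mirrors the statement that the cache is discarded at the transition.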
Some RAM device regions created with memory_region_init_ram_device_ptr() are not intended to be P2P DMA targets. The VFIO listener currently treats all RAM device regions as DMA capable and attempts to map them into the IOMMU. For regions without dma-buf backing this fails and prints warnings such as: IOMMU_IOAS_MAP failed: Bad address, PCI BAR? Introduce a MemoryRegion flag (ram_device_skip_iommu_map) to mark RAM device regions that should not be IOMMU mapped. When set, the VFIO listener skips DMA mapping for that region. Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
… space Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly into the guest-visible MMIO space at offset 0x30000 as a RAM-backed MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD index updates. After this patch, the two VCMDQ apertures use different access paths: the direct aperture (0x10000) remains QEMU-trapped and writes via vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM mapping. Both paths write to the same underlying vintf_page0 memory, so no synchronisation between the apertures is needed. The mapping is installed lazily on first successful VCMDQ hardware queue allocation and removed when CMDQV or VINTF is disabled. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…tq read Move the vEVENTQ read and validation logic into a common helper smmuv3_accel_event_read_validate(). The helper performs the read(), checks for overflow and short reads, validates the sequence number, and updates the sequence state. This helper can be reused for Tegra241 CMDQV vEVENTQ support in a subsequent patch. Error handling is slightly adjusted: instead of reporting errors directly in the read handler, the helper now returns errors via Error **. Sequence gaps are reported as warnings. Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) [jan: resolve conflict due to v11.0.0 using inline stubs in smmuv3-accel.h instead of separate smmuv3-accel-stubs.c; added declaration and inline stub for smmuv3_accel_event_read_validate()] Signed-off-by: Jiandi An <jan@nvidia.com>
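A rough sketch of the checks the helper in the commit above is said to perform: short read, overflow, sequence validation, and sequence state update. The header struct, the flag value and all names are hypothetical placeholders, and the actual read() call is left to the caller so the logic stays self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum vevent_result {
    VEVENT_OK,              /* record is valid and in sequence */
    VEVENT_OK_GAP,          /* valid, but the sequence skipped (warn only) */
    VEVENT_ERR_READ,        /* read failed or returned a short record */
    VEVENT_ERR_OVERFLOW     /* queue reported lost events */
};

/* Hypothetical record header and flag; not the uAPI definitions. */
struct vevent_hdr {
    uint32_t flags;
    uint32_t sequence;
};
#define SKETCH_VEVENT_LOST 0x1u

struct vevent_seq {
    bool     valid;         /* false until the first event is seen */
    uint32_t last;          /* sequence number of the previous event */
};

/* 'nread' is the return value of the caller's read() on the vEVENTQ fd. */
static enum vevent_result vevent_validate(long nread, size_t expected,
                                          const struct vevent_hdr *hdr,
                                          struct vevent_seq *seq)
{
    bool gap;

    if (nread < 0 || (size_t)nread != expected) {
        return VEVENT_ERR_READ;
    }
    if (hdr->flags & SKETCH_VEVENT_LOST) {
        return VEVENT_ERR_OVERFLOW;
    }
    /* Sequence numbers wrap at 32 bits, so compare modulo 2^32. */
    gap = seq->valid && hdr->sequence != (uint32_t)(seq->last + 1u);
    seq->last = hdr->sequence;
    seq->valid = true;
    return gap ? VEVENT_OK_GAP : VEVENT_OK;
}
```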
…QV errors Install an event handler on the CMDQV vEVENTQ fd to read and propagate host received CMDQV errors to the guest. The handler runs in QEMU’s main loop, using a non-blocking fd registered via qemu_set_fd_handler(). Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce a reset handler for the Tegra241 CMDQV and initialize its register state. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…nd page size
CMDQV HW reads guest queue memory using the host physical address set up via IOMMUFD. This requires that the guest queue memory be contiguous not only in guest PA space but also in host PA space.
With Tegra241 CMDQV enabled, we must only advertise a CMDQS that the host can safely back with physically contiguous memory. Allowing a queue larger than the host page size could cause the hardware to DMA across page boundaries, leading to faults.
Walk the RAMBlock list to find the smallest memory-backend page size, then limit IDR1.CMDQS so the guest cannot configure a command queue that exceeds that contiguous backing. Fall back to the real host page size if no memory-backend RAM blocks are found.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
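The limit described in the commit above comes down to simple arithmetic: IDR1.CMDQS is the log2 of the maximum number of command queue entries and each SMMUv3 command is 16 bytes, so a queue fits in one page when CMDQS is at most log2(page size) - 4. The helper below is a hypothetical sketch of that clamp, not the patch's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define SMMU_CMD_ENTRY_SHIFT 4u     /* SMMUv3 command entries are 16 bytes */

/*
 * Clamp a CMDQS value (log2 of queue entries) so the whole queue fits in
 * one page of 'min_page_size' bytes, the smallest backing page size found.
 */
static unsigned limit_cmdqs(unsigned cmdqs, uint64_t min_page_size)
{
    unsigned page_shift = 0;
    unsigned max_cmdqs;

    /* Page sizes are powers of two; require room for at least one entry. */
    assert(min_page_size >= 16 &&
           (min_page_size & (min_page_size - 1)) == 0);

    while ((min_page_size >> page_shift) > 1) {
        page_shift++;               /* page_shift = log2(min_page_size) */
    }
    max_cmdqs = page_shift - SMMU_CMD_ENTRY_SHIFT;
    return cmdqs < max_cmdqs ? cmdqs : max_cmdqs;
}
```

For example, with 4KB backing pages the clamp yields CMDQS = 8 (256 entries of 16 bytes), while 64KB pages allow CMDQS = 12.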
Add an "identifier" property to the SMMUv3 device and use it when building the ACPI IORT SMMUv3 node Identifier field. This avoids relying on device enumeration order and provides a stable per-device identifier. A subsequent patch will use the same identifier when generating the DSDT description for Tegra241 CMDQV, ensuring that the IORT and DSDT entries refer to the same SMMUv3 instance. The identifier is assigned at pre-plug time, accounting for the ITS Group node that build_iort() places before SMMUv3 nodes in the IORT table, so that identifiers are globally unique across all IORT nodes. No functional change: IORT blob content for bios-tables qtest is identical to before. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce a SMMUv3AccelCmdqvType enum and a helper to query the CMDQV implementation type associated with an accelerated SMMUv3 instance. A subsequent patch will use this helper when generating the Tegra241 CMDQV DSDT. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) [jan: resolve conflict — keep bool smmuv3_accel_init(Error **errp) signature from cherry-picked 4c7fefc2d0; add smmuv3_accel_cmdqv_type() before it] Signed-off-by: Jiandi An <jan@nvidia.com>
Add ACPI DSDT support for Tegra241 CMDQV when the SMMUv3 instance is created with tegra241-cmdqv. The SMMUv3 device identifier is used as the ACPI _UID. This matches the Identifier field of the corresponding SMMUv3 IORT node, allowing the CMDQV DSDT device to be correctly associated with its SMMU. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
…MDQV is active When CMDQV is active, the first cold-plugged VFIO device establishes the viommu to host SMMUv3 association. Block its hot-unplug to preserve this association and the guest's boot time CMDQV configuration. Also abort at machine_done if cmdqv=on is requested but no cold-plugged VFIO device was present to initialize it. Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) [jan: resolve conflict — 950e2c91c8 moved smmuv3_machine_done to smmuv3-accel.c; added unplug_blocker field and cmdqv check in correct file] Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce a "cmdqv" property to enable Tegra241 CMDQV support. This is only enabled for accelerated SMMUv3 devices. Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com> (backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/) Signed-off-by: Jiandi An <jan@nvidia.com>
During creation of the VM's SRAT table, the generic initiator entries
are added. Currently the order of the entries is not controllable from
the qemu command line, because the code queries the
object tree, which may not be in the order the objects were inserted.
As a fix, the patch maintains a GPtrArray of generic initiator objects
that preserves their insertion order. Objects are automatically added
to the array when initialized and removed when finalized. When building
the SRAT table, objects are processed in the order they were first
inserted.
E.g. for the following qemu command.
...
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
-object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
-object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
-object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
-object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
-object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
...
Original PXM in the VM SRAT table:
[1A4h 0420 004h] Proximity Domain : 00000007
[1C4h 0452 004h] Proximity Domain : 00000006
[1E4h 0484 004h] Proximity Domain : 00000005
[204h 0516 004h] Proximity Domain : 00000004
[224h 0548 004h] Proximity Domain : 00000003
[244h 0580 004h] Proximity Domain : 00000009
[264h 0612 004h] Proximity Domain : 00000002
[284h 0644 004h] Proximity Domain : 00000008
[2A2h 0674 004h] Proximity Domain : 00000009
After the patch (preserves insertion order):
[1A4h 0420 004h] Proximity Domain : 00000002
[1C4h 0452 004h] Proximity Domain : 00000003
[1E4h 0484 004h] Proximity Domain : 00000004
[204h 0516 004h] Proximity Domain : 00000005
[224h 0548 004h] Proximity Domain : 00000006
[244h 0580 004h] Proximity Domain : 00000007
[264h 0612 004h] Proximity Domain : 00000008
[284h 0644 004h] Proximity Domain : 00000009
cc: Shameer Kolothum <skolothumtho@nvidia.com>
Fixes: 0a5b5ac ("hw/acpi: Implement the SRAT GI affinity structure")
(backported from https://lore.kernel.org/all/20260223112236.000065aa@huawei.com/)
[ankita: ML links to discussion and not patch as the original ML posting was lost]
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 70ac94e)
Signed-off-by: Jiandi An <jan@nvidia.com>
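The insertion-order behaviour described in the commit message above can be illustrated with a small model. This is a sketch in Python, not the QEMU C code: the `GenericInitiatorRegistry` class and its method names are hypothetical, and a plain list stands in for the GPtrArray the patch maintains.

```python
class GenericInitiatorRegistry:
    """Toy model of an insertion-ordered registry (hypothetical names)."""

    def __init__(self):
        # A list keeps entries in the order they were appended,
        # which is the property the patch relies on.
        self._objects = []

    def add(self, gi_id, node):
        # Models "added to the array when initialized".
        self._objects.append((gi_id, node))

    def remove(self, gi_id):
        # Models "removed when finalized"; the order of the rest is unchanged.
        self._objects = [(i, n) for i, n in self._objects if i != gi_id]

    def srat_proximity_domains(self):
        # Models walking the array while building the SRAT table.
        return [node for _, node in self._objects]


registry = GenericInitiatorRegistry()
# Mirrors the command line above: gi0..gi7 mapped to nodes 2..9.
for index, node in enumerate(range(2, 10)):
    registry.add(f"gi{index}", node)

domains_in_order = registry.srat_proximity_domains()

registry.remove("gi3")
domains_after_removal = registry.srat_proximity_domains()
```

With this model, the proximity domains come out in command-line order, and removing one object leaves the relative order of the others intact.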
The Extended GPU Memory (EGM) feature [1] enables GPU access to local or remote system memory across sockets and nodes. In this mode, the physical memory can be allocated for GPU usage from anywhere in a multi-node system. The feature is being extended to virtualization.
The CPU node with the EGM is associated with the GPUs present on the same socket: the EGM node information, such as its base physical address, length and proximity domain ID, is populated in the ACPI DSDT entries of those associated GPUs. This information is needed by the NVIDIA driver in the VM to discover its local EGM memory.
The CPU memory being utilized as EGM is exposed as a memory-backend-file /dev/egmX backed by the nvgrace-egm module.
To link the GPU devices to the CPU EGM node, a new QOM object acpi-egm-memory is introduced. This helps QEMU populate the DSDT entries for the appropriate GPU device. An admin can provide this association as follows. In the example, NUMA node 0 has the EGM memory created through the /dev/egm4 device. This node is linked with the dev0 GPU device using the acpi-egm-memory object.
...
-numa node,memdev=m0,cpus=0-3,nodeid=0 \
-object memory-backend-file,id=m0,mem-path=/dev/egm4,size=84G,share=on,prealloc=on \
-device vfio-pci-nohotplug,host=0008:01:00.0,bus=pcie.0,rombar=0,id=dev0 \
-object acpi-egm-memory,id=egm0,pci-dev=dev0,node=0
...
Link: https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture-in-depth/#extended_gpu_memory [1]
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
(cherry picked from commit e647ae7 https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 4954345)
[jan: resolve conflict in hw/acpi/meson.build; trivial context, added new EGM source files to list]
Signed-off-by: Jiandi An <jan@nvidia.com>
The QEMU code builds the ACPI DSDT for the VM devices. The Extended GPU Memory (EGM) information, such as physical address, length and proximity domain ID, is populated in the DSDT entries of the GPU devices present in the same socket as the EGM memory. This is used by the VM NVIDIA driver to determine the EGM properties. The GPU device is linked with the EGM memory node through the acpi-egm-memory object. While building ACPI tables, go through all of the egm-memory objects and find the device and EGM NUMA node association from each object. Patch the DSDT to create the GPU device entries and populate them with the corresponding NUMA node properties. Signed-off-by: Ankit Agrawal <ankita@nvidia.com> (cherry picked from commit a5eae1e https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit b62790f) Signed-off-by: Jiandi An <jan@nvidia.com>
…ror pages on EGM The nvgrace-egm module exposes the list of pages with uncorrected ECC errors (referred to as bad pages) in EGM memory through an EGM_BAD_PAGES_LIST ioctl on the EGM char device. Since these pages should not be accessed by the VM OS, they need to be kept absent from the VM memory map. This is achieved by leveraging the memory regions in the DTB. Fetch the list of pages by calling the ioctl and sort it. The memory regions are built from this list by terminating a memory region at the physical address of a bad page. The next region starts at the following page, skipping over the bad page. Also, as a minor code reorganization, move fdt_add_memory_node into a wrapper that checks that the provided memory region length is non-zero before adding it to the DTB. Signed-off-by: Ankit Agrawal <ankita@nvidia.com> (cherry picked from commit dba314b https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 2dd5ad8) [jan: resolve conflict; fdt_add_pmem_node added in v11.0.0 at same location as EGM's fdt_add_memory_node_wrapper; kept both] Signed-off-by: Jiandi An <jan@nvidia.com>
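The region-splitting step described above can be sketched as follows. This is an illustrative Python model rather than the C code in hw/arm/boot.c: the function name `split_around_bad_pages`, the 64 KiB `PAGE_SIZE`, and the example addresses are all assumptions made for the sketch.

```python
PAGE_SIZE = 0x10000  # assumption: 64 KiB pages; the real page size may differ


def split_around_bad_pages(base, size, bad_pages, page_size=PAGE_SIZE):
    """Return (start, length) regions covering [base, base + size) minus bad pages."""
    end = base + size
    # Align each bad address down to its page, drop duplicates and
    # out-of-range entries, then sort, as the commit message describes.
    pages = sorted({addr - addr % page_size
                    for addr in bad_pages if base <= addr < end})
    regions = []
    start = base
    for page in pages:
        # Terminate the current region at the bad page. Zero-length regions
        # are never emitted, mirroring the non-zero length check in the wrapper.
        if page > start:
            regions.append((start, page - start))
        # The next region starts at the page after the bad one.
        start = max(start, page + page_size)
    if start < end:
        regions.append((start, end - start))
    return regions


base = 0x80000000
# Eight pages of memory with bad pages at indexes 0, 3 and 7 (unsorted input).
bad = [base + 7 * PAGE_SIZE, base, base + 3 * PAGE_SIZE]
regions = split_around_bad_pages(base, 8 * PAGE_SIZE, bad)
```

In this example the bad first and last pages produce no empty regions, so only two regions remain: pages 1-2 and pages 4-6.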
Signed-off-by: Ankit Agrawal <ankita@nvidia.com> (cherry picked from commit 3f183b4 https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 09d8e48) Signed-off-by: Jiandi An <jan@nvidia.com>
The Extended GPU Memory (EGM) feature [1] enables GPU access to local or remote system memory across sockets and nodes. In this mode, the physical memory can be allocated for GPU usage from anywhere in a multi-node system. The feature is being extended to virtualization.
The EGM memory is exposed as a memory-backend-file backed by the nvgrace-egm module. The EGM node information such as the physical address, length and the proximity domain ID is populated in the ACPI DSDT entries for the GPU devices present on the same physical socket. A new QOM object acpi-egm-memory is introduced to link the GPU devices to the EGM node.
It is possible for the EGM memory to have memory pages with ECC errors; the list of such pages is fetched using an ioctl provided by the nvgrace-egm module. VM DTB memory regions are built using the list by skipping such pages. The VM OS is thus prevented from using those pages.
Link: https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture-in-depth/#extended_gpu_memory [1]
Ankit Agrawal (3):
qom: New object to associate device to EGM node
hw/acpi: Populate DSD tables with EGM properties
hw/arm/boot: Create DTB memory regions skipping ECC error pages on EGM
hw/acpi/acpi_egm_memory.c | 176 ++++++++++++++++++++++++++++++
hw/acpi/meson.build | 1 +
hw/arm/boot.c | 129 ++++++++++++++++++++--
hw/pci-host/gpex-acpi.c | 5 +
include/hw/acpi/acpi_egm_memory.h | 24 ++++
linux-headers/linux/egm.h | 20 ++++
qapi/qom.json | 17 +++
7 files changed, 363 insertions(+), 9 deletions(-)
create mode 100644 hw/acpi/acpi_egm_memory.c
create mode 100644 include/hw/acpi/acpi_egm_memory.h
create mode 100644 linux-headers/linux/egm.h
--
2.34.1
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit fab2732 https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 32db1b7)
Signed-off-by: Jiandi An <jan@nvidia.com>
There is an assert in build_append_nameseg() that enforces the length of ACPI segments to 4 characters: qemu-system-aarch64: hw/acpi/aml-build.c:202: build_append_nameseg: Assertion len <= ACPI_NAMESEG_LEN failed. This assert can be hit when building the EGM DSDT entries because the gpu_id can overflow a single character. To avoid hitting this assert, always reset the gpu_id before building the DSDT table. Fixes: b62790f ("NVIDIA: SAUCE: hw/acpi: Populate DSDT with EGM properties") Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Mitchell Augustin <mitchell.augustin@canonical.com> Acked-by: Ankit Agrawal <ankita@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 338211a) Signed-off-by: Jiandi An <jan@nvidia.com>
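To see why resetting the counter matters, here is a toy Python model of the failure. It is a sketch under stated assumptions, not the QEMU code: the `GPU<hex id>` naming scheme, the `DsdtBuilder` class, and the idea that the counter persists because the table is built more than once are all hypothetical. Only the 4-character segment limit comes from the commit message.

```python
ACPI_NAMESEG_LEN = 4  # limit enforced by the assert quoted above


def build_nameseg(name):
    # Stand-in for the length check; raises instead of asserting.
    if len(name) > ACPI_NAMESEG_LEN:
        raise ValueError(f"name segment too long: {name!r}")
    return name


class DsdtBuilder:
    """Hypothetical builder whose GPU counter persists across builds."""

    def __init__(self, reset_counter):
        self.reset_counter = reset_counter
        self.gpu_id = 0

    def build(self, num_gpus):
        if self.reset_counter:
            # The fix: always restart numbering before building the table.
            self.gpu_id = 0
        names = []
        for _ in range(num_gpus):
            # Assumed naming scheme: three letters plus a hex id.
            names.append(build_nameseg(f"GPU{self.gpu_id:X}"))
            self.gpu_id += 1
        return names


fixed = DsdtBuilder(reset_counter=True)
fixed_builds = [fixed.build(8) for _ in range(3)]

buggy = DsdtBuilder(reset_counter=False)
buggy.build(8)
buggy.build(8)
try:
    buggy.build(8)  # ids now start at 0x10, giving a 5-character name
    overflowed = False
except ValueError:
    overflowed = True
```

In the model, the builder that resets its counter produces the same eight names on every build, while the one that does not eventually generates an over-long segment.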
Add debian folder from Ubuntu Noble QEMU (noble-updates branch): https://git.launchpad.net/ubuntu/+source/qemu/log/?h=ubuntu/noble-updates Tag: import/1:8.2.2+ds-0ubuntu1.16 HEAD commit: b6ab27191 1:8.2.2+ds-0ubuntu1.16 (patches unapplied) Signed-off-by: Jiandi An <jan@nvidia.com>
debian/control:
debian/control-in:
- Added dependency support for meson-1.5
- Added NVIDIA as maintainer
debian/patches/*:
debian/patches/series:
- Remove all Debian/Ubuntu patches
debian/rules:
- Removed pvrdma and cris/nios2 architectures (removed since v8.2)
- Disabled firmware builds
debian/qemu-system-common.install:
- Remove obsolete files (removed since v8.2)
- Added hw-uefi-vars.so (added since v8.2)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 36fe037)
[jan: resolve conflict in debian/patches/series; 11.0 noble-updates base has different patches than 10.1; emptied series and deleted all patch files as intended]
Signed-off-by: Jiandi An <jan@nvidia.com>
QEMU is particularly challenging to package due to its multiple subprojects that ship binary files. To simplify and speed up maintenance, allow binaries to be included in the source package. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 0b8d533) Signed-off-by: Jiandi An <jan@nvidia.com>
This fixes a build failure in PPA environments where dh_missing reports DTB files exist in debian/tmp but are not installed to any package. QEMU 11.0's Meson build has a bug where it ignores the install_blobs=false option when dtc is not found, causing DTB files to be auto-installed to usr/share/qemu/dtb/. This creates inconsistent behavior between local builds (with dtc installed) and PPA builds (without dtc).
Changes:
1. Add qemu-system-data.install with PPC DTBs (bamboo, canyonlands)
2. Add qemu-system-misc.install with MicroBlaze DTBs (petalogix-ml605, petalogix-s3adsp1800)
3. Update debian/rules install-misc paths to usr/share/qemu/dtb/
4. Patch pc-bios/dtb/meson.build to disable dtc detection, forcing use of pre-built files
5. Update debian/rules build-misc to copy pre-built DTBs instead of compiling
6. Uncomment sysdata-components += misc to enable manual DTB installation
This ensures both local (with dtc) and PPA (without dtc) builds behave identically by always using pre-built DTB files from pc-bios/dtb/.
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit bcf1711)
Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce debian/build-deb-from-git.sh to automate the creation of the orig tarball from the Git repository for packaging purposes. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 0104065) [jan: change reference of 10.1.0 to 11.0.0, update error message to suggest cleaning the main repo in addition to submodules, add missing Python wheels (wheel, setuptools, pip) to orig tarball for offline builds] Signed-off-by: Jiandi An <jan@nvidia.com>
The --enable-avx2 option was removed in QEMU 10.1. This option no longer exists in the configure script, causing the xen build to fail with: ERROR: unknown option --enable-avx2 Ubuntu's QEMU 10.1 packaging (questing) also removed this option from the xen build configuration. This fixes the amd64 PPA build failure in the xen configuration stage. Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 11ec431) Signed-off-by: Jiandi An <jan@nvidia.com>
The microvm build target creates the binary directly in b/microvm/qemu-system-x86_64, not in a subdirectory. The install-microvm target was looking in the wrong location: Wrong: b/microvm/x86_64-softmmu/qemu-system-x86_64 Correct: b/microvm/qemu-system-x86_64 This matches Ubuntu's QEMU 10.1 packaging and fixes the amd64 build failure: cp: cannot stat 'b/microvm/x86_64-softmmu/qemu-system-x86_64': No such file or directory Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 93e625e) Signed-off-by: Jiandi An <jan@nvidia.com>
qemu-vmsr-helper is an x86 helper daemon that gives QEMU access to the host's RAPL energy MSRs so they can be exposed to KVM guests; it is not covered by the 8.2-based packaging. Since this is not needed for NVIDIA QEMU, we exclude it from the package by adding it to debian/not-installed. This fixes the dh_missing error: usr/bin/qemu-vmsr-helper exists in debian/tmp but is not installed to anywhere Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 3b4f945) Signed-off-by: Jiandi An <jan@nvidia.com>
Adds a GitHub Actions pipeline to automatically build a source package that can be signed and uploaded to an LP builder. Signed-off-by: Mitchell Augustin <mitchell.augustin@canonical.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 73ecd2d) Signed-off-by: Jiandi An <jan@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> (cherry picked from commit 5dcef2c) Signed-off-by: Jiandi An <jan@nvidia.com>
Update to QEMU v11.0.0 upstream with NVIDIA support Signed-off-by: Jiandi An <jan@nvidia.com>
d298f44 to
a85eb88
Compare
Description
This PR ports the NVIDIA QEMU virtualization feature stack to QEMU v11.0.0, rebasing all out-of-tree patches onto the new upstream release. QEMU v11.0.0 already includes vSMMUv3, DMABUF, vEVENTQ, hugepfnmap, and SMMUv3 AUTO properties, which significantly reduces the out-of-tree patch count compared to the nvidia_stable-10.1 branch. The remaining items are ported as cherry-picks or backports from upstream and the mailing list, plus Ubuntu Noble packaging for v11.0.
Key Features:
Source
Patch Breakdown (58 commits):
upstream/master (merged, ETA v11.1)
smmuv3_accel_init() Error* parameter refactor
upstream/master (prerequisite for item 3)
nvidia_stable-10.1)
nvidia_stable-10.1
nvidia_stable-10.1
Notes on items already in v11.0.0 (items 1-5 from porting plan):
The following series are included in the upstream v11.0.0 release and require no additional patches:
Notes on dirty tracking (item 1):
Cherry-picked from upstream commit 659275f84694, which is merged in upstream/master but did not make the v11.0.0 release (ETA v11.1).
Notes on SMMUv3 Resolve AUTO properties v3 (item 3):
7 patches backported from v3 posting. An additional prerequisite commit (4c7fefc2d0 by Philippe Mathieu-Daudé) was cherry-picked from upstream/master to provide the bool smmuv3_accel_init(Error **errp) signature that v3 depends on.
Conflict resolutions were required for patches 1-5 due to the v11.0.0 base having a #ifndef CONFIG_ARM_SMMUV3_ACCEL guard in smmu_validate_property() that the upstream v3 patches do not expect (removed by the Philippe prerequisite commit). Patch 6 required manual application of hw/core/machine.c changes to add the hw_compat_11_0 array, which does not exist on the v11.0.0 base.
Notes on Tegra241 CMDQV v4 (item 4):
31 patches backported from v4 posting. Conflict resolutions were required for 3 patches due to the Resolve AUTO properties v3 series changing the smmuv3_accel_init signature and adding smmuv3_machine_done(), which shifted context in hw/arm/smmuv3-accel.c.
Notes on packaging (item 7):
Ubuntu Noble debian packaging base copied from ubuntu/noble-updates (8.2.2+ds-0ubuntu1.16), then NVIDIA-specific packaging fixes cherry-picked from nvidia_stable-10.1. Two 10.1-specific commits were skipped as no-ops on 11.0:
1:10.1.0+nvidia*)
Lore Links:
Control dirty tracking for nesting parent HWPT:
https://lore.kernel.org/all/20260401084133.56266-1-skolothumtho@nvidia.com/
SMMUv3 Resolve AUTO properties v3 (7 patches):
https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/
Tegra241 CMDQV v4 (31 patches):
https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/
ACPI generic initiator insertion order v2:
https://lore.kernel.org/all/20260222091700-mutt-send-email-mst@kernel.org/
Philippe Mathieu-Daudé smmuv3_accel_init Error* parameter (prerequisite):
https://lore.kernel.org/qemu-devel/20260410200031.18572-2-philmd@linaro.org/
Upstream Status:
smmuv3_accel_init Error* refactor (Philippe)
Testing
noble)
Notes
nvidia-11.0-rebase-2026-05-06 based on tag v11.0.0
hw_compat_11_0 array was manually added to hw/core/machine.c and include/hw/core/boards.h as it does not exist on the v11.0.0 base (it is present on upstream/master for post-11.0 development)
nvidia_stable-10.1; no upstream path exists for these