
Rebase NVIDIA QEMU Virtualization Features onto v11.0.0#16

Open
JiandiAnNVIDIA wants to merge 58 commits into NVIDIA:nvidia_stable-11.0 from JiandiAnNVIDIA:nvidia_stable-11.0-virt


@JiandiAnNVIDIA
Collaborator

Description

This PR ports the NVIDIA QEMU virtualization feature stack to QEMU v11.0.0, rebasing all out-of-tree patches onto the new upstream release. QEMU v11.0.0 already includes vSMMUv3, DMABUF, vEVENTQ, hugepfnmap, and SMMUv3 AUTO properties, which significantly reduces the out-of-tree patch count compared to the nvidia_stable-10.1 branch. The remaining items are ported as cherry-picks or backports from upstream and the mailing list, plus Ubuntu Noble packaging for v11.0.

Key Features:

  • QEMU v11.0.0 base with vSMMUv3 v9, DMABUF v4, vEVENTQ v8, hugepfnmap v4, SMMUv3 AUTO properties v5
  • Dirty tracking control for nesting parent HWPT (merged upstream, missed v11.0 cut)
  • SMMUv3 Resolve AUTO properties v3 — resolves AUTO values for ats, ril, ssidsize, oas at device attach time
  • Tegra241 CMDQV support — full CMDQ-Virtualization backend for accelerated SMMUv3, including VINTF mmap, VCMDQ emulation, and DSDT advertisement
  • ACPI generic initiator insertion order fix
  • EGM support for virtualization (GPU ECC memory, DSDT integration)
  • Ubuntu Noble debian packaging adapted for v11.0.0 builds

Source

Patch Breakdown (58 commits):

| # | Category | Count | Source |
|---|----------|-------|--------|
| 1 | Control dirty tracking for nesting parent HWPT | 1 | Cherry-pick from upstream/master (merged, ETA v11.1) |
| 2 | smmuv3_accel_init() Error* parameter refactor | 1 | Cherry-pick from upstream/master (prerequisite for item 3) |
| 3 | SMMUv3 Resolve AUTO properties v3 | 7 | Backport from mailing list (v3 posted) |
| 4 | Tegra241 CMDQV support v4 | 31 | Backport from mailing list (v4 posted) |
| 5 | ACPI generic initiator insertion order v2 | 1 | Backport from mailing list (v2 posted) |
| 6 | EGM support (pick from nvidia_stable-10.1) | 6 | Cherry-pick from nvidia_stable-10.1 |
| 7 | Ubuntu Noble debian packaging | 10 | Noble-updates base + cherry-picks from nvidia_stable-10.1 |
| 8 | Version bump | 1 | New |
| | TOTAL | 58 | |

Notes on items already in v11.0.0 (items 1-5 from porting plan):

The following series are included in the upstream v11.0.0 release and require no additional patches:

| Series | Upstream Status |
|--------|-----------------|
| vSMMUv3 v9 (37 patches) | Merged into master, included in v11.0.0 |
| DMABUF v4 (3 patches) | Merged into master, included in v11.0.0 |
| vEVENTQ v8 (5 patches) | Merged for v11.0.0 |
| hugepfnmap v4 (3 patches) | Picked up by maintainer, included in v11.0.0 |
| SMMUv3 AUTO properties v5 (8 patches) | In upstream/master, included in v11.0.0 |

Notes on dirty tracking (item 1):

Cherry-picked from upstream commit 659275f84694:

  • Merged into upstream/master but did not make the v11.0.0 release (ETA v11.1)
  • Disables dirty tracking for nesting parent HWPT to avoid failures on platforms like SMMUv3 with HTTU

Notes on SMMUv3 Resolve AUTO properties v3 (item 3):

7 patches backported from v3 posting. An additional prerequisite commit (4c7fefc2d0 by Philippe Mathieu-Daudé) was cherry-picked from upstream/master to provide the bool smmuv3_accel_init(Error **errp) signature that v3 depends on.

Conflict resolutions were required for patches 1-5 due to the v11.0.0 base having a #ifndef CONFIG_ARM_SMMUV3_ACCEL guard in smmu_validate_property() that the upstream v3 patches do not expect (removed by the Philippe prerequisite commit). Patch 6 required manual application of hw/core/machine.c changes to add the hw_compat_11_0 array, which does not exist on the v11.0.0 base.

Notes on Tegra241 CMDQV v4 (item 4):

31 patches backported from v4 posting. Conflict resolutions were required for 3 patches due to the Resolve AUTO properties v3 series changing the smmuv3_accel_init signature and adding smmuv3_machine_done(), which shifted context in hw/arm/smmuv3-accel.c.

Notes on packaging (item 7):

The Ubuntu Noble debian packaging base was copied from ubuntu/noble-updates (8.2.2+ds-0ubuntu1.16), and NVIDIA-specific packaging fixes were then cherry-picked from nvidia_stable-10.1. Two 10.1-specific commits were skipped as no-ops on 11.0:

  • "Update microvm-devices.mak for QEMU 10.1 compatibility" (already fixed in 11.0 noble-updates base)
  • Version bump commits (1:10.1.0+nvidia*)

Lore Links:

Upstream Status:

| Series | Status |
|--------|--------|
| vSMMUv3 v9, DMABUF v4, vEVENTQ v8, hugepfnmap v4, AUTO properties v5 | ✅ In v11.0.0 |
| Control dirty tracking | ✅ Merged in master (ETA v11.1) |
| smmuv3_accel_init Error* refactor (Philippe) | ✅ Merged in master (ETA v11.1) |
| SMMUv3 Resolve AUTO properties v3 | ⏳ v3 posted, under review |
| Tegra241 CMDQV v4 | ⏳ v4 posted, under review |
| ACPI GI insertion order v2 | ⏳ v2 posted, under review |
| EGM support | ❌ Out-of-tree (NVIDIA SAUCE) |

Testing

  • ARM64 PPA build (noble)
  • Boot test on ARM64 system with SMMUv3 accel
  • VFIO device passthrough with CMDQV
  • EGM functionality validation
  • AUTO property resolution (ats, ril, ssidsize, oas)

Notes

  • Branch: nvidia-11.0-rebase-2026-05-06 based on tag v11.0.0
  • The hw_compat_11_0 array was manually added to hw/core/machine.c and include/hw/core/boards.h as it does not exist on the v11.0.0 base (it is present on upstream/master for post-11.0 development)
  • EGM patches are NVIDIA SAUCE carried from nvidia_stable-10.1; no upstream path exists for these

Shameer Kolothum and others added 30 commits May 7, 2026 03:00
QEMU smmuv3 accel does not support live migration yet, so dirty
tracking for the nesting parent HWPT is not useful.

Also, nested vIOMMU use cases can break on some platforms. For
example, SMMUv3 with HTTU may advertise dirty tracking capability,
but the kernel supports it only for stage-1. Requesting dirty
tracking for a nesting parent HWPT (stage-2) can fail.

Add a vIOMMU flag to explicitly request dirty tracking for the
nesting parent HWPT. For nested cases, dirty tracking is enabled
only when requested by the vIOMMU.

Non-nested cases and Intel vIOMMU keep the existing behavior.

Fixes: fc6dafb ("hw/arm/smmuv3: Implement get_viommu_cap() callback")
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/20260401084133.56266-1-skolothumtho@nvidia.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 659275f84694e7b06d67d877137905d371a3fde4)
Signed-off-by: Jiandi An <jan@nvidia.com>
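
The dirty tracking policy described in the commit message above can be summarized as a small predicate. This is only a minimal sketch: the struct and field names are hypothetical stand-ins, not QEMU's actual types, and it assumes a vIOMMU that wants the pre-existing behavior (such as Intel vIOMMU) simply sets the request flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs; these are not QEMU's real types or field names. */
struct hwpt_request {
    bool host_supports_dirty;  /* host IOMMU advertises dirty tracking */
    bool nested;               /* HWPT is a nesting parent (stage-2) */
    bool viommu_wants_dirty;   /* vIOMMU flag explicitly requesting it */
};

/* Returns true when dirty tracking should be requested for the HWPT. */
static bool want_dirty_tracking(const struct hwpt_request *req)
{
    assert(req != NULL);

    if (!req->host_supports_dirty) {
        return false;
    }
    /* Nested case: only enable when the vIOMMU asked for it. */
    if (req->nested) {
        return req->viommu_wants_dirty;
    }
    /* Non-nested case: keep the existing behavior. */
    return true;
}
```

With this shape, an accelerated SMMUv3 that never sets the flag avoids the stage-2 dirty tracking request that can fail on hosts with HTTU.
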
By giving smmuv3_accel_init() the ability to populate an error,
we can fail early in smmu_realize() when CONFIG_ARM_SMMUV3_ACCEL
is not available, simplifying smmu_validate_property().

Suggested-by: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
Co-developed-by: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-Id: <20260410200031.18572-2-philmd@linaro.org>
(cherry picked from commit 4c7fefc2d043a66f799cf2c2e34ed680b1b44b5c)
…ameters

Introduce smmuv3_accel_auto_finalise() to resolve properties that are
set to 'auto' for accelerated SMMUv3. This helper function allows
properties such as ats, ril, ssidsize, and oas support to be resolved
from host IOMMU capabilities via IOMMU_GET_HW_INFO.

The later commits in this series set the auto_mode flag to true when
an accel SMMUv3 property value is explicitly set to 'auto', or if the
property value is not set and defaults to auto mode.

Setting these property values to 'auto' requires at least one
cold-plugged device to retrieve and finalise these properties. If the
auto_mode flag is true, register a machine_init_done notifier to
verify this requirement and fail boot if it is not met.

Hot-plugged devices into an accel SMMUv3-associated bus will re-use
the resolved host values from the initial cold-plug.

Subsequent patches will make use of this helper to resolve 'auto' to
what is reported by host IOMMU capabilities.

Suggested-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
(backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Allow accelerated SMMUv3 Address Translation Services support property
to be derived from host IOMMU capabilities. Derive host values using
IOMMU_GET_HW_INFO, retrieving ATS capability from IDR0.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
(backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Allow accelerated SMMUv3 Range Invalidation support property to be
derived from host IOMMU capabilities. Derive host values using
IOMMU_GET_HW_INFO, retrieving RIL capability from IDR3.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
(backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ize"

Allow accelerated SMMUv3 SSID size property to be derived from host
IOMMU capabilities. Derive host values using IOMMU_GET_HW_INFO,
retrieving SSID size from IDR1. When the auto SSID size is resolved
to a non-zero value, PASID capability is advertised to the vIOMMU
and accelerated use cases such as Shared Virtual Addressing (SVA)
are supported.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
(backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Allow accelerated SMMUv3 OAS property to be derived from host IOMMU
capabilities. Derive host values using IOMMU_GET_HW_INFO, retrieving
OAS from IDR5.

This keeps the OAS value advertised by the virtual SMMU compatible with
the capabilities of the host SMMUv3, so that the intermediate physical
addresses (IPA) consumed by host SMMU for stage-2 translation do not
exceed the host's max supported IPA size.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
(backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
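
The four commit messages above each derive one property from a host ID register (ATS from IDR0, RIL from IDR3, SSID size from IDR1, OAS from IDR5). The following is a rough sketch of that extraction, not the code from the series. The bit positions are the ones I understand the Arm SMMUv3 architecture to define, and `struct resolved_props`, `field()` and `resolve_auto()` are hypothetical names introduced here. It assumes the host reports IDR0 to IDR5 as an array of six 32-bit values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit register value. */
static uint32_t field(uint32_t reg, unsigned hi, unsigned lo)
{
    uint32_t width, mask;

    assert(hi < 32 && lo <= hi);
    width = hi - lo + 1;
    mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (reg >> lo) & mask;
}

/* Hypothetical container for the values resolved from "auto". */
struct resolved_props {
    bool ats;          /* IDR0.ATS, bit 10 */
    bool ril;          /* IDR3.RIL, bit 10 */
    uint8_t ssidsize;  /* IDR1.SSIDSIZE, bits [10:6] */
    uint8_t oas;       /* IDR5.OAS, bits [2:0], still in encoded form */
};

/* idr[] holds the host's IDR0..IDR5 values (assumed layout). */
static struct resolved_props resolve_auto(const uint32_t idr[6])
{
    struct resolved_props p;

    assert(idr != NULL);
    p.ats = field(idr[0], 10, 10) != 0;
    p.ril = field(idr[3], 10, 10) != 0;
    p.ssidsize = (uint8_t)field(idr[1], 10, 6);
    p.oas = (uint8_t)field(idr[5], 2, 0);
    return p;
}
```

A non-zero `ssidsize` is what would lead to PASID capability being advertised, as the SSID size commit message notes.
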
…auto

Set the default value of ATS, RIL, SSIDSIZE, and OAS to auto, in order
to match the host IOMMU properties when accel=on.

If accel=off and these property values are set to auto, the default
property values defined in smmuv3_init_id_regs() for OAS and RIL will
remain unchanged, while SSIDSIZE and ATS values will remain initialized
at 0.

Introduce a new compat for the changed defaults.

Signed-off-by: Nathan Chen <nathanc@nvidia.com>
(backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/)
[jan: manually apply machine.c changes — hw_compat_11_0 array does not exist on v11.0.0 base; added array, include, and boards.h declaration]
Signed-off-by: Jiandi An <jan@nvidia.com>
…rties

Update documentation now that "auto" is supported for accelerated SMMUv3
properties.

Signed-off-by: Nathan Chen <nathanc@nvidia.com>
(backported from https://lore.kernel.org/all/20260512193520.3109172-1-nathanc@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
The updated IOMMUFD uAPI introduces the ability for userspace to request
a specific hardware info data type via IOMMU_GET_HW_INFO. Update
iommufd_backend_get_device_info() to set IOMMU_HW_INFO_FLAG_INPUT_TYPE
when a non-zero type is supplied, and adjust all callers to pass a type
value explicitly initialised to zero (IOMMU_HW_INFO_TYPE_DEFAULT) when
no specific type is requested.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
[jan: resolve conflict due to 659275f846 (Control dirty tracking for nesting parent HWPT) adding viommu_nesting vars shifting context in iommufd_cdev_autodomains_get()]
Signed-off-by: Jiandi An <jan@nvidia.com>
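
The flag handling described in the commit message above can be illustrated as follows. This is a sketch under stated assumptions: `struct hw_info_req` and the two macros are simplified stand-ins defined here for illustration, not the kernel's iommufd uAPI definitions, and `prepare_hw_info()` is a hypothetical helper rather than the body of iommufd_backend_get_device_info().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for illustration only; the real values live in the uAPI header. */
#define HW_INFO_TYPE_DEFAULT     0u
#define HW_INFO_FLAG_INPUT_TYPE  (1u << 0)

/* Simplified request with only the fields this sketch needs. */
struct hw_info_req {
    uint32_t flags;
    uint32_t in_data_type;
};

/* Fill in the request; only a non-zero type sets the input-type flag. */
static void prepare_hw_info(struct hw_info_req *req, uint32_t type)
{
    assert(req != NULL);

    req->flags = 0;
    req->in_data_type = HW_INFO_TYPE_DEFAULT;
    if (type != HW_INFO_TYPE_DEFAULT) {
        req->flags |= HW_INFO_FLAG_INPUT_TYPE;
        req->in_data_type = type;
    }
}
```

Callers that do not care about a specific type pass zero and get the previous behavior.
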
…to allow user ptr

The updated IOMMUFD VIOMMU_ALLOC uAPI allows userspace to provide a data
buffer when creating a vIOMMU (e.g. for Tegra241 CMDQV). Extend
iommufd_backend_alloc_viommu() to pass a user pointer and size to the
kernel.

Update the caller accordingly.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ueue

Add a helper to allocate an iommufd backed HW queue for a vIOMMU.

While at it, define a struct IOMMUFDHWqueue for use by vendor
implementations.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add a backend helper to mmap hardware MMIO regions exposed via iommufd for
a vIOMMU instance. This allows user space to access HW-accelerated MMIO
pages provided by the vIOMMU.

The caller is responsible for unmapping the returned region.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…UFDVeventq

The viommu field is assigned but never used. Callers freeing the
veventq already have access to the IOMMUFDViommu object through other
references, so this field is redundant.

Removing it also simplifies upcoming changes where veventq is
allocated based on the viommu id before the IOMMUFDViommu object is
created (e.g. vendor CMDQV-based veventq allocation).

No functional change.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Command Queue Virtualization (CMDQV) is a hardware extension available
on certain platforms that allows the SMMUv3 command queue to be
virtualized and passed through to a VM, improving performance.

For example, NVIDIA Tegra241 implements CMDQV to support virtualization
of multiple command queues (VCMDQs).

The term CMDQV is used here generically to refer to any platform that
provides hardware support to virtualize the SMMUv3 command queue.

CMDQV support is a specialization of the IOMMUFD-backed accelerated
SMMUv3 path. Introduce an ops interface to factor out CMDQV-specific
probe, initialization, and vIOMMU allocation logic from the base
implementation. The ops pointer and associated state are stored in
the accelerated SMMUv3 state.

This provides an extensible design to support future vendor-specific
CMDQV implementations.

No functional change.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
[jan: resolve conflict due to 79fcbec (Introduce smmuv3 accel device) adding CONFIG_DEVICES include shifting context in smmuv3-accel.h]
Signed-off-by: Jiandi An <jan@nvidia.com>
…stub

Introduce a Tegra241 CMDQV backend that plugs into the SMMUv3 accelerated
CMDQV ops interface.

This patch wires up the Tegra241 CMDQV backend and provides a stub
implementation for CMDQV probe, initialization, vIOMMU allocation
and reset handling.

Functional CMDQV support is added in follow-up patches.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
[jan: resolve conflict due to 79fcbec (Introduce smmuv3 accel device) changing arm_common_ss to arm_ss for smmuv3 entries in hw/arm/meson.build]
Signed-off-by: Jiandi An <jan@nvidia.com>
Add support for selecting and initializing a CMDQV backend based on the
cmdqv OnOffAuto property.

If set to OFF, CMDQV is not used and the default IOMMUFD-backed allocation
path is taken.

If set to AUTO, QEMU attempts to probe a CMDQV backend during device setup.
If probing succeeds, the selected ops are stored in the accelerated SMMUv3
state and used. If probing fails, QEMU silently falls back to the default
path.

If set to ON, QEMU requires CMDQV support. Probing is performed during
setup and failure results in an error.

When a CMDQV backend is active, its callbacks are used for vIOMMU
allocation, free, and reset handling. Otherwise, the base implementation
is used.

The current implementation wires up the Tegra241 CMDQV backend through the
generic ops interface. Functional CMDQV behaviour is added in subsequent
patches.

No functional change.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
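
The OFF/AUTO/ON behavior described in the commit message above reduces to a small decision function. The sketch below uses a local enum as a stand-in for QEMU's OnOffAuto type and a boolean in place of the real probe call; `select_cmdqv()` and the result enum are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for QEMU's OnOffAuto property type. */
enum on_off_auto { PROP_AUTO, PROP_ON, PROP_OFF };

/* Outcome of backend selection (hypothetical). */
enum cmdqv_result {
    CMDQV_DISABLED,  /* use the default IOMMUFD-backed path */
    CMDQV_ENABLED,   /* use the probed CMDQV backend ops */
    CMDQV_ERROR      /* cmdqv=on was requested but probing failed */
};

/*
 * probe_ok stands in for the result of probing the host for CMDQV.
 * With OFF the probe result is not consulted at all.
 */
static enum cmdqv_result select_cmdqv(enum on_off_auto prop, bool probe_ok)
{
    assert(prop == PROP_AUTO || prop == PROP_ON || prop == PROP_OFF);

    if (prop == PROP_OFF) {
        return CMDQV_DISABLED;
    }
    if (probe_ok) {
        return CMDQV_ENABLED;
    }
    /* AUTO falls back silently; ON treats a failed probe as an error. */
    return prop == PROP_ON ? CMDQV_ERROR : CMDQV_DISABLED;
}
```
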
Introduce a GPtrArray in VirtMachineState to track all SMMUv3 devices
created on the virt machine, and use it when building the IORT table
instead of relying on object_child_foreach_recursive() walks of the
object tree.

This avoids recursive object traversal and provides a foundation for
subsequent patches that need direct access to SMMUv3 instances for
CMDQV-related handling.

No functional change. No bios-tables qtest failures observed.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Use IOMMU_GET_HW_INFO to query host support for Tegra241 CMDQV.

Validate the returned data type, version, and minimum number of vCMDQs and
SIDs per Tegra241 CMDQ Virtual Interface (VI). Fail the probe if the host
does not meet these requirements.

The QEMU model supports one Virtual Interface (VI) per VM with 2 vCMDQs and
16 SIDs per VI, so the probe ensures the host implementation is compatible
with these limits.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Tegra241 CMDQV extends SMMUv3 with support for virtual command queues
(VCMDQs) exposed via a CMDQV MMIO region. The CMDQV MMIO space is split
into 64KB pages:

0x00000  (CMDQ-V Config page)
0x10000  (CMDQ-V CMDQ Page0)
0x20000  (CMDQ-V CMDQ Page1)
0x30000  (Virtual Interface Page0)
0x40000  (Virtual Interface Page1)

This patch wires up the Tegra241 CMDQV init callback and allocates
vendor-specific CMDQV state. The state pointer is stored in
SMMUv3AccelState for use by subsequent CMDQV operations.

The CMDQV MMIO region and a dedicated IRQ line are registered with the
SMMUv3 device. The MMIO read/write handlers are currently stubs and will
be implemented in later patches.

The CMDQV interrupt is edge-triggered and indicates VCMDQ or VINTF
error conditions. This patch only registers the IRQ line. Interrupt
generation and propagation to the guest will be added in a subsequent
patch.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
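
The 64KB page layout listed in the commit message above lends itself to a simple offset decoder. The following is an illustrative sketch only; the enum and function names are hypothetical, and it assumes the region consists of exactly the five pages listed, so anything at or beyond 0x50000 is treated as invalid.

```c
#include <assert.h>
#include <stdint.h>

#define CMDQV_PAGE_SIZE  UINT64_C(0x10000)  /* 64KB per page */
#define CMDQV_NUM_PAGES  5

/* One entry per page, in MMIO order. */
enum cmdqv_page {
    PAGE_CONFIG,  /* 0x00000 CMDQ-V Config page */
    PAGE_CMDQ0,   /* 0x10000 CMDQ-V CMDQ Page0 */
    PAGE_CMDQ1,   /* 0x20000 CMDQ-V CMDQ Page1 */
    PAGE_VI0,     /* 0x30000 Virtual Interface Page0 */
    PAGE_VI1,     /* 0x40000 Virtual Interface Page1 */
    PAGE_INVALID
};

static_assert(PAGE_INVALID == CMDQV_NUM_PAGES, "enum must cover every page");

/* Map an offset within the CMDQV MMIO region to the page it falls in. */
static enum cmdqv_page cmdqv_page_of(uint64_t offset)
{
    uint64_t idx = offset / CMDQV_PAGE_SIZE;

    return idx < CMDQV_NUM_PAGES ? (enum cmdqv_page)idx : PAGE_INVALID;
}

/* Offset of the access relative to the start of its page. */
static uint64_t cmdqv_page_offset(uint64_t offset)
{
    return offset % CMDQV_PAGE_SIZE;
}
```

An MMIO handler could switch on `cmdqv_page_of()` and then index its per-page register state with `cmdqv_page_offset()`.
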
SMMUv3 devices with acceleration may enable CMDQV extensions
after device realize. In that case, additional MMIO regions and
IRQ lines may be registered but not yet mapped to the platform bus.

Ensure SMMUv3 device resources are linked to the platform bus
during machine_done().

This is safe to do unconditionally since the platform bus helpers
skip resources that are already mapped.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Replace the stub implementation with real vIOMMU allocation for
Tegra241 CMDQV.

Allocate a matching vEVENTQ together with the vIOMMU, since it is
specific to the Tegra241 CMDQV vIOMMU and used to receive CMDQV
events.

Free both objects on teardown.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Tegra241 CMDQV exposes control and status registers in the CMDQ-V
Config page (offset [0x0, 0x10000)) used to configure virtual command
queue allocation and interrupt behavior.

Add read/write emulation for the CMDQ-V Config region
([CMDQV_BASE, CMDQV_CMDQ_BASE)), backed by a simple register cache.
This includes CONFIG, PARAM, STATUS, VI error and interrupt maps, CMDQ
allocation map and the VINTF0 related registers defined in the CMDQ-V
Config space. Only VINTF0 is supported; VINTF1-63 are not.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Tegra241 CMDQV exposes per-VCMDQ register windows through two MMIO
apertures:

  CMDQV_CMDQ_BASE (0x10000/0x20000): VCMDQ Page0/Page1
  CMDQV_VI_CMDQ_BASE (0x30000/0x40000): VINTF VCMDQ Page0/Page1

VINTF Page0 (0x30000) and VCMDQ Page0 (0x10000) are hardware aliases
addressing the same underlying registers. Add read emulation for both
apertures, backed by a register cache. VINTF Page0 reads are translated
to their VCMDQ Page0 equivalent and served from the same cached state.

Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are wired up in a subsequent
patch, Page0 register reads will be served directly from the hardware
backed mmap'd page instead of the cache. Page1 registers are always
served from cache.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
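
The aliasing between the VINTF and VCMDQ apertures described above can be sketched as an offset translation in front of a single register cache. This is a simplified illustration with hypothetical names. It assumes the whole VINTF aperture maps onto the VCMDQ aperture by a fixed delta, which ignores any logical-to-physical VCMDQ index mapping the real implementation may perform.

```c
#include <assert.h>
#include <stdint.h>

#define CMDQ_PAGE0_BASE  UINT64_C(0x10000)  /* VCMDQ Page0 */
#define VI_PAGE0_BASE    UINT64_C(0x30000)  /* VINTF VCMDQ Page0 */
#define APERTURE_SIZE    UINT64_C(0x20000)  /* Page0 plus Page1 */

/* Single register cache shared by both apertures (32-bit registers). */
static uint32_t reg_cache[APERTURE_SIZE / 4];

/* Translate a VINTF aperture offset to its VCMDQ aperture equivalent. */
static uint64_t to_cmdq_offset(uint64_t offset)
{
    if (offset >= VI_PAGE0_BASE && offset < VI_PAGE0_BASE + APERTURE_SIZE) {
        return offset - VI_PAGE0_BASE + CMDQ_PAGE0_BASE;
    }
    return offset;  /* already a VCMDQ (or other) offset */
}

/* Read a 32-bit VCMDQ register through either aperture. */
static uint32_t vcmdq_read32(uint64_t offset)
{
    uint64_t off = to_cmdq_offset(offset);

    assert(off >= CMDQ_PAGE0_BASE && off < CMDQ_PAGE0_BASE + APERTURE_SIZE);
    assert((off & 3) == 0);  /* this sketch only handles aligned accesses */
    return reg_cache[(off - CMDQ_PAGE0_BASE) / 4];
}
```

Because both apertures resolve to the same cache slot, a value observed through 0x10000 + N is also observed through 0x30000 + N.
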
This is the write side counterpart of the VCMDQ read emulation. Add write
handling for both CMDQV_CMDQ_BASE and CMDQV_VI_CMDQ_BASE apertures using
the same index decoding and VINTF-to-VCMDQ translation logic as the read
path.

VINTF aperture writes are translated to their CMDQV_CMDQ_BASE equivalent
and update the same cached state. Page1 registers (BASE, CONS_INDX_BASE)
always update the cache. Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are
wired up in a subsequent patch, Page0 register writes will be forwarded
to the hardware-backed mmap'd page.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
The CMDQ-V CMDQ pages provide a VM wide view of all VCMDQs, while the
VINTF pages expose a logical view local to a given VINTF. Although real
hardware may support multiple VINTFs, the kernel currently exposes a
single VINTF per VM.

The kernel provides an mmap offset for the VINTF Page0 region during
vIOMMU allocation. However, the logical-to-physical association between
VCMDQs and a VINTF is only established after HW_QUEUE allocation. Prior
to that, the mapped Page0 does not back any real VCMDQ state.

When VINTF is enabled, mmap the kernel provided Page0 region and set
ENABLE_OK only if the mmap succeeds. Unmap it when VINTF is disabled.
This prepares the VINTF mapping in advance of subsequent patches that
add VCMDQ allocation support.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce address_space_is_ram(), a helper to determine whether
a guest physical address resolves to a RAM-backed MemoryRegion within
an AddressSpace.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ster programming

Add support for allocating IOMMUFD hardware queues when the guest
programs the VCMDQ BASE registers.

VCMDQ_EN is part of the VCMDQ_CONFIG register, which is accessed
through the VINTF Page0 region. A subsequent patch maps this region
directly into the guest address space, so QEMU does not trap writes
to VCMDQ_CONFIG.

Since VCMDQ_EN writes are not trapped, QEMU cannot allocate the
hardware queue based on that bit. Instead, allocate the IOMMUFD
hardware queue when the guest writes a VCMDQ BASE register with a
valid RAM-backed address and when CMDQV and VINTF are enabled.

If a hardware queue was previously allocated for the same VCMDQ,
free it before reallocation.

Writes with invalid addresses are ignored.

All allocated VCMDQs are freed when CMDQV or VINTF is disabled.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
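
The allocation rules in the commit message above (allocate on a valid BASE write while CMDQV and VINTF are enabled, free first if already allocated, ignore invalid addresses) can be modeled as a small state transition. This is a hedged sketch, not the patch's code: the struct, enum and function are hypothetical, `addr_is_ram` stands in for a check such as address_space_is_ram(), and the real IOMMUFD allocation and free calls are reduced to toggling a flag. It also assumes an ignored write leaves the cached BASE value untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VCMDQ state. */
struct vcmdq_state {
    bool cmdqv_enabled;
    bool vintf_enabled;
    bool hw_queue_allocated;
    uint64_t base;  /* cached BASE register value */
};

enum base_write_action {
    BASE_IGNORED,      /* address not RAM-backed */
    BASE_CACHED_ONLY,  /* cached, but CMDQV or VINTF is disabled */
    BASE_ALLOCATED,    /* first hardware queue allocation */
    BASE_REALLOCATED   /* previous queue freed, new one allocated */
};

static enum base_write_action
vcmdq_write_base(struct vcmdq_state *q, uint64_t base, bool addr_is_ram)
{
    assert(q != NULL);

    if (!addr_is_ram) {
        return BASE_IGNORED;
    }
    q->base = base;
    if (!q->cmdqv_enabled || !q->vintf_enabled) {
        return BASE_CACHED_ONLY;
    }
    if (q->hw_queue_allocated) {
        /* A real implementation would free the old hardware queue here. */
        return BASE_REALLOCATED;
    }
    q->hw_queue_allocated = true;
    return BASE_ALLOCATED;
}
```
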
… backing

Introduce tegra241_cmdqv_vintf_ptr() to route VCMDQ register accesses
through the mmap'd VINTF page0 backing once a hardware queue has been
allocated.

There are two QEMU trapped MMIO apertures for VCMDQ registers:

  - Direct VCMDQ aperture (offset 0x10000)
  - VINTF Page0 (offset 0x30000)

These are hardware aliases: they address the same underlying registers.
A subsequent patch maps the VINTF aperture as a guest-direct RAM region;
in this patch both remain QEMU-trapped.

VCMDQ register accesses operate in one of two mutually exclusive modes,
depending on whether a hardware queue (IOMMU_HW_QUEUE_ALLOC) has been
allocated for the VCMDQ:

Pre-alloc: vintf_ptr is NULL. Both apertures use QEMU's register
cache. Hardware is not yet engaged.

Post-alloc: vintf_ptr is valid. Both QEMU trapped apertures access
registers directly via the mmap'd vintf_page0 pointer, bypassing
the cache. Hardware is the single source of truth.

The pre-to-post-alloc transition is triggered by the BASE register write
that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware synchronisation
is needed at transition time. The hardware mandated init sequence requires
BASE to be written first; PROD_INDX, CONS_INDX and CONFIG.CMDQ_EN are
programmed only after BASE and are therefore always post-alloc.

Any pre-alloc writes to those registers update only the register cache,
which is discarded at the transition.

CMDQV acceleration only becomes active once the guest enables VINTF and
programs the VCMDQ BASE register. Until then, all VCMDQ accesses are
served from the emulated register cache with no real hardware command
processing. This matches the CMDQV hardware specification: if the logical
CMDQ index does not map to any allocated Virtual CMDQ, "the access is
dropped with no Fault/Interrupt".

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
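
The two mutually exclusive access modes described above come down to one pointer check per access. The sketch below is illustrative: `struct vcmdq_regs`, `reg_read()` and `reg_write()` are hypothetical, and a plain array stands in for the mmap'd VINTF Page0 backing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 16  /* arbitrary size for this sketch */

/* Hypothetical register state for one VCMDQ. */
struct vcmdq_regs {
    uint32_t cache[NUM_REGS];      /* emulated register cache */
    volatile uint32_t *vintf_ptr;  /* NULL until a hardware queue exists */
};

/* Pre-alloc: serve from the cache. Post-alloc: go to the backing page. */
static uint32_t reg_read(const struct vcmdq_regs *r, size_t idx)
{
    assert(r != NULL && idx < NUM_REGS);
    return r->vintf_ptr ? r->vintf_ptr[idx] : r->cache[idx];
}

static void reg_write(struct vcmdq_regs *r, size_t idx, uint32_t val)
{
    assert(r != NULL && idx < NUM_REGS);
    if (r->vintf_ptr) {
        r->vintf_ptr[idx] = val;  /* hardware is the source of truth */
    } else {
        r->cache[idx] = val;      /* not propagated at the transition */
    }
}
```

Note that nothing copies `cache` into the backing page when `vintf_ptr` becomes valid, matching the statement that no cache-to-hardware synchronisation is needed at the transition.
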
Some RAM device regions created with memory_region_init_ram_device_ptr()
are not intended to be P2P DMA targets.

The VFIO listener currently treats all RAM device regions as DMA
capable and attempts to map them into the IOMMU. For regions without
dma-buf backing this fails and prints warnings such as:

  IOMMU_IOAS_MAP failed: Bad address, PCI BAR?

Introduce a MemoryRegion flag (ram_device_skip_iommu_map) to mark RAM
device regions that should not be IOMMU mapped. When set, the VFIO
listener skips DMA mapping for that region.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
nicolinc and others added 28 commits May 14, 2026 03:58
… space

Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly
into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD
index updates.

After this patch, the two VCMDQ apertures use different access paths:
the direct aperture (0x10000) remains QEMU-trapped and writes via
vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
mapping. Both paths write to the same underlying vintf_page0 memory,
so no synchronisation between the apertures is needed.

The mapping is installed lazily on first successful VCMDQ hardware
queue allocation and removed when CMDQV or VINTF is disabled.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…tq read

Move the vEVENTQ read and validation logic into a common helper
smmuv3_accel_event_read_validate(). The helper performs the read(),
checks for overflow and short reads, validates the sequence number,
and updates the sequence state.

This helper can be reused for Tegra241 CMDQV vEVENTQ support in a
subsequent patch.

Error handling is slightly adjusted: instead of reporting errors
directly in the read handler, the helper now returns errors via
Error **. Sequence gaps are reported as warnings.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
[jan: resolve conflict due to v11.0.0 using inline stubs in smmuv3-accel.h instead of separate smmuv3-accel-stubs.c; added declaration and inline stub for smmuv3_accel_event_read_validate()]
Signed-off-by: Jiandi An <jan@nvidia.com>
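
The checks listed in the commit message above (overflow, short read, sequence validation, sequence state update) can be sketched as a pure function over the result of a read(). Everything here is hypothetical: the names are not those of smmuv3_accel_event_read_validate(), overflow is modeled as a hard error, and sequence numbers are assumed to wrap as plain 32-bit unsigned values, which may differ from the kernel's actual convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sequence tracking state. */
struct seq_state {
    uint32_t last_seq;
    bool seen_first;
};

enum read_status {
    READ_OK,
    READ_OK_WITH_GAP,  /* event is valid, but a warning is warranted */
    READ_SHORT,        /* fewer bytes than one full event */
    READ_OVERFLOW      /* queue reported lost events */
};

static enum read_status
validate_event(struct seq_state *st, size_t bytes_read, size_t expected,
               bool overflow_flag, uint32_t seq)
{
    bool gap;

    assert(st != NULL);

    if (overflow_flag) {
        return READ_OVERFLOW;
    }
    if (bytes_read < expected) {
        return READ_SHORT;
    }
    /* A gap exists when this is not the first event and seq is not last+1. */
    gap = st->seen_first && seq != (uint32_t)(st->last_seq + 1);
    st->last_seq = seq;
    st->seen_first = true;
    return gap ? READ_OK_WITH_GAP : READ_OK;
}
```

Returning a status instead of logging inside the helper mirrors the commit's move to reporting errors through the caller.
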
…QV errors

Install an event handler on the CMDQV vEVENTQ fd to read and propagate
host received CMDQV errors to the guest.

The handler runs in QEMU’s main loop, using a non-blocking fd registered
via qemu_set_fd_handler().

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce a reset handler for the Tegra241 CMDQV and initialize its
register state.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…nd page size

CMDQV HW reads guest queue memory at its host physical address, set up via
IOMMUFD. This requires the guest queue memory to be contiguous not only in
guest PA space but also in host PA space. With Tegra241 CMDQV enabled, we
must only advertise a CMDQS that the host can safely back with physically
contiguous memory. Allowing a queue larger than the host page size could
cause the hardware to DMA across page boundaries, leading to faults.

Walk the RAMBlock list to find the smallest memory-backend page size, then
limit IDR1.CMDQS so the guest cannot configure a command queue that exceeds
that contiguous backing. Fall back to the real host page size if no
memory-backend RAM blocks are found.
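
The limit can be worked through numerically. The sketch below assumes 16-byte SMMUv3 command entries and an architectural maximum CMDQS of 19 (log2 of the number of entries); the function names and the list-of-page-sizes input are hypothetical stand-ins for the RAMBlock walk.

```python
CMD_ENTRY_SIZE = 16   # assumed bytes per SMMUv3 command queue entry
ARCH_MAX_CMDQS = 19   # assumed architectural maximum for IDR1.CMDQS


def smallest_backend_page_size(backend_page_sizes, host_page_size):
    """Smallest memory-backend page size, else the real host page size."""
    return min(backend_page_sizes) if backend_page_sizes else host_page_size


def limit_cmdqs(page_size, current_cmdqs=ARCH_MAX_CMDQS):
    """Cap CMDQS so that the whole queue fits inside one backing page."""
    entries = page_size // CMD_ENTRY_SIZE
    fit = entries.bit_length() - 1  # floor(log2(entries))
    return min(current_cmdqs, fit)
```

Under these assumptions a 4 KiB backing page allows CMDQS = 8 (256 entries), a 64 KiB page allows 12, and a 2 MiB hugepage allows 17; anything larger is capped by the existing maximum.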

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add an "identifier" property to the SMMUv3 device and use it when
building the ACPI IORT SMMUv3 node Identifier field.

This avoids relying on device enumeration order and provides a stable
per-device identifier. A subsequent patch will use the same identifier
when generating the DSDT description for Tegra241 CMDQV, ensuring that
the IORT and DSDT entries refer to the same SMMUv3 instance.

The identifier is assigned at pre-plug time, accounting for the ITS Group
node that build_iort() places before SMMUv3 nodes in the IORT table, so
that identifiers are globally unique across all IORT nodes.

No functional change: IORT blob content for bios-tables qtest is identical
to before.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce a SMMUv3AccelCmdqvType enum and a helper to query the
CMDQV implementation type associated with an accelerated SMMUv3
instance.

A subsequent patch will use this helper when generating the
Tegra241 CMDQV DSDT.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
[jan: resolve conflict — keep bool smmuv3_accel_init(Error **errp) signature from cherry-picked 4c7fefc2d0; add smmuv3_accel_cmdqv_type() before it]
Signed-off-by: Jiandi An <jan@nvidia.com>
Add ACPI DSDT support for Tegra241 CMDQV when the SMMUv3 instance is
created with tegra241-cmdqv.

The SMMUv3 device identifier is used as the ACPI _UID. This matches
the Identifier field of the corresponding SMMUv3 IORT node, allowing
the CMDQV DSDT device to be correctly associated with its SMMU.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…MDQV is active

When CMDQV is active, the first cold-plugged VFIO device establishes the
viommu to host SMMUv3 association. Block its hot-unplug to preserve this
association and the guest's boot time CMDQV configuration.

Also abort at machine_done if cmdqv=on is requested but no cold-plugged
VFIO device was present to initialize it.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
[jan: resolve conflict — 950e2c91c8 moved smmuv3_machine_done to smmuv3-accel.c; added unplug_blocker field and cmdqv check in correct file]
Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce a "cmdqv" property to enable Tegra241 CMDQV support.
This is only enabled for accelerated SMMUv3 devices.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
(backported from https://lore.kernel.org/qemu-devel/20260415105552.622421-1-skolothumtho@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
During creation of the VM's SRAT table, the generic initiator entries
are added. Currently the order of the entries is not controllable from
the QEMU command line. This is because the code queries the object
tree, which may not return objects in the order they were inserted.

As a fix, the patch maintains a GPtrArray of generic initiator objects
that preserves their insertion order. Objects are automatically added
to the array when initialized and removed when finalized. When building
the SRAT table, objects are processed in the order they were first
inserted.

E.g. for the following QEMU command line:
...
            -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
            -object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
            -object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
            -object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
            -object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
            -object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
            -object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
            -object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
...

Original PXM in the VM SRAT table:
[1A4h 0420 004h]            Proximity Domain : 00000007
[1C4h 0452 004h]            Proximity Domain : 00000006
[1E4h 0484 004h]            Proximity Domain : 00000005
[204h 0516 004h]            Proximity Domain : 00000004
[224h 0548 004h]            Proximity Domain : 00000003
[244h 0580 004h]            Proximity Domain : 00000009
[264h 0612 004h]            Proximity Domain : 00000002
[284h 0644 004h]            Proximity Domain : 00000008
[2A2h 0674 004h]            Proximity Domain : 00000009

After the patch (preserves insertion order):
[1A4h 0420 004h]            Proximity Domain : 00000002
[1C4h 0452 004h]            Proximity Domain : 00000003
[1E4h 0484 004h]            Proximity Domain : 00000004
[204h 0516 004h]            Proximity Domain : 00000005
[224h 0548 004h]            Proximity Domain : 00000006
[244h 0580 004h]            Proximity Domain : 00000007
[264h 0612 004h]            Proximity Domain : 00000008
[284h 0644 004h]            Proximity Domain : 00000009
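
The bookkeeping described above can be sketched as a small registry. This is an illustration only: a Python list stands in for the GPtrArray, and the class and method names are hypothetical.

```python
class GenericInitiatorRegistry:
    """Tracks generic initiator objects in the order they were created."""

    def __init__(self):
        self._objects = []  # stands in for the GPtrArray

    def on_init(self, obj_id, node):
        # Called when an object is initialized: append keeps insertion order.
        self._objects.append((obj_id, node))

    def on_finalize(self, obj_id):
        # Called when an object is finalized: drop it, keep the others' order.
        self._objects = [o for o in self._objects if o[0] != obj_id]

    def srat_proximity_domains(self):
        # SRAT entries are emitted by walking the array front to back.
        return [node for _obj_id, node in self._objects]
```

With gi0 to gi7 registered for nodes 2 to 9, srat_proximity_domains() yields the nodes in command-line order regardless of how the object tree happens to enumerate its children.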

cc: Shameer Kolothum <skolothumtho@nvidia.com>
Fixes: 0a5b5ac ("hw/acpi: Implement the SRAT GI affinity structure")
(backported from https://lore.kernel.org/all/20260223112236.000065aa@huawei.com/)
[ankita: ML links to discussion and not patch as the original ML posting was lost]
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 70ac94e)
Signed-off-by: Jiandi An <jan@nvidia.com>
The Extended GPU Memory (EGM) feature [1] enables GPU access to
local or remote system memory across sockets and nodes. In
this mode, the physical memory can be allocated for GPU usage from
anywhere in a multi-node system. The feature is being extended to
virtualization.

The CPU node with the EGM is associated with the GPUs present
on the same socket in a way that the EGM node information such
as its base physical address, length and the proximity domain ID
is populated in the ACPI DSDT entries of those associated GPUs.
This information is needed by the NVIDIA driver in the VM to
discover its local EGM memory.

The CPU memory being utilized as EGM is exposed as a
memory-backend-file /dev/egmX backed by the nvgrace-egm module.

To link the GPU devices to the CPU EGM node, a new QOM object
acpi-egm-memory is introduced. This helps QEMU populate the DSDT
entries for the appropriate GPU device.

An admin can provide this association as follows. In the example,
the NUMA node 0 has the EGM memory created through the /dev/egm4
device. This node is linked with the dev0 GPU device using the
acpi-egm-memory object.

...
-numa node,memdev=m0,cpus=0-3,nodeid=0 \
-object memory-backend-file,id=m0,mem-path=/dev/egm4,size=84G,share=on,prealloc=on \
-device vfio-pci-nohotplug,host=0008:01:00.0,bus=pcie.0,rombar=0,id=dev0 \
-object acpi-egm-memory,id=egm0,pci-dev=dev0,node=0
...

Link: https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture-in-depth/#extended_gpu_memory [1]

Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
(cherry picked from commit e647ae7 https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 4954345)
[jan: resolve conflict in hw/acpi/meson.build — trivial context, added new EGM source files to list]
Signed-off-by: Jiandi An <jan@nvidia.com>
The QEMU code builds the ACPI DSDT for the VM devices. The
Extended GPU Memory (EGM) information such as physical address,
length and proximity domain ID is populated in the DSDT entries
of the GPU devices present in the same socket as the EGM memory.
This is used by the VM NVIDIA driver to determine the EGM properties.
The GPU device is linked with the EGM memory node through the
acpi-egm-memory object.

While building ACPI tables, go through all of the egm-memory objects.
Find the device and the EGM NUMA node association from the objects.
Patch the DSDT to create the GPU device entries and populate them with
the corresponding NUMA node properties.

Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
(cherry picked from commit a5eae1e https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit b62790f)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ror pages on EGM

The nvgrace-egm module exposes the list of pages with uncorrected ECC
errors (referred to as bad pages) in EGM memory through an
EGM_BAD_PAGES_LIST ioctl on the EGM char device.

Since these pages should not be accessed by the VM OS, they need to
be kept absent from the VM memory map. This is achieved by leveraging
the memory regions in the DTB.

Fetch the list of bad pages by calling the ioctl and sort it. The memory
regions are built using this list by terminating a memory region at
the physical address of a bad page. The next region is started from
the next page, essentially skipping over the bad page.

Also, as a minor code reorganization, move fdt_add_memory_node into
a wrapper that checks that the provided memory region length is
non-zero before adding it to the DTB.
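
The region-splitting step can be sketched as below. This is an illustration rather than the code in hw/arm/boot.c: it assumes page-aligned bad-page addresses, and the function names and the default page size are hypothetical.

```python
def add_region(regions, start, length):
    """Wrapper: record a region only if its length is non-zero."""
    if length > 0:
        regions.append((start, length))


def split_around_bad_pages(base, size, bad_pages, page_size=4096):
    """Return (start, length) regions covering [base, base + size)
    with every bad page left out."""
    regions = []
    start = base
    end = base + size
    for bad in sorted(set(bad_pages)):
        if bad < start or bad >= end:
            continue  # outside the range still being covered
        add_region(regions, start, bad - start)  # terminate at the bad page
        start = bad + page_size                  # resume at the next page
    add_region(regions, start, end - start)      # tail after the last bad page
    return regions
```

The zero-length check matters when a bad page is the first page of the range or directly follows another bad page: the region that would end there is empty and must not be emitted.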

Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
(cherry picked from commit dba314b https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 2dd5ad8)
[jan: resolve conflict — fdt_add_pmem_node added in v11.0.0 at same location as EGM's fdt_add_memory_node_wrapper; kept both]
Signed-off-by: Jiandi An <jan@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
(cherry picked from commit 3f183b4 https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 09d8e48)
Signed-off-by: Jiandi An <jan@nvidia.com>
The Extended GPU Memory (EGM) feature [1] enables GPU access to
local or remote system memory across sockets and nodes. In
this mode, the physical memory can be allocated for GPU usage from
anywhere in a multi-node system. The feature is being extended to
virtualization.

The EGM memory is exposed as a memory-backend-file backed by
the nvgrace-egm module. The EGM node information such as the
physical address, length and the proximity domain ID is
populated in the ACPI DSDT entries for the GPU devices present
on the same physical socket.

A new QOM object acpi-egm-memory is introduced to link the GPU
devices to the EGM node.

It is possible for the EGM memory to have memory pages with ECC
errors, and such list is fetched using an ioctl provided by the
nvgrace-egm module. VM DTB memory regions are built using the
list by skipping such pages. The VM OS is thus prevented from
using those pages.

Link: https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture-in-depth/#extended_gpu_memory [1]

Ankit Agrawal (3):
  qom: New object to associate device to EGM node
  hw/acpi: Populate DSD tables with EGM properties
  hw/arm/boot: Create DTB memory regions skipping ECC error pages on EGM

 hw/acpi/acpi_egm_memory.c         | 176 ++++++++++++++++++++++++++++++
 hw/acpi/meson.build               |   1 +
 hw/arm/boot.c                     | 129 ++++++++++++++++++++--
 hw/pci-host/gpex-acpi.c           |   5 +
 include/hw/acpi/acpi_egm_memory.h |  24 ++++
 linux-headers/linux/egm.h         |  20 ++++
 qapi/qom.json                     |  17 +++
 7 files changed, 363 insertions(+), 9 deletions(-)
 create mode 100644 hw/acpi/acpi_egm_memory.c
 create mode 100644 include/hw/acpi/acpi_egm_memory.h
 create mode 100644 linux-headers/linux/egm.h

--
2.34.1

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit fab2732 https://github.com/nvmochs/QEMU/tree/stable101_smmuv3-accel-07212025_egm)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 32db1b7)
Signed-off-by: Jiandi An <jan@nvidia.com>
There is an assert in build_append_nameseg() that limits the
length of ACPI name segments to 4 characters:

qemu-system-aarch64: hw/acpi/aml-build.c:202: build_append_nameseg: Assertion len <= ACPI_NAMESEG_LEN failed.

This assert can be hit when building the EGM DSDT entries because
the gpu_id can overflow a single character.

To avoid hitting this assert, always reset the gpu_id before building
the DSDT table.
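
The failure mode can be illustrated with a small sketch. Everything here is an assumption made for the example: the "GPU" prefix plus one hexadecimal digit naming scheme, a counter that persists across table builds, and all names in the block. The commit message only states that gpu_id can outgrow a single character and that it is now reset before each build.

```python
ACPI_NAMESEG_LEN = 4


def nameseg(prefix, gpu_id):
    """Build a name segment; reject anything longer than 4 characters."""
    name = f"{prefix}{gpu_id:X}"
    if len(name) > ACPI_NAMESEG_LEN:
        raise ValueError(f"name segment too long: {name}")
    return name


class DsdtBuilder:
    def __init__(self, reset_each_build):
        self.gpu_id = 0
        self.reset_each_build = reset_each_build

    def build(self, num_gpus):
        if self.reset_each_build:
            self.gpu_id = 0  # the fix: start numbering afresh on every build
        names = []
        for _ in range(num_gpus):
            names.append(nameseg("GPU", self.gpu_id))
            self.gpu_id += 1
        return names
```

In this sketch, a builder that never resets the counter fails on the third build of an 8-GPU table (the id reaches 0x10, giving a 5-character name), while the resetting builder produces the same names every time.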

Fixes: b62790f ("NVIDIA: SAUCE: hw/acpi: Populate DSDT with EGM properties")
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 338211a)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add debian folder from Ubuntu Noble QEMU (noble-updates branch):
https://git.launchpad.net/ubuntu/+source/qemu/log/?h=ubuntu/noble-updates

Tag: import/1:8.2.2+ds-0ubuntu1.16
HEAD commit: b6ab27191 1:8.2.2+ds-0ubuntu1.16 (patches unapplied)

Signed-off-by: Jiandi An <jan@nvidia.com>
debian/control:
debian/control-in:
 - Added dependency support for meson-1.5
 - Added NVIDIA as maintainer

debian/patches/*:
debian/patches/series:
 - Remove all Debian/Ubuntu patches

debian/rules:
 - Removed pvrdma and cris/nios2 architectures (removed since v8.2)
 - Disabled firmware builds

debian/qemu-system-common.install:
 - Remove obsolete files (removed since v8.2)
 - Added hw-uefi-vars.so (added since v8.2)

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 36fe037)
[jan: resolve conflict in debian/patches/series — 11.0 noble-updates base has different patches than 10.1; emptied series and deleted all patch files as intended]
Signed-off-by: Jiandi An <jan@nvidia.com>
QEMU is particularly challenging to package due to its multiple
subprojects that ship binary files.

To simplify and speed up maintenance, allow binaries to be included in
the source package.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 0b8d533)
Signed-off-by: Jiandi An <jan@nvidia.com>
This fixes a build failure in PPA environments where dh_missing reports
DTB files exist in debian/tmp but are not installed to any package.

QEMU 11.0's Meson build has a bug where it ignores the install_blobs=false
option when dtc is not found, causing DTB files to be auto-installed to
usr/share/qemu/dtb/. This creates inconsistent behavior between local builds
(with dtc installed) and PPA builds (without dtc).

Changes:
1. Add qemu-system-data.install with PPC DTBs (bamboo, canyonlands)
2. Add qemu-system-misc.install with MicroBlaze DTBs (petalogix-ml605, petalogix-s3adsp1800)
3. Update debian/rules install-misc paths to usr/share/qemu/dtb/
4. Patch pc-bios/dtb/meson.build to disable dtc detection, forcing use of pre-built files
5. Update debian/rules build-misc to copy pre-built DTBs instead of compiling
6. Uncomment sysdata-components += misc to enable manual DTB installation

This ensures both local (with dtc) and PPA (without dtc) builds behave
identically by always using pre-built DTB files from pc-bios/dtb/.

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit bcf1711)
Signed-off-by: Jiandi An <jan@nvidia.com>
Introduce debian/build-deb-from-git.sh to automate the creation of the
orig tarball from the Git repository for packaging purposes.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 0104065)
[jan: change reference of 10.1.0 to 11.0.0, update error message to suggest cleaning the main repo too in addition to just submodules, add missing Python wheels (wheel, setuptools, pip) to orig tarball for offline builds]
Signed-off-by: Jiandi An <jan@nvidia.com>
The --enable-avx2 option was removed in QEMU 10.1. This option no longer
exists in the configure script, causing the xen build to fail with:
  ERROR: unknown option --enable-avx2

Ubuntu's QEMU 10.1 packaging (questing) also removed this option from
the xen build configuration.

This fixes the amd64 PPA build failure in the xen configuration stage.

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 11ec431)
Signed-off-by: Jiandi An <jan@nvidia.com>
The microvm build target creates the binary directly in b/microvm/qemu-system-x86_64,
not in a subdirectory. The install-microvm target was looking in the wrong location:

  Wrong: b/microvm/x86_64-softmmu/qemu-system-x86_64
  Correct: b/microvm/qemu-system-x86_64

This matches Ubuntu's QEMU 10.1 packaging and fixes the amd64 build failure:
  cp: cannot stat 'b/microvm/x86_64-softmmu/qemu-system-x86_64': No such file or directory

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 93e625e)
Signed-off-by: Jiandi An <jan@nvidia.com>
qemu-vmsr-helper is QEMU's virtual RAPL MSR helper: a privileged helper that
reads host RAPL energy MSRs on behalf of x86 KVM guests. It was added upstream
after the v8.2 packaging base, so the Noble packaging has no install entry for it.

Since this is not needed for NVIDIA QEMU, we exclude it from the package
by adding it to debian/not-installed. This fixes the dh_missing error:
  usr/bin/qemu-vmsr-helper exists in debian/tmp but is not installed to anywhere

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 3b4f945)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add a GitHub Actions pipeline to automatically build a source
package that can be signed and uploaded to a Launchpad (LP) builder.

Signed-off-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 73ecd2d)
Signed-off-by: Jiandi An <jan@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
(cherry picked from commit 5dcef2c)
Signed-off-by: Jiandi An <jan@nvidia.com>
Update to QEMU v11.0.0 upstream with NVIDIA support

Signed-off-by: Jiandi An <jan@nvidia.com>
@JiandiAnNVIDIA JiandiAnNVIDIA force-pushed the nvidia_stable-11.0-virt branch from d298f44 to a85eb88 Compare May 16, 2026 06:53