Skip to content

[26.04_linux-nvidia-bos] BPMP ACPI + SoC Hub MBWT sysfs for T410/NVL72#421

Open
kobak2026 wants to merge 3 commits into
NVIDIA:26.04_linux-nvidia-bosfrom
kobak2026:bug-DGX-16100/bpmp-acpi-mbwt-v4-26.04-bos
Open

[26.04_linux-nvidia-bos] BPMP ACPI + SoC Hub MBWT sysfs for T410/NVL72#421
kobak2026 wants to merge 3 commits into
NVIDIA:26.04_linux-nvidia-bosfrom
kobak2026:bug-DGX-16100/bpmp-acpi-mbwt-v4-26.04-bos

Conversation

@kobak2026
Copy link
Copy Markdown
Collaborator

@kobak2026 kobak2026 commented May 14, 2026

Summary

Backport Aniruddha's V4 internal-review 3-patch BPMP ACPI + SoC Hub MBWT sysfs series to the 7.0-HWE / 26.04 BOS kernel branch. This is the V4 internal-mail design, not PR352's older export/unexport sysfs layout.

This is based on Aniruddha's V4 internal-review series; V5 is under review, so material upstream deltas will be handled as a follow-up refresh if needed.

V4 vs PR352

  • Sysfs path is now /sys/bus/platform/devices/NVDA3001:NN/mbwt_control/{pcie_instance_id,vc_type,bandwidth}.
  • bandwidth reads back a 3-tuple: <pcie_instance_id> <vc_type> <bandwidth>.
  • mbwt_control is guarded by BPMP QUERY_ABI support for both GET_BW and SET_BW.
  • No bpmp-tegra410.c is introduced; the BPMP ACPI path is integrated into the existing driver structure.
  • The old PR352 /sys/class/bpmp_mbwt/.../{export,unexport,mbwtN} model is intentionally not carried forward.

Commits

  • 8a4662e7423f2b1c4250d773c6e25a4b136418b7 - NVIDIA: VR: SAUCE: tegra: bpmp: Move channel, resource init to helper
  • fd9097f38b08fe890012e3e97120935dc174047f - NVIDIA: VR: SAUCE: tegra: bpmp: Add ACPI support
  • 758f258afbfa39891b95de08226702d7bcb89391 - NVIDIA: VR: SAUCE: tegra: bpmp: Add sysfs for memory bandwidth QoS

Validation

LTS runtime validation on comp016 was completed on the paired LTS branch:

  • Booted 7.0.0-sudeep-test-dgx16100-v4.

  • mbwt_control present under both NVDA3001:00 and NVDA3001:01 platform devices.

  • V4 sysfs unit test: PASS=16 FAIL=0 SKIP=1.

  • fio throttle sweep on /dev/nvme0n1, socket 00:

    • 100 GB/s baseline: READ: bw=13.3GiB/s (14.3GB/s).
    • 50 GB/s: READ: bw=13.8GiB/s (14.8GB/s).
    • 10 GB/s: READ: bw=9344MiB/s (9798MB/s).
    • 1 GB/s: READ: bw=972MiB/s (1019MB/s).
    • 100 GB/s recovery: READ: bw=13.8GiB/s (14.8GB/s).
  • These values match the PR352 throttle behavior: high limits do not cap read bandwidth, 10 GB/s caps near 10 GB/s, 1 GB/s caps near 1 GB/s, and recovery returns to uncapped bandwidth.

  • make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/build-dgx16100-bos olddefconfig

  • make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/build-dgx16100-bos -j$(nproc) drivers/firmware/tegra/

  • Result: exit 0.

  • Warning observed: security/apparmor/Kconfig:120:warning: multi-line strings not supported during olddefconfig.

  • The previous known tegra_bpmp_transfer_acpi frame-size warning did not appear in the final targeted build output.

Debugfs note: the first LTS runtime run observed the known duplicate BPMP debugfs warning (debugfs: 'bpmp' already exists in '/', tegra-bpmp NVDA3001:01: debugfs initialization failed: -12). The current commits include the per-device debugfs fix, build-verified.

kobak2026 added 3 commits May 15, 2026 00:32
Refactor the BPMP driver by moving channel initialization and Device
Tree resource parsing into separate helper functions. This prepares the
driver for ACPI support, where these helpers will be skipped because
channel initialization is handled by ACPI AML methods on ACPI-based
systems.

Signed-off-by: Aniruddha Rao <anrao@nvidia.com>

(backported from V4 internal mail <20260423140823.2848045-2-anrao@nvidia.com>)

Signed-off-by: Koba Ko <kobak@nvidia.com>
This patch adds required changes in the Tegra BPMP driver to make it
compatible with ACPI based platforms.
On ACPI systems, IPC is handled through the AML method instead of the
core kernel framework using Mailboxes and IVC.
Bypass clock, reset and powergate init calls as these are not controlled
by the Linux drivers on ACPI based systems.

Signed-off-by: Aniruddha Rao <anrao@nvidia.com>

(backported from V4 internal mail <20260423140823.2848045-3-anrao@nvidia.com>)
[kobak: Add !ACPI_HANDLE(bpmp->dev) guard around bpmp->soc->ops->init
 because ACPI match driver_data=0 makes bpmp->soc NULL; make BPMP
 debugfs directory per-device on ACPI systems to avoid duplicate
 /sys/kernel/debug/bpmp collision on dual NVDA3001 instances.]

Signed-off-by: Koba Ko <kobak@nvidia.com>
Tegra410 exposes memory bandwidth QoS for PCIe and GPU UPHY traffic on the
path to DRAM. Each bandwidth group can cap PCIe read, PCIe write, or
combined GPU UPHY read and write traffic, with target limits.

The memory bandwidth QoS is not exposed as ordinary host MMIO and
cannot be controlled from the kernel. The bandwidth limits can be programmed
by sending the corresponding requests (MBWT MRQ) to the BPMP.

On Tegra410, an ACPI-based platform, Linux BPMP driver does not use the
device-tree mailbox path for communicating with the BPMP firmware.
As a result, there is no existing client driver or interface that can be
used to send the memory bandwidth requests to the BPMP.

This patch exposes a sysfs directory mbwt_control on the tegra-bpmp platform
device with pcie_instance_id, vc_type, and bandwidth.
Writing bandwidth issues an MBWT_SET for the selected group (pcie_instance_id)
and traffic class (vc_type).
A read issues MBWT_GET and returns pcie_instance_id, vc_type, and the
bandwidth value reported by firmware.

These attributes are exposed only if MBWT QUERY probe reports both
MBWT_SET and MBWT_GET commands as supported.

ABI documented in Documentation/ABI/testing/sys-platform-tegra-bpmp

Signed-off-by: Aniruddha Rao <anrao@nvidia.com>

(backported from V4 internal mail <20260423140823.2848045-4-anrao@nvidia.com>)
[kobak: Keep functional MRQ_SOCHUB_MBWT ABI definitions and sysfs
 interface from V4; condense verbose per-field ABI comments while
 preserving enum/struct layout and Documentation/ABI coverage.]

Signed-off-by: Koba Ko <kobak@nvidia.com>
@kobak2026 kobak2026 requested review from clsotog, fyu1 and nvmochs May 14, 2026 17:03
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 14, 2026

PR Validation Report

Patchscan ✅ No Missing Fixes

All cherry-picked commits checked — no missing upstream fixes found.

PR Lint ❌ Errors found

Details
Checking 3 commits...

Cherry-pick digest:
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject                              │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 758f258afbfa │ [SAUCE] tegra: bpmp: add sysfs for memory bandwidth qos          │ N/A        │ N/A     │ anrao, kobak              │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ fd9097f38b08 │ [SAUCE] tegra: bpmp: add acpi support                            │ N/A        │ N/A     │ anrao, kobak              │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 8a4662e7423f │ [SAUCE] tegra: bpmp: move channel, resource init to helper       │ N/A        │ N/A     │ anrao, kobak              │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint: all checks passed.

PR metadata:
E: PR targets 26.04_linux-nvidia-bos but body has no https://bugs.launchpad.net/... link

@clsotog
Copy link
Copy Markdown
Collaborator

clsotog commented May 14, 2026

hey koba can you check this compile warnings
drivers/firmware/tegra/bpmp.c: In function ‘tegra_bpmp_transfer_acpi’:
drivers/firmware/tegra/bpmp.c:324:13: warning: unused variable ‘i’ [-Wunused-variable]
324 | u32 i;
| ^
drivers/firmware/tegra/bpmp.c:389:1: warning: the frame size of 4096 bytes is larger than 1024 bytes [-Wframe-larger-than=]
389 | }

@nirmoy
Copy link
Copy Markdown
Collaborator

nirmoy commented May 14, 2026

Boro review

Latest watcher review: open review

Head: 758f258afbfa

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.

@nvmochs
Copy link
Copy Markdown
Collaborator

nvmochs commented May 15, 2026

@kobak2026

Some findings from review with Codex:

  1. ACPI bpmp->soc NULL derefs
    In ACPI probe, device_get_match_data() returns NULL because the ACPI match entry uses .driver_data = 0 at drivers/firmware/tegra/bpmp.c:1038. Some init paths were guarded with !ACPI_HANDLE(), but not all later SoC-op
    uses were.

    Risky paths:

    • Probe failure jumps to deinit: and calls bpmp->soc->ops->deinit at drivers/firmware/tegra/bpmp.c:945.
    • Resume calls bpmp->soc->ops->resume at drivers/firmware/tegra/bpmp.c:966.

    Suggested fix: guard these like the init path, for example if (bpmp->soc && bpmp->soc->ops->deinit) and same for resume, or branch ACPI resume/deinit explicitly.

  2. Short ACPI BMRQ response can leak/corrupt data
    tegra_bpmp_transfer_acpi() validates that the firmware response is not too large, but it does not validate that it is large enough for the caller’s requested RX size.

    Problem sequence:

    • ACPI buffer length is read into rdata_len at drivers/firmware/tegra/bpmp.c:363.
    • The code checks only rbuf_len > sizeof(pkg) at drivers/firmware/tegra/bpmp.c:365.
    • It then copies msg->rx.size bytes from pkg.data at drivers/firmware/tegra/bpmp.c:383.

    If AML returns fewer bytes than msg->rx.size, the tail of pkg.data was never initialized by acpi_extract_package(). Suggested fix: before copying, reject rdata_len < msg->rx.size.

  3. MBWT sysfs write ignores BPMP command failure
    tegra_sochub_set_mbwt() only checks whether tegra_bpmp_transfer() itself failed at drivers/firmware/tegra/bpmp-tegra-sysfs.c:94. It never checks msg.rx.ret.

    That means firmware can reject CMD_SOCHUB_MBWT_SET_BW, but the sysfs write still returns count, so userspace sees success. This is especially relevant because ABI text says firmware validates instance, vc_type, and
    bandwidth. Suggested fix: mirror the GET path and return an error if msg.rx.ret < 0.

  4. Large stack frame in ACPI transfer
    tegra_bpmp_transfer_acpi() declares struct tegra_bpmp_acpi_message pkg on the stack at drivers/firmware/tegra/bpmp.c:321. That struct contains a 3960-byte data buffer at drivers/firmware/tegra/bpmp-private.h:29.

    This will likely trigger frame-size warnings on configs with CONFIG_FRAME_WARN=1024 or 2048, and it is also undesirable in a common transfer path. Suggested fix: avoid acpi_extract_package() into that large fixed
    struct, or allocate the temporary extraction buffer with kmalloc() sized from rbuf_len.


As a side note, didn't we address some of these same issues last time in the version we took into the v6.17 kernel? We need to understand why Aniruddha is not carrying these fixes forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants