Skip to content

accel/amdxdna: update aie4_msg_priv.h and callers for protocol major v6 fw#1343

Open
hlaccabu wants to merge 2 commits into
amd:mainfrom
hlaccabu:fw
Open

accel/amdxdna: update aie4_msg_priv.h and callers for protocol major v6 fw#1343
hlaccabu wants to merge 2 commits into
amd:mainfrom
hlaccabu:fw

Conversation

@hlaccabu
Copy link
Copy Markdown
Contributor

No description provided.

@hlaccabu hlaccabu requested a review from houlz0507 as a code owner May 20, 2026 20:05
Copilot AI review requested due to automatic review settings May 20, 2026 20:05
@hlaccabu hlaccabu requested review from maxzhen and xdavidz as code owners May 20, 2026 20:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the NPU3/AIE4 management and host-queue protocol definitions to align with protocol major v6 firmware, along with corresponding firmware metadata bumps.

Changes:

  • Bump NPU3 firmware/cert package versions and UMQ version references in tools/info.json.
  • Update UMQ host queue packet header layout in the shim and adjust debug dumping.
  • Update kernel AIE4 private message opcodes/structs and bump NPU3 protocol support to major v6.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tools/info.json Updates referenced firmware binaries and UMQ version hash.
src/shim/umq/hwq.cpp Adjusts UMQ queue dump output for updated common header fields.
src/shim/umq/host_queue.h Updates shim-side UMQ common header layout for the new protocol.
drivers/accel/amdxdna/npu3_regs.c Bumps supported protocol major/minor for NPU3 firmware feature gating.
drivers/accel/amdxdna/aie4_msg_priv.h Updates AIE4 private message opcodes and payload layouts for protocol v6.
drivers/accel/amdxdna/aie4_message.c Switches kernel message call sites to renamed/updated AIE4 protocol v6 messages.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread drivers/accel/amdxdna/npu3_regs.c Outdated
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
Comment thread src/shim/umq/hwq.cpp
Comment on lines 100 to 104
shim_debug("==========slot %u==========", i);
shim_debug("\ttype:\t\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.type));
shim_debug("\tbarrier:\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.barrier));
shim_debug("\tacquire:\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.acquire_fence_scope));
shim_debug("\trelease:\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.release_fence_scope));
shim_debug("\ttype:\t\t%u", pkt->xrt_header.common_header.type);
shim_debug("\topcode:\t\t%u", pkt->xrt_header.common_header.opcode);
shim_debug("\tcount:\t\t%u", pkt->xrt_header.common_header.count);
shim_debug("\tdistribute:\t%u", pkt->xrt_header.common_header.distribute);
maxzhen
maxzhen previously approved these changes May 20, 2026
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
Comment thread drivers/accel/amdxdna/aie4_message.c Outdated
Comment thread drivers/accel/amdxdna/aie4_msg_priv.h Outdated
{ .major = 6, .min_minor = 1 },
{ .features = BIT_U64(AIE4_GET_COREDUMP), .major = 6, .min_minor = 0 },
{ .features = BIT_U64(AIE4_RW_ACCESS), .major = 6, .min_minor = 0 },
{ .features = BIT_U64(AIE4_CALIBRATE_CLOCK), .major = 6, .min_minor = 0 },
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need change the version check?

Copy link
Copy Markdown
Contributor Author

@hlaccabu hlaccabu May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of those features would be enabled if their majors were left at 5, aie_check_protocol() requires exact major version match to enable features, not >=. Since they are all supported in v6, should I just remove these checks instead?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule is to add version feature check together as the code change for adding this feature.

I am wondering how those feautes (core_dump, rw_access and calibrate_clock) are tested for aie4? My best guess is that those are not enabled, then please remove those check now.

Only add those when the feature is introduced with the same PR. Do not mix them, it is very hard for upstreaming patch maintainence.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment on lines 196 to +202
ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret)
if (ret) {
XDNA_ERR(xdna, "Get coredump got status 0x%x", resp.status);

for (int i = 0; i < ARRAY_SIZE(resp.error_detail); i++) {
XDNA_ERR(xdna, "Error detail %d: 0x%08x", i, resp.error_detail[i]);
}
}
hlaccabu added 2 commits May 21, 2026 13:12
…v6 fw

Signed-off-by: Hayden Laccabue <hlaccabu@amd.com>
Signed-off-by: Hayden Laccabue <hlaccabu@amd.com>
/* Classic/PF/VF common */
AIE4_MSG_OP_IDENTIFY = 0x10002,
AIE4_MSG_OP_SUSPEND = 0x10003,
AIE4_MSG_OP_CALIBRATE_CLOCK = 0x10009,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not acceptable to me. Why remove old op codes?
I think the old op code remain there for older fw version.

For newer fw version, we should add feature table to support newer opcode.

I will take over this PR and drill into the details.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that this version of FW is not backward compatible with old ones. The firmware team wants to start from scratch. I think this is OK since there has been no official release of the FW yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants