accel/amdxdna: update aie4_msg_priv.h and callers for protocol major v6 fw#1343
accel/amdxdna: update aie4_msg_priv.h and callers for protocol major v6 fw#1343hlaccabu wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Updates the NPU3/AIE4 management and host-queue protocol definitions to align with protocol major v6 firmware, along with corresponding firmware metadata bumps.
Changes:
- Bump NPU3 firmware/cert package versions and UMQ version references in
tools/info.json. - Update UMQ host queue packet header layout in the shim and adjust debug dumping.
- Update kernel AIE4 private message opcodes/structs and bump NPU3 protocol support to major v6.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/info.json | Updates referenced firmware binaries and UMQ version hash. |
| src/shim/umq/hwq.cpp | Adjusts UMQ queue dump output for updated common header fields. |
| src/shim/umq/host_queue.h | Updates shim-side UMQ common header layout for the new protocol. |
| drivers/accel/amdxdna/npu3_regs.c | Bumps supported protocol major/minor for NPU3 firmware feature gating. |
| drivers/accel/amdxdna/aie4_msg_priv.h | Updates AIE4 private message opcodes and payload layouts for protocol v6. |
| drivers/accel/amdxdna/aie4_message.c | Switches kernel message call sites to renamed/updated AIE4 protocol v6 messages. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| shim_debug("==========slot %u==========", i); | ||
| shim_debug("\ttype:\t\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.type)); | ||
| shim_debug("\tbarrier:\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.barrier)); | ||
| shim_debug("\tacquire:\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.acquire_fence_scope)); | ||
| shim_debug("\trelease:\t%u", static_cast<uint16_t>(pkt->xrt_header.common_header.release_fence_scope)); | ||
| shim_debug("\ttype:\t\t%u", pkt->xrt_header.common_header.type); | ||
| shim_debug("\topcode:\t\t%u", pkt->xrt_header.common_header.opcode); | ||
| shim_debug("\tcount:\t\t%u", pkt->xrt_header.common_header.count); | ||
| shim_debug("\tdistribute:\t%u", pkt->xrt_header.common_header.distribute); |
| { .major = 6, .min_minor = 1 }, | ||
| { .features = BIT_U64(AIE4_GET_COREDUMP), .major = 6, .min_minor = 0 }, | ||
| { .features = BIT_U64(AIE4_RW_ACCESS), .major = 6, .min_minor = 0 }, | ||
| { .features = BIT_U64(AIE4_CALIBRATE_CLOCK), .major = 6, .min_minor = 0 }, |
There was a problem hiding this comment.
Why do we need change the version check?
There was a problem hiding this comment.
None of those features would be enabled if their majors were left at 5, aie_check_protocol() requires exact major version match to enable features, not >=. Since they are all supported in v6, should I just remove these checks instead?
There was a problem hiding this comment.
The rule is to add version feature check together as the code change for adding this feature.
I am wondering how those feautes (core_dump, rw_access and calibrate_clock) are tested for aie4? My best guess is that those are not enabled, then please remove those check now.
Only add those when the feature is introduced with the same PR. Do not mix them, it is very hard for upstreaming patch maintainence.
| ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg); | ||
| if (ret) | ||
| if (ret) { | ||
| XDNA_ERR(xdna, "Get coredump got status 0x%x", resp.status); | ||
|
|
||
| for (int i = 0; i < ARRAY_SIZE(resp.error_detail); i++) { | ||
| XDNA_ERR(xdna, "Error detail %d: 0x%08x", i, resp.error_detail[i]); | ||
| } | ||
| } |
…v6 fw Signed-off-by: Hayden Laccabue <hlaccabu@amd.com>
Signed-off-by: Hayden Laccabue <hlaccabu@amd.com>
| /* Classic/PF/VF common */ | ||
| AIE4_MSG_OP_IDENTIFY = 0x10002, | ||
| AIE4_MSG_OP_SUSPEND = 0x10003, | ||
| AIE4_MSG_OP_CALIBRATE_CLOCK = 0x10009, |
There was a problem hiding this comment.
This is not acceptable to me. Why remove old op codes?
I think the old op code remain there for older fw version.
For newer fw version, we should add feature table to support newer opcode.
I will take over this PR and drill into the details.
There was a problem hiding this comment.
The idea is that this version of FW is not backward compatible with old ones. The firmware team wants to start from scratch. I think this is OK since there has been no official release of the FW yet.
No description provided.