Feature/umq fast path #137
Open
vkommine wants to merge 3 commits into
added 3 commits
May 18, 2026 22:18
- UAPI copy: CMDQ_INFO, UMQ_ENABLE, UMQ_DISABLE ioctls + mmap offsets
- vpu_command_queue.cpp: VPUDeviceQueueUMQ class
  - tryCreate(): CMDQ_INFO + UMQ_ENABLE + mmap ring + mmap doorbell
  - submitCommandBuffer(): ring write + doorbell MMIO + setUmqMode(true)
  - ringJob(): sfence-ordered ring buffer write + doorbell ring
  - checkReset(): reset_counter detection
- vpu_command_queue.hpp: VPUDeviceQueueUMQ declaration
- vpu_device.cpp: UMQ queue creation path
- vpu_driver_api.cpp/hpp: commandQueueGetInfo, commandQueueUmqEnable/Disable
- command_buffer.cpp: waitForCompletion() ULLS-Light pattern
  - UMQ mode: umonitor + umwait(C0.2, 16000 cycles) + yield between iters;
    matches Intel GPU compute-runtime WaitUtils (wait_util.h)
  - Non-UMQ mode: unchanged (busyWait 15ms cap + ioctl)
- command_buffer.hpp: setUmqMode(), umqMode flag
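
For reviewers, the UMQ-mode wait in waitForCompletion() (a short bounded wait, then a yield, repeated until the fence value arrives) can be sketched as below. This is a portable stand-in, not the PR's code: the umonitor + umwait(C0.2, 16000 cycles) step is replaced by a short bounded spin so the example runs on CPUs without WAITPKG. The names `boundedWaitStep` and `waitForFenceSketch`, the fence-as-counter model, and the timeout parameter are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// One bounded wait step (hypothetical helper).
// ASSUMPTION: on WAITPKG hardware this is where umonitor on the fence
// address followed by a bounded umwait would go. Here a short bounded
// spin stands in for that instruction pair.
inline void boundedWaitStep(const std::atomic<uint64_t> &fence, uint64_t expected) {
    for (int i = 0; i < 64; ++i) {
        if (fence.load(std::memory_order_acquire) >= expected) {
            return;
        }
    }
}

// Poll until the fence reaches `expected` or the timeout expires.
// Returns true when the fence value was observed, false on timeout.
inline bool waitForFenceSketch(const std::atomic<uint64_t> &fence,
                               uint64_t expected,
                               std::chrono::nanoseconds timeout) {
    assert(timeout.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;

    while (fence.load(std::memory_order_acquire) < expected) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        boundedWaitStep(fence, expected); // stand-in for umonitor + umwait
        std::this_thread::yield();        // yield between iterations
    }
    return true;
}
```

Each wait step is bounded so that a missed wake-up costs at most one iteration, and the yield between iterations gives other runnable threads a chance on the same core.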
Async benchmark (1M iter, cpu10 @ 2GHz, TILES=1):
BS4 nireq=1: -24% median latency, +30% FPS vs baseline
BS8 nireq=1: -22% median latency, +24% FPS vs baseline
BS10 nireq=1: -20% median latency, +19% FPS vs baseline
nireq=4: ~9% FPS improvement across all batch sizes
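
The submission fast path listed in this commit (ringJob(): ring write, store fence, doorbell write) can be illustrated with the minimal sketch below. It is not the PR's implementation: `RingSketch`, its 64-bit slot format, and the "doorbell = new tail" convention are assumptions, and plain memory stands in for the mmap'ed ring and doorbell. Off x86 the sfence is swapped for a release fence.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#endif

// Hypothetical ring layout; the real UMQ ring format is defined by KMD/FW.
struct RingSketch {
    std::vector<uint64_t> slots;    // stand-in for the mmap'ed ring buffer
    uint32_t tail = 0;              // next slot index, owned by the producer
    volatile uint32_t doorbell = 0; // stand-in for the mmap'ed doorbell register

    explicit RingSketch(uint32_t capacity) : slots(capacity, 0) {
        // Power-of-two capacity so the index can wrap with a mask.
        assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    }
};

// Order earlier stores (the ring entry) before later ones (the doorbell).
inline void storeFence() {
#if defined(__x86_64__) || defined(_M_X64)
    _mm_sfence();
#else
    std::atomic_thread_fence(std::memory_order_release);
#endif
}

// Write one job descriptor, fence, then ring the doorbell.
// Returns the new tail. A full-ring check is omitted in this sketch.
inline uint32_t ringJobSketch(RingSketch &ring, uint64_t jobDescriptor) {
    const uint32_t mask = static_cast<uint32_t>(ring.slots.size()) - 1;
    ring.slots[ring.tail & mask] = jobDescriptor; // 1. ring write
    ring.tail += 1;
    storeFence();                                 // 2. entry visible first
    ring.doorbell = ring.tail;                    // 3. doorbell write
    return ring.tail;
}
```

The fence sits between the entry write and the doorbell write so the consumer cannot observe the doorbell before the entry it announces.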
Phase 3 of persistent CmdQ PoC (PTL-SUT-0144 / NPU5010):
- ivpu_accel.h (uapi): add DRM_IVPU_CMDQ_FLAG_PERSISTENT 0x00000002u;
  add __u32 _pad field to drm_ivpu_cmdq_create for ABI alignment
- vpu_driver_api.hpp: extend commandQueueCreate() with isPersistent param
- vpu_driver_api.cpp: set DRM_IVPU_CMDQ_FLAG_PERSISTENT in createArgs.flags
  when isPersistent=true
- vpu_command_queue.cpp: pass isPersistent=umqCapability when creating the
  UMQ command queue so persistent flag is set on the hot path
The persistent flag instructs the KMD/FW to skip per-submission CmdQ setup,
eliminating ~8-10µs of FW FSM overhead per inference.
Latency results (PTL-SUT-0144, Mar5 RC FW, cpu10@2GHz, 50K iters):
  MLP15_b4: -8.40µs (-14.1%)
  MLP15_b8: -8.77µs (-14.5%)
  MLP15_b10: -9.83µs (-16.1%)
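
The flag plumbing described in this commit (set the persistent bit in the create arguments when isPersistent is true) might look roughly like the sketch below. `CmdqCreateArgsSketch` is a hypothetical local mirror, not the real drm_ivpu_cmdq_create layout, and `makeCreateArgsSketch` is an illustrative helper, not a function from the PR. Only the flag value 0x00000002u and the existence of a `_pad` field come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Flag value as stated in the commit message (DRM_IVPU_CMDQ_FLAG_PERSISTENT).
constexpr uint32_t kCmdqFlagPersistent = 0x00000002u;

// ASSUMPTION: hypothetical mirror of the create arguments. The field order
// and the fields other than flags/_pad are guesses for illustration only.
struct CmdqCreateArgsSketch {
    uint32_t cmdq_id = 0;  // filled in by the kernel in a real ioctl
    uint32_t priority = 0;
    uint32_t flags = 0;
    uint32_t _pad = 0;     // explicit padding; kept at zero here
};
static_assert(sizeof(CmdqCreateArgsSketch) == 16,
              "four 32-bit fields, no implicit padding");

// Build create arguments, setting the persistent bit only on request.
inline CmdqCreateArgsSketch makeCreateArgsSketch(uint32_t priority, bool isPersistent) {
    CmdqCreateArgsSketch args{};
    args.priority = priority;
    if (isPersistent) {
        args.flags |= kCmdqFlagPersistent;
    }
    assert(args._pad == 0); // padding stays zeroed in this sketch
    return args;
}
```

Using `|=` rather than assignment leaves room for other flag bits to be combined with the persistent bit.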
No description provided.