@pbalcer pbalcer commented Dec 5, 2025

L0v2 avoids tracking each kernel submission internally through an event for lifetime management. Instead, when a kernel is submitted to the queue, its handle is added to a vector and removed at the next queue synchronization point, urQueueFinish(). This is a much more efficient way to track kernels, since it avoids taking and storing an event. However, if the application never synchronizes the queue, the vector of submitted kernels grows without bound.

This patch forcibly synchronizes the queue once the submitted kernels vector reaches a threshold.
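
As a rough illustration of the idea (not the actual adapter code), the sketch below models a queue that records submitted kernel handles in a vector and forces a synchronization once the vector reaches a threshold. `SketchQueue`, `KernelHandle`, `enqueueKernel`, `finish`, and the threshold value are hypothetical names chosen for this example; `finish()` merely stands in for urQueueFinish().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel handle; not the real UR type.
using KernelHandle = int;

// Minimal model of a queue that bounds its submitted-kernels vector.
class SketchQueue {
public:
  explicit SketchQueue(std::size_t threshold) : threshold_(threshold) {}

  // Record the kernel so it stays alive until the next synchronization.
  void enqueueKernel(KernelHandle kernel) {
    submitted_.push_back(kernel);
    if (submitted_.size() >= threshold_) {
      finish(); // forced synchronization keeps the vector bounded
    }
  }

  // Stand-in for urQueueFinish(): a real queue would block here until all
  // submitted work completes, then release the tracked handles.
  void finish() {
    submitted_.clear();
    ++syncCount_;
  }

  std::size_t pending() const { return submitted_.size(); }
  std::size_t syncCount() const { return syncCount_; }

private:
  std::size_t threshold_;               // assumed tunable; value is illustrative
  std::vector<KernelHandle> submitted_; // handles awaiting synchronization
  std::size_t syncCount_ = 0;           // number of synchronizations so far
};
```

With a threshold of 4 in this model, an application that never calls `finish()` itself never has more than 3 handles pending after any `enqueueKernel()` call returns, at the cost of an occasional blocking synchronization.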

@pbalcer pbalcer requested a review from a team as a code owner December 5, 2025 21:39

pbalcer commented Dec 5, 2025

I've created this PR against 6.2 because this functionality changed a little in the latest version, which will require a different patch.

@nrspruit nrspruit left a comment

Looks good to me!

@pbalcer pbalcer force-pushed the cleanup-large-submitted-kernels branch from 753bc9e to f99e1cd Compare December 5, 2025 22:09
@pbalcer pbalcer marked this pull request as draft December 5, 2025 23:59
@pbalcer pbalcer force-pushed the cleanup-large-submitted-kernels branch from f99e1cd to 57aef87 Compare December 6, 2025 09:15
@pbalcer pbalcer force-pushed the cleanup-large-submitted-kernels branch from 57aef87 to 156235d Compare December 6, 2025 09:39
@pbalcer pbalcer deployed to WindowsCILock December 6, 2025 09:40 with GitHub Actions