Conversation


@artem-lunarg artem-lunarg commented Jan 16, 2026

Instead, remove substates when the object is actually destructed (in the destructor).

This matches the pre-substate behavior, where a state object could retain some data after Destroy (for example, syncval uses this for error messages).

In the post-substate version, Destroy removes registered substates. In the case of syncval, this results in losing resource handle information during submit-time validation. Originally, the syncval command buffer state object stored the list of handles, which was referenced in case of an error (even if the command buffer had been destroyed).

Closes #11490

Instead, remove substates when the object is actually destructed
(in the destructor).

This matches the pre-substate behavior, where a state object could
retain some data after Destroy (for example, syncval uses this for
error messages). This assumes that there is a shared_ptr reference
that keeps the object alive.

In the post-substate version, Destroy removes registered substates,
and the above scenario does not work. In the case of syncval, this
results in losing resource handle information during submit-time
validation. Originally, the syncval command buffer state object stored
the list of handles, and QueueBatchContext held a shared_ptr to submitted
command buffers.
@ci-tester-lunarg (Collaborator)

CI Vulkan-ValidationLayers build queued with queue ID 624744.

@artem-lunarg (Contributor, Author)

Still trying to come up with a test that reproduces the issue, but it's not that easy... (tested directly on the app)

@ci-tester-lunarg (Collaborator)

CI Vulkan-ValidationLayers build # 22191 running.

@ci-tester-lunarg (Collaborator)

CI Vulkan-ValidationLayers build # 22191 passed.


artem-lunarg commented Jan 16, 2026

I definitely want to reproduce this with a test. I understand that the error happens due to non-synchronized writes from two queues (at least that's how VVL detects it), but the crash happens because the command buffer is deleted (the deletion looks correct; there is no in-use error). To delete a command buffer you have to wait for it, and waiting for the command buffer prevents the hazard... I have a reproducible gfxr capture, so I can figure this out sooner or later.

for (auto &item : sub_states_) {
    item.second->Destroy();
}
sub_states_.clear();
Contributor:
What about in Pipeline::Destroy()?

Contributor Author:
Please clarify

Contributor Author:
So currently we have a use case with command buffers; whether other state objects need this probably depends on their logic (command buffers were referenced via shared_ptr and used after deletion). This results in a hard crash, so if other objects have such scenarios it will be hard to miss.

Contributor:

Why do we now not clear here, but only for pipeline?

I can't tell if this is an issue due to just how Command Buffers (and queues) are the only state we have that we don't derive from CoreChecks.

Is it because these are dispatchable handles?

If we are clearing it in Pipelines, but not here, I'm curious what the "rule" to follow is.

Contributor Author:

Ideally we would have logic that doesn't need this, but syncval relied on this before, so this just restores the old behavior (the current behavior is a regression introduced by substates).


@artem-lunarg artem-lunarg Jan 16, 2026


> Why do we now not clear here, but only for pipeline?

Ah, I see, indeed. Not sure why we cleared only for command buffers and pipelines and not for others.

@artem-lunarg (Contributor, Author)

Reproduced the original crash with a test.

Now I need to figure out whether the scenario from the test is valid behavior (even after this fix). If it is valid behavior, then an additional false-positive issue needs to be fixed (possibly as a separate PR).


MennoVink commented Jan 20, 2026

vkQueueSubmit(): WRITE_RACING_WRITE hazard detected. vkCmdPipelineBarrier2KHR[Advect] (from VkCommandBuffer 0x2395bbd1720 submitted on the current VkQueue 0x2394c2385d0[Graphics Queue 0]) writes to VkImage 0x7820000000782, which was previously written during an image layout transition initiated by another vkCmdPipelineBarrier2KHR command (from VkCommandBuffer 0x23956d9f830 submitted on VkQueue 0x2394eb82320[Transfer Queue 0]). 
The current synchronization allows VK_ACCESS_2_TRANSFER_READ_BIT accesses at VK_PIPELINE_STAGE_2_COPY_BIT|VK_PIPELINE_STAGE_2_RESOLVE_BIT|VK_PIPELINE_STAGE_2_BLIT_BIT|VK_PIPELINE_STAGE_2_CONVERT_COOPERATIVE_VECTOR_MATRIX_BIT_NV, VK_ACCESS_2_INDIRECT_COMMAND_READ_BIT accesses at VK_PIPELINE_STAGE_2_COPY_INDIRECT_BIT_KHR, but the layout transition does not synchronize with these accesses.
Vulkan insight: If the layout transition is done via an image barrier, ensure srcStageMask and srcAccessMask synchronize with the accesses mentioned above. If the transition occurs as part of the render pass begin operation, consider specifying an external subpass dependency (VK_SUBPASS_EXTERNAL) with srcStageMask and srcAccessMask that synchronize with those accesses, or perform the transition in a separate image barrier before the render pass begins.
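The insight above amounts to widening the source scope of the transition barrier so it covers the prior transfer reads. A minimal sketch of such an image barrier (fragment only; image, layouts, and subresource range are placeholder assumptions, and it requires Vulkan 1.3 / VK_KHR_synchronization2 headers):

```cpp
#include <vulkan/vulkan.h>

// Fragment only: builds a layout-transition barrier whose source scope covers
// the copy/resolve/blit reads named in the report. Values are placeholders.
VkImageMemoryBarrier2 MakeTransitionBarrier(VkImage image) {
    VkImageMemoryBarrier2 barrier{};
    barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2;
    // Source scope: synchronize with the prior transfer-read accesses.
    barrier.srcStageMask = VK_PIPELINE_STAGE_2_COPY_BIT |
                           VK_PIPELINE_STAGE_2_RESOLVE_BIT |
                           VK_PIPELINE_STAGE_2_BLIT_BIT;
    barrier.srcAccessMask = VK_ACCESS_2_TRANSFER_READ_BIT;
    // Destination scope: the write that follows the transition (assumed).
    barrier.dstStageMask = VK_PIPELINE_STAGE_2_ALL_TRANSFER_BIT;
    barrier.dstAccessMask = VK_ACCESS_2_TRANSFER_WRITE_BIT;
    barrier.oldLayout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL;
    barrier.newLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
    barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.image = image;
    barrier.subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};
    return barrier;
}
```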

This is from a different example.

Is this the non-synchronized-writes issue you're talking about? Could this be because QFOT is not supported by syncval (#8112)?
It seems to validate against the transfer queue's release barrier. At the very least there is a semaphore, then the compute/graphics queue's acquire barrier, then a fence in between.
Perhaps this is just a textual error (it sees the release barrier on the transfer queue first and doesn't override it with the release barrier on the graphics queue). If so, then the question becomes whether we can still get WAW errors when we've fenced on the command buffer that waited on a semaphore from the command buffer that did the previous write.

The interesting part here is that my acquire barrier uses AllCommands + MemoryRead. I can imagine this could be specified somewhere as invalid, because I'm going to be writing to the image even though it hasn't been made ready for writing yet.


artem-lunarg commented Jan 20, 2026

@MennoVink I found the root cause of the issue; it's in syncval internals (we also have an internal test that reproduces the same behavior as the app). Now we need to fix one part of the validation. The crash itself is not the root cause (i.e., the missing null check at that point). Once the false positive is fixed, the crash will go away and there will be no false-positive error message either. I'm trying to come up with a solution this week.

P.S. It's not about QFOT; it's mostly about the internal mechanism for tracking accesses from different queues in general.



Development

Successfully merging this pull request may close these issues.

1.4.335.0 syncval crash

5 participants