Commit 3191462

vulkan: improve partial offloading performance on AMD (ggml-org#19976)
* vulkan: fix and enable cpy_tensor_async function
* use transfer_queue for async transfers on AMD, synchronize with timeline semaphore
* update offload_op logic
* fix missing transfer submission
* disable async transfer queue on AMD GCN
* revert op batch size change
* fix cpy_tensor_async checks
1 parent 66d65ec commit 3191462

1 file changed

Lines changed: 177 additions & 86 deletions
