hkv_table_->find_or_insert(num_keys, keys, values, nullptr, stream);
While adapting the code for Warp64 architecture chips, the find_ptr_or_insert CUDA kernel blocks severely, causing significant performance degradation and system instability.
In some cases, it causes the entire CUDA context to hang.
HugeCTR/sparse_operation_kit/kit_src/variable/impl/hkv_variable.cu
Line 427 in 2f2016f