This repository was archived by the owner on Apr 29, 2026. It is now read-only.
Commit 504ac72
Add CUDA bitset matrix support in pe_synth.h and pe_synth_cuda_u64_cones.cu
- Introduced CUDA helper functions for managing a row-major u64 matrix, enhancing performance for masked popcount and any checks in QM/Espresso scoring.
- Implemented RAII-style management for CUDA bitset matrices to ensure proper resource handling.
- Added CUDA kernel functions for row-wise popcount and any checks, optimizing bitset operations on the device.
- Updated existing logic to utilize the new CUDA capabilities, improving synthesis efficiency and flexibility.
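The diff itself is not reproduced here, so the following is only a minimal sketch of what the bullets above describe: a row-major matrix of 64-bit words held in a RAII-managed device allocation, and a kernel that computes a masked popcount and an "any bit set" flag per row. Every identifier (`DeviceBuffer`, `masked_row_popcount_any`, `RowScores`, `score_rows`) is hypothetical and not taken from `pe_synth.h`. The one-thread-per-row mapping and the single mask row shared by all rows are likewise assumptions.

```cuda
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical RAII wrapper: the device allocation is released when the
// object goes out of scope, including on exceptions.
template <typename T>
struct DeviceBuffer {
    T* ptr = nullptr;
    explicit DeviceBuffer(size_t n) {
        if (cudaMalloc(reinterpret_cast<void**>(&ptr), n * sizeof(T)) != cudaSuccess)
            throw std::runtime_error("cudaMalloc failed");
    }
    ~DeviceBuffer() { cudaFree(ptr); }
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;
};

static void check(cudaError_t err) {
    if (err != cudaSuccess) throw std::runtime_error(cudaGetErrorString(err));
}

// Assumed layout: word w of row r lives at matrix[r * words_per_row + w].
// One thread per row counts the bits of (row AND mask) and flags non-empty rows.
__global__ void masked_row_popcount_any(const uint64_t* matrix, const uint64_t* mask,
                                        size_t rows, size_t words_per_row,
                                        unsigned* counts, unsigned char* any) {
    size_t row = static_cast<size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (row >= rows) return;
    const uint64_t* r = matrix + row * words_per_row;
    unsigned count = 0;
    for (size_t w = 0; w < words_per_row; ++w)
        count += __popcll(r[w] & mask[w]);
    counts[row] = count;
    any[row] = (count != 0);
}

struct RowScores {
    std::vector<unsigned> popcount;
    std::vector<unsigned char> any;
};

// Host helper: uploads the matrix and mask, runs the kernel, downloads results.
RowScores score_rows(const std::vector<uint64_t>& matrix,
                     const std::vector<uint64_t>& mask,
                     size_t rows, size_t words_per_row) {
    if (matrix.size() != rows * words_per_row || mask.size() != words_per_row)
        throw std::invalid_argument("shape mismatch");
    RowScores out{std::vector<unsigned>(rows), std::vector<unsigned char>(rows)};
    if (rows == 0) return out;

    DeviceBuffer<uint64_t> d_matrix(matrix.size()), d_mask(mask.size());
    DeviceBuffer<unsigned> d_counts(rows);
    DeviceBuffer<unsigned char> d_any(rows);
    check(cudaMemcpy(d_matrix.ptr, matrix.data(), matrix.size() * sizeof(uint64_t),
                     cudaMemcpyHostToDevice));
    check(cudaMemcpy(d_mask.ptr, mask.data(), mask.size() * sizeof(uint64_t),
                     cudaMemcpyHostToDevice));

    const unsigned threads = 128;
    const unsigned blocks = static_cast<unsigned>((rows + threads - 1) / threads);
    masked_row_popcount_any<<<blocks, threads>>>(d_matrix.ptr, d_mask.ptr, rows,
                                                 words_per_row, d_counts.ptr, d_any.ptr);
    check(cudaGetLastError());
    // The blocking copies below also wait for the kernel to finish.
    check(cudaMemcpy(out.popcount.data(), d_counts.ptr, rows * sizeof(unsigned),
                     cudaMemcpyDeviceToHost));
    check(cudaMemcpy(out.any.data(), d_any.ptr, rows * sizeof(unsigned char),
                     cudaMemcpyDeviceToHost));
    return out;
}
```

A caller would pass the flattened matrix, a mask of `words_per_row` words, and the dimensions to `score_rows`, then read `popcount[i]` and `any[i]` for each row. One thread per row keeps the sketch short. With many words per row, a warp-per-row reduction or an early exit in a dedicated "any" kernel may be faster, but whether the commit does either is not stated in the message.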
1 parent 9aebe8f commit 504ac72
2 files changed
Lines changed: 759 additions & 26 deletions