For example, when sampling an array `A` of size `(4, 4, 4)`, with `count` of size `(4, 4)` and `prob` of size `(4,)`, use an index `i` to select among `CartesianIndices((4, 4))` and an index `j` to select among `CartesianIndices((4,))`, and iterate over the remaining indices (in this case, just the third axis of `A`) within the kernel. This avoids recomputing the constants needed for the BTRS algorithm for every element of `A`.
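
The index split described above can be sketched on the CPU as follows. This is only an illustration, not the kernel in this PR: `fill_shared!`, `setup`, and `draw` are hypothetical names, `setup` stands in for whatever computes the BTRS constants from one `(count, prob)` pair, and `draw` stands in for the sampling step that consumes them. It also assumes that `prob` is indexed by the leading axes of `count`.

```julia
# Hypothetical CPU sketch of the index decomposition (not the GPU kernel).
# `setup(n, p)` is a placeholder for the BTRS constant computation and
# `draw(c)` is a placeholder for drawing one sample from those constants.
function fill_shared!(A, count, prob, setup, draw)
    R_count = CartesianIndices(count)                          # (4, 4) in the example
    R_rest = CartesianIndices(size(A)[ndims(count)+1:end])     # remaining axes of A
    for I in R_count
        # Assumption: the leading part of I indexes prob, (4,) in the example.
        J = CartesianIndex(Tuple(I)[1:ndims(prob)])
        c = setup(count[I], prob[J])                           # once per (count, prob) pair
        for K in R_rest
            A[I, K] = draw(c)                                  # constants reused along K
        end
    end
    return A
end
```

With the sizes from the example, `setup` runs 16 times instead of once for each of the 64 elements of `A`.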
This should probably only be merged if it is consistently faster; otherwise it could be exposed as a user option, `avoid_recomp`.