Improve insert_point#90
Conversation
|
Awesome job! Do you have any timing comparisons between the old version and this new one? Otherwise, the biggest issue is that that https://github.com/storpipfugl/pykdtree/blob/master/pykdtree/_kdtree_core.c.mako |
|
I'd recommend avoiding building using a mako file and instead relying on |
|
I have a lot of experience with cython and a decent amount with pyproject.toml. I think C++ templating is probably the way this project should go in the long run to keep the internal functions as small and low-level as possible. I've found that when putting nearly pure-C/C++ functions in Cython that they end up getting some "cruft" that you wouldn't otherwise get if you wrote it "by hand". The last time I played with this was probably 8 or so years ago though so I'm sure it is overall better now...but I also don't want to rewrite that entire C template file in Cython. And while pyproject.toml would be a good idea for defining the build environment I think with the Cython extensions we would at least need some minimal version of a setup.py. |
Good point regarding the "cruft." I've found it as well and I'll add it to the |
|
Whoa whoa whoa. Sorry. I didn't realize when I wrote my last comment that you rewrote the entire build system already. We really need to keep these PRs separate. I'm still not sold on completely rewriting this entire package just because. I know the modern standards have changed, but besides auto-running the Could you please return this PR back to the changes you made before, but only in the mako file? Optionally you could run the render script to generate the |
|
I'll totally revert. I see now how it's advantageous to use the mako file for templating. I do feel that it's best to use pyproject and a setup.py with Cython rather than shipping the autogenerated source from Cython, but I fully agree that it should be in a separate PR. |
|
Yes the distributed Cython C code is a problem. It used to be a suggestion by some in the past (maybe even Cython docs themselves) to avoid including a dependency on Cython during installation. That's changed in the last 5 or so years (pyproject.toml sandbox build environments help) as it now hinders downstream build systems (ex. conda-forge) from automatically getting the newest/best version of the Thanks for understanding. |
|
What are the chances you could revert your changes so we can get this merged? I'm working on #98 which includes some of the stuff we talked about (removing the Cython generated C file) but keeps the mako template and generated |
|
Why the close? |
Sorry, the PR was stale. If you're still interested I'll work on it this month if I can fit it into my schedule. |
|
Thanks. I think my main concern was that you just went a little too far on the refactor. If you revert a couple commits as discussed above (from what I remember) then it should be good to go. |
The old code in
insert_point_doublefunction uses a for loop that iterates through the elements fromk-1to 1 and then compares thecur_distwithclosest_dist[i - 1]. If the condition is true, it performs an assignment operation, which essentially shifts elements in the array. This means it will take a longer time when more elements satisfy this condition (i.e., when the list is less sorted). Moreover, the loop has multiple if-else branches, adding extra overhead.In this PR
insert_point_doublefunction first assigns thecur_distandpidxvalues to the last element in theclosest_distandclosest_idxarrays. It then utilizes a single for loop that iterates fromk-1to 1, comparingclosest_dist[j - 1]withclosest_dist[j]. If the condition is met, it performs a "bubbling" operation: swapping the elements at the current positionjand the position before itj-1, using temporary variablestmp_distandtmp_idx. This means that it only modifies the elements in the list if necessary, reducing the number of assignment operations required.In summary, the new code is more efficient because it reduces the number of assignment operations by using a "bubbling" approach instead of always shifting elements. Additionally, it simplifies the loop structure by avoiding multiple branches, resulting in improved performance. The time complexity is generally the same for both codes when considering the worst-case scenario (O(k)), but the new code performs better on average due to its more efficient implementation. The space complexity is constant for both implementations, but the new code utilizes fewer variables, making it slightly more efficient in terms of space usage.
Extras
This PR modifies several functions to become
inlineand makes several parametersconst.