perf: welford's algorithm for mean-var aggregation#4147
Conversation
| out[cat, col] += data.data[j] | ||
|
|
||
|
|
||
| @njit |
There was a problem hiding this comment.
We should make these nogil or provide an option for fau to provide nogil njit
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4147 +/- ##
==========================================
- Coverage 79.61% 79.60% -0.02%
==========================================
Files 120 120
Lines 12786 12780 -6
==========================================
- Hits 10180 10173 -7
- Misses 2606 2607 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
@ilan-gold I reviewed since this plays into our other work
I don't see any issues with the welfords implementation, lgtm!
and overall I'm new with njit, OMP, TBB. But I see your point about nogil, and was able to create some scenarios that trigger that path. So agreed on the suggestion... I also found that it does also slightly improve the chan njit work when fau falls back on the serial build, and I guess it would generally fix any of these types of things across scanpy, so maybe that does belong in fau?
There's probably some threading stuff I'm not fully grasping with scanpy, fau, those libraries yet though; I will be more confident in that in a few weeks. I figure I will revisit threading and numba as a whole as part of building a good understanding of these libraries, and if I find any thread issues throughout scanpy/fau I will raise them at that point.
Yes the issue with
That would be amazing because it is something we have (clearly) struggled with |
From discussions with @zboldyga.
This should then in theory be reused with #4143 instead of its custom moments calculation