-
Notifications
You must be signed in to change notification settings - Fork 102
[ENH] Refactor sparse.py type hierarchy #737
Copy link
Copy link
Open
Labels
Effort > Moderate 🐕Mid-sized tasks estimated to take a few days to a few weeks.Mid-sized tasks estimated to take a few days to a few weeks.Impact > Moderate 🔶User-visible but non-breaking change. Treated like a minor version bump (e.g., 0.6.5 → 0.7.0).User-visible but non-breaking change. Treated like a minor version bump (e.g., 0.6.5 → 0.7.0).Refactor 🔧Code level improvements that restructure existing logic without changing behavior.Code level improvements that restructure existing logic without changing behavior.
Metadata
Metadata
Assignees
Labels
Effort > Moderate 🐕Mid-sized tasks estimated to take a few days to a few weeks.Mid-sized tasks estimated to take a few days to a few weeks.Impact > Moderate 🔶User-visible but non-breaking change. Treated like a minor version bump (e.g., 0.6.5 → 0.7.0).User-visible but non-breaking change. Treated like a minor version bump (e.g., 0.6.5 → 0.7.0).Refactor 🔧Code level improvements that restructure existing logic without changing behavior.Code level improvements that restructure existing logic without changing behavior.
Description
I don't think we'll be able to cleanly solve #736 without reorganizing the type hierarchy of
chainladder/utils/sparse.py.What I had in mind was a dual typing protocol of
ArrayModule(kind of like an ArrayLibraryLike type) andBackendArray(like numpy's ArrayLike type, but with modifications for interop between the 4 current backend array types to match our use case).However, since the frequently-executed
xp = get_array_module(...)returns either a ModuleType (numpy, dask, cupy) or an array class (sparse.COO) it's not clear which one it'll be when it's called dynamically. That is, it's unknown whether the type hint should be:or
Furthermore, when the backend is sparse,
xpis also callable, but it's not in the case of the other backends. In order to use xp to create an array, you need to do it at different levels of hierarchy depending on the backend:A more consistent implementation of case sparse would be:
If we manage to get a general
ArrayModuleclass defined, we could do something like:where the type of array to be produced is automatically determined by
Triangle.array_backend. This would also avoid the various checks that are in place to make sure the backend is a certain type (usually sparse) before creating the array, and avoid calling the backend-specificndarrayandCOO.Is your feature request aligned with the scope of the package?
Describe the solution you'd like, or your current workaround.
I believe this can be improved by reorganizing the objects in
chainladder/utils/sparse.py. The librarysparsecan be augmented by assigning it the missing top-level functions that are either in numpy, or the ones that are currently defined in the file, whereas the augmented COO-level method calls can be mapped tosparse.COO.I was able to play around with this approach and get the tests to pass, I would like to double-check later and open a PR for feedback.
Do you have any additional supporting notes?
This might not be doable in a 100% clean way, I noticed some COO methods to be overridden by top-level numpy functions, but I do think getting most of the way there will solve #736 and greatly improve the goals of #486.