[ENH] Superclass array modules #736

@genedan

Description

There are many places in Chainladder where we assign an array module via:

xp: ModuleType = obj.get_array_module()

The return value of get_array_module varies depending on the backend. It can be:

  • The cupy module (or numpy upon import failure)
  • A sparse COO type
  • The numpy module
  • The dask.array module (or numpy upon import failure)
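The "or numpy upon import failure" behaviour in the list above can be sketched roughly as follows. This is only an illustration, not Chainladder's actual code: `load_backend` is a hypothetical helper, and the module names in the call are stdlib stand-ins so the snippet runs without cupy or dask installed.

```python
import importlib
from types import ModuleType


def load_backend(preferred: str, fallback: str) -> ModuleType:
    """Import `preferred`, falling back to `fallback` if it is unavailable.

    Hypothetical helper, not part of Chainladder.
    """
    try:
        return importlib.import_module(preferred)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here as well.
        return importlib.import_module(fallback)


# In the setting described above, the call would presumably look like
# load_backend("cupy", "numpy"). The names below are stand-ins only.
xp = load_backend("a_module_that_is_not_installed", "math")
```

Note that both branches return a module, so this part of the dispatch is consistent with a `ModuleType` annotation.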

I have annotated the return type of get_array_module as ModuleType, which is problematic for the following reasons:

  • One of these (sparse COO) isn't ModuleType, it's just type
  • Even if we fix the return type of get_array_module to ModuleType | type, these are too general for tools like IDEs to inspect the methods that we call, such as:

exponent = xp.nan_to_num(exponent * (y * 0 + 1))

The IDE is unaware of which array module xp happens to be and thus cannot pull up any information on documentation, arguments, return types, etc.

xp is used as if it belonged to a general array backend class with a common set of methods such as nan_to_num, with specific implementations depending on whether it happens to be cupy, numpy, sparse, or dask. Since no such class is defined, coding tools get lost when trying to inspect it.
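The mismatch between `ModuleType` and `type` can be reproduced in a few lines. In this sketch `FakeCOO` is a hypothetical stand-in for a class-valued backend such as sparse COO, and the stdlib `math` module stands in for a module-valued backend such as numpy; neither is taken from Chainladder.

```python
import math
from types import ModuleType


class FakeCOO:
    """Hypothetical stand-in for a class-valued backend such as sparse COO."""


def describe(xp: object) -> str:
    """Report whether a backend object is a module, a class, or neither."""
    if isinstance(xp, ModuleType):
        return "module"
    if isinstance(xp, type):
        return "class"
    return "other"


module_kind = describe(math)     # a module-valued backend
class_kind = describe(FakeCOO)   # a class-valued backend
```

A class is not an instance of `ModuleType`, so a `ModuleType` annotation is wrong for the class-valued case, while `ModuleType | type` is accurate but says nothing about which attributes exist on `xp`.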

Is your feature request aligned with the scope of the package?

  • Yes, absolutely!
  • No, but it's still worth discussing.
  • N/A (this request is not a codebase enhancement).

Describe the solution you'd like, or your current workaround.

Declaring a class such as ArrayBackend with a common set of abstract methods (e.g., nan_to_num, ones, nanmean) would solve this and would also provide a unified return type for get_array_module.
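A minimal sketch of what this could look like is below, under several assumptions. The method signatures are illustrative only, and `ListBackend` is a toy stand-in built on plain Python lists so the example runs without numpy or sparse. A real implementation would presumably delegate each method to the underlying library, and `get_array_module` is shown here as a free function rather than the method it is in Chainladder.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod


class ArrayBackend(ABC):
    """Hypothetical common interface for the array backends."""

    @abstractmethod
    def ones(self, n: int) -> list[float]:
        """Return n ones."""

    @abstractmethod
    def nan_to_num(self, x: list[float]) -> list[float]:
        """Return a copy of x with NaN replaced by 0.0."""

    @abstractmethod
    def nanmean(self, x: list[float]) -> float:
        """Return the mean of x, ignoring NaN entries."""


class ListBackend(ArrayBackend):
    """Toy backend on plain lists (assumption: stands in for numpy etc.)."""

    def ones(self, n: int) -> list[float]:
        return [1.0] * n

    def nan_to_num(self, x: list[float]) -> list[float]:
        # Only NaN is handled here; infinities are left untouched.
        return [0.0 if math.isnan(v) else v for v in x]

    def nanmean(self, x: list[float]) -> float:
        kept = [v for v in x if not math.isnan(v)]
        return sum(kept) / len(kept) if kept else math.nan


def get_array_module() -> ArrayBackend:
    """One return type, regardless of which backend is selected."""
    return ListBackend()


xp = get_array_module()
cleaned = xp.nan_to_num([1.0, math.nan, 3.0])
```

Because `xp` is now typed as `ArrayBackend`, an IDE or type checker can resolve `xp.nan_to_num` to a concrete signature and docstring, whichever subclass is returned at runtime.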

Do you have any additional supporting notes?

This issue may become moot if we decide to create a new unified backend. There's been talk of adopting something like Arrow or Polars, but I don't recall whether it was going to be the single backend or just another option alongside numpy and sparse. As long as we are switching back and forth between numpy and sparse, though, a general backend type could solve the typing and inspection issues.

Metadata

Assignees

No one assigned

    Labels

    Effort > Serious 🐘: Large, complex tasks requiring a few weeks to months of work.
    Impact > Moderate 🔶: User-visible but non-breaking change. Treated like a minor version bump (e.g., 0.6.5 → 0.7.0).
