Method to query if active storage reductions are available. #56

@davidhassell

When incorporating active storage into Dask, the Dask graph needs to be modified non-lazily to account for the fact that some of the work is being done externally to Dask, i.e. on the server where the data is [*].

If cf-python thinks that active storage operations are not possible, then it simply doesn't modify the graph. There are many reasons why active storage may be deemed not OK, such as the data having already been operated on (f += 2) or the chunks not pointing to files on disk, but the relevant one here is that the file resides at a location that doesn't support active reductions.

So, it would be great to have a method of Active that can tell us if a given file can be reduced actively, something like (notional API):

>>> a = Active('/path/to/file.nc')
>>> a.isactive()
True # or False

I imagine that this could entail some try ... except ... approach whereby we assume the file is active and send off some modified URI that returns True iff active reduction is possible.
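To make the try ... except ... idea concrete, here is a minimal sketch. Everything in it is hypothetical: `is_active`, `ActiveProbeError`, `fake_probe` and the example URI are illustrative names, not part of any existing Active API, and the probe is a local stand-in for whatever request would really be sent to the server.

```python
from typing import Callable


class ActiveProbeError(Exception):
    """Hypothetical error for a location that rejects an active request."""


def is_active(uri: str, probe: Callable[[str], bool]) -> bool:
    """Return True only if `probe` confirms active reductions for `uri`.

    `probe` stands in for the real request (e.g. some modified URI).
    Any failure is interpreted as "not active".
    """
    try:
        return bool(probe(uri))
    except (ActiveProbeError, OSError):
        return False


# Assumption: a made-up set of locations that support active reductions,
# used here only to simulate a server without any network access.
ACTIVE_LOCATIONS = {"https://active.example/file.nc"}


def fake_probe(uri: str) -> bool:
    """Simulated probe: succeed for known locations, raise otherwise."""
    if uri in ACTIVE_LOCATIONS:
        return True
    raise ActiveProbeError(f"no active storage at {uri}")


remote_ok = is_active("https://active.example/file.nc", fake_probe)
local_ok = is_active("/path/to/file.nc", fake_probe)
```

Treating every probe failure as False keeps the caller simple: cf-python would only need the boolean to decide whether to modify the graph.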

[*] (The detail of this is that the chunk reduction function used by dask.array.reduction needs to be changed from the usual function that expects to do some work (e.g. np.max) to the identity function that does no work (e.g. lambda x: x). This has to be done prior to the compute().)
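The swap described in [*] can be illustrated with a toy model. This is not Dask code: `chunked_reduction` is a hypothetical stand-in for the chunk-then-aggregate pattern of dask.array.reduction (whose real chunk functions are also passed `axis` and `keepdims` keyword arguments), and the numbers are invented.

```python
def chunked_reduction(chunks, chunk_func, aggregate):
    """Toy chunked reduction: apply chunk_func per chunk, then combine."""
    partials = [chunk_func(chunk) for chunk in chunks]
    return aggregate(partials)


def identity(x):
    """Do no work: the chunk has already been reduced elsewhere."""
    return x


# Usual case: each chunk holds raw data, so the chunk function does the work.
raw_chunks = [[1.0, 3.0], [7.0, 2.0], [5.0, 4.0]]
normal = chunked_reduction(raw_chunks, chunk_func=max, aggregate=max)

# Active case: the server has already returned one maximum per chunk,
# so the chunk function must be the identity; only aggregation remains.
server_partials = [3.0, 7.0, 5.0]
active = chunked_reduction(server_partials, chunk_func=identity, aggregate=max)
```

Both routes give the same result, which is why the chunk function has to be chosen before compute(): applying the usual reduction to already-reduced chunks would be redundant at best.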
