New keyword parameter to cf.Field.regrids and cf.Field.regridc: max_masked #950

Open
davidhassell wants to merge 6 commits into NCAS-CMS:main from davidhassell:linear-weights-2

Conversation

@davidhassell

Collaborator

Fixes #949

All the excitement happens in cf/data/dask_regrid.py. The rest is just passing the new max_masked parameter about, docs and tests :)

@davidhassell davidhassell added this to the NEXTVERSION milestone May 29, 2026
@davidhassell davidhassell added the enhancement (New feature or request) and regridding (Relating to regridding operations) labels May 29, 2026

@sadielbartholomew sadielbartholomew left a comment

Member

Aside from some typos it is perfect, so please review those and then merge at will.

That said, I wanted to comment regarding API consistency - though it relates more to the issue at hand and your proposed way to handle it (i.e. #949) rather than the implementation of it here. Namely I notice we have the keyword 'mtol' for Field.collapse and the various convenience methods for collapses (mean, integral etc.) and that aligns very closely with the new max_masked keyword introduced, given its description:

Set the fraction of input data elements which is allowed to contain missing data when contributing to an individual output data element. Where this fraction exceeds mtol, missing data is returned. The default is 1, meaning that a missing datum in the output array occurs when its contributing input array elements are all missing data. A value of 0 means that a missing datum in the output array occurs whenever any of its contributing input array elements are missing data. Any intermediate value is permitted.
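To make the quoted mtol description concrete, here is a minimal sketch of the rule applied to a simple mean, with `None` standing in for missing data. The function name `mean_with_mtol` is hypothetical and this is not the cf-python implementation; it only encodes the three cases the description spells out (mtol of 1, mtol of 0, and intermediate values).

```python
from typing import Optional, Sequence


def mean_with_mtol(
    values: Sequence[Optional[float]], mtol: float = 1.0
) -> Optional[float]:
    """Mean of the non-missing values, or None if too many are missing.

    Hypothetical helper: None plays the role of a masked element.
    """
    if not 0.0 <= mtol <= 1.0:
        raise ValueError("mtol must lie in [0, 1]")

    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)

    # All inputs missing (or no inputs at all): the output is missing
    # whatever the value of mtol.
    if not present:
        return None

    # "Where this fraction exceeds mtol, missing data is returned."
    if n_missing / len(values) > mtol:
        return None

    return sum(present) / len(present)


data = [1.0, None, 3.0, None]      # half of the inputs are missing
print(mean_with_mtol(data))        # default mtol=1: only all-missing is masked
print(mean_with_mtol(data, 0.5))   # 0.5 does not exceed 0.5, so still a value
print(mean_with_mtol(data, 0.25))  # 0.5 exceeds 0.25, so missing
print(mean_with_mtol(data, 0.0))   # any missing input gives a missing output
```

The comparison is strict, so a missing fraction exactly equal to mtol still yields a value.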

They have different names but (by my understanding) effectively control the same thing (how input masking translates to output masking), bar the slight complication of the role of any weighting used in the regrid case. The difference is that max_masked is specified as a number of masked cells, whereas mtol is specified as a fraction.

So in future we could consider choosing one over the other for both collapses and regridding (and any other applicable methods).

Comment thread cf/docstring/docstring.py Outdated
Comment thread cf/regrid/regrid.py Outdated
Comment thread cf/data/dask_regrid.py Outdated
Comment thread cf/data/dask_regrid.py Outdated
Comment thread cf/data/dask_regrid.py Outdated
Comment thread cf/data/dask_regrid.py Outdated
Co-authored-by: Sadie L. Bartholomew <sadie.bartholomew@ncas.ac.uk>
@davidhassell

Collaborator Author

Hi Sadie - thanks for the review.

I'd like to think about the pertinent API question you raise before merging. mtol specified as a fraction in [0,1] has the advantage that for linear regridding from unstructured source grids, there may be different numbers of potentially contributing cells for each destination grid cell. So being able to specify a fraction (0.75, say) means you can play nicely with triangles, squares, pentagons, etc.

This would involve a small tweak to the code (to ascertain the number of source grid cells in play, and to convert the fraction to a number of masked points in each case), but that's no more or less fiddly than what we've already got!
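As a rough illustration of that conversion, the sketch below turns a fractional mtol into a per-cell count of tolerated masked source cells. The helper name `max_masked_allowed` is hypothetical, and the cap at `n - 1` is an assumption (a destination cell whose source cells are all masked has nothing to interpolate from); the actual code in cf/data/dask_regrid.py may differ.

```python
import math


def max_masked_allowed(n_source_cells: int, mtol: float) -> int:
    """Largest number of masked source cells that still leaves the
    destination cell unmasked (hypothetical helper)."""
    if n_source_cells < 1:
        raise ValueError("need at least one contributing source cell")
    if not 0.0 <= mtol <= 1.0:
        raise ValueError("mtol must lie in [0, 1]")

    # The cell is masked when n_masked / n > mtol which, for an integer
    # n_masked, is the same as n_masked > floor(mtol * n).
    tolerated = math.floor(mtol * n_source_cells)

    # Assumption: if every source cell is masked the destination cell is
    # masked regardless of mtol, so never tolerate more than n - 1.
    return min(tolerated, n_source_cells - 1)


# The same fraction adapts to the number of contributing cells.
for name, n in [("triangle", 3), ("square", 4), ("pentagon", 5)]:
    print(name, max_masked_allowed(n, 0.75))
# triangle 2
# square 3
# pentagon 3
```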

I'll take a look ....

@davidhassell

Collaborator Author

Hi Sadie - PR updated for mtol ... 698b18d

@sadielbartholomew sadielbartholomew left a comment

Member

Thanks David for updating the approach - sorry it was extra work but like you indicated, it provides a more flexible approach than the max_masked original kwarg idea and it means we have a consistent kwarg 'mtol' across multiple supported methods. So, much nicer overall.

I reviewed the new commit and then did a sanity check on the whole PR branch. Overall it is all good. Some more typos to correct (just two really, copied and pasted across) and a changelog conflict to resolve, but other than that all ready to merge so please go ahead.

Comment thread cf/data/dask_regrid.py

A destination grid cell j will be masked if *mtol*
multiplied by the total number of source cells i for
which ``w_ji >= min_weight`` is greater then the

Member

Suggested change
- which ``w_ji >= min_weight`` is greater then the
+ which ``w_ji >= min_weight`` is greater than the

Comment thread cf/data/dask_regrid.py
Comment on lines +197 to +202
For instance, for a rectilinear source grid for which
up to 4 source grid cells contribute to each
destination grid cell, if *mtol* is in the range
``[0.5, 0.75)`` then a destination grid cell will in
general only be be masked if three or more of its
source grid cells are masked.
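The worked example in that docstring can be checked with a small sketch. Everything here is an assumption made for illustration: the function name `destination_is_masked`, the plain-list inputs, and in particular the direction of the comparison, which is inferred from the example (masked when the masked fraction of contributing cells exceeds *mtol*) rather than taken from the code under review.

```python
from typing import Sequence


def destination_is_masked(
    weights: Sequence[float],
    source_masked: Sequence[bool],
    mtol: float,
    min_weight: float,
) -> bool:
    """Decide whether one destination cell j is masked (hypothetical).

    weights[i] is w_ji for source cell i; source_masked[i] says whether
    source cell i is masked.
    """
    if len(weights) != len(source_masked):
        raise ValueError("weights and source_masked must have equal length")

    # Only source cells with w_ji >= min_weight count as contributing.
    contributing = [m for w, m in zip(weights, source_masked) if w >= min_weight]

    # Assumption: no contributing cells, or all of them masked, always
    # gives a masked destination cell.
    n_masked = sum(contributing)
    if not contributing or n_masked == len(contributing):
        return True

    return n_masked / len(contributing) > mtol


# Rectilinear case: 4 source cells contribute to the destination cell.
w = [0.25, 0.25, 0.25, 0.25]
two_masked = [True, True, False, False]
three_masked = [True, True, True, False]

for mtol in (0.5, 0.6, 0.74):  # all within [0.5, 0.75)
    print(mtol,
          destination_is_masked(w, two_masked, mtol, min_weight=0.1),
          destination_is_masked(w, three_masked, mtol, min_weight=0.1))
```

Under these assumptions each mtol in the range leaves the cell unmasked with two masked source cells and masks it with three, matching the docstring example.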

Member

+1, very useful to have an example as it's a bit difficult to envision the concept in practice otherwise.

Comment thread cf/docstring/docstring.py

A destination grid cell j will be masked if *mtol*
multiplied by the total number of source cells i for
which ``w_ji >= min_weight`` is greater then the

Member

Suggested change
- which ``w_ji >= min_weight`` is greater then the
+ which ``w_ji >= min_weight`` is greater than the

Comment thread cf/docstring/docstring.py
up to 4 source grid cells contribute to each
destination grid cell, if *mtol* is in the range
``[0.5, 0.75)`` then a destination grid cell will in
general only be be masked if three or more of its

Member

Suggested change
- general only be be masked if three or more of its
+ general only be masked if three or more of its

Comment thread cf/data/dask_regrid.py
up to 4 source grid cells contribute to each
destination grid cell, if *mtol* is in the range
``[0.5, 0.75)`` then a destination grid cell will in
general only be be masked if three or more of its

Member

Suggested change
- general only be be masked if three or more of its
+ general only be masked if three or more of its

Comment thread cf/regrid/regrid.py

A destination grid cell j will be masked if *mtol*
multiplied by the total number of source cells i for which
``w_ji >= min_weight`` is greater then the number of those

Member

Suggested change
- ``w_ji >= min_weight`` is greater then the number of those
+ ``w_ji >= min_weight`` is greater than the number of those

Comment thread cf/regrid/regrid.py
For instance, for a rectilinear source grid for which up
to 4 source grid cells contribute to each destination grid
cell, if *mtol* is in the range ``[0.5, 0.75)`` then a
destination grid cell will in general only be be masked if

Member

Suggested change
- destination grid cell will in general only be be masked if
+ destination grid cell will in general only be masked if

Labels

enhancement (New feature or request), regridding (Relating to regridding operations)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enable linear regridding to ignore masked source cells and still produce unmasked destination cells

2 participants