Skip to content

Implement get_bound_only_mask function to the HaloCatalogues class.#80

Open
diegod-01 wants to merge 2 commits intoSWIFTSIM:mainfrom
diegod-01:main
Open

Implement get_bound_only_mask function to the HaloCatalogues class.#80
diegod-01 wants to merge 2 commits intoSWIFTSIM:mainfrom
diegod-01:main

Conversation

@diegod-01
Copy link

This PR adds the "get_bound_only_mask" to the HaloCatalogues class, which evaluates a bound-only mask without applying it and returns a mask collection aligned with the current particle selections. The returned MaskCollection respects any existing extra masks and can be combined with previously applied masks while preserving the correct selection behaviour across all halo catalogue classes. The implementation has been validated by checking that applying bound-only after no mask is equivalent to using bound-only from the start, and that applying a non-trivial mask (e.g. slab or sphere) before bound-only still produces the correct result. To support this, I introduced a couple of helper functions (currently defined within get_bound_only_mask) to handle mask combination and validation. These could serve as a starting point for a future _combine_masks helper, but I kept them local for now pending feedback on the approach.

Copy link
Member

@kyleaoman kyleaoman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Diego, here's an initial batch of feedback. I'll have to take another look once you've worked through it - particularly the tests need a significant structural overhaul (probably mostly using existing or slightly generalised fixtures instead of the helpers you've written). Some of the helpers that you've defined as nested functions should also move, but you can leave those for now, that's easy to do at a later stage.

I know many of these comments will seem quite picky about very specific details - choice of variable names and so on is important in collaborative code, though, as it's functionally part of the documentation ("the code should say what it does" as much as possible). Any questions just ask, and thanks for working on this!

You can (should!) "resolve conversation" as you work through things, else it all becomes a bit hard to follow.


def get_bound_only_mask(self, sg: "SWIFTGalaxy") -> MaskCollection:
"""
Evaluate a bound-only mask without applying it, returning the masks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Evaluate a bound-only mask without applying it, returning the masks.
Evaluate a mask that retains only particles labelled as "bound" by the halo catalogue.

-------
:class:`~swiftgalaxy.masks.MaskCollection`
A bound-only mask aligned to the current particle selection.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would make sense to have a See Also section pointing to :meth:~swiftsimio.reader.SWIFTGalaxy.mask_particles, and an Examples section with a (almost trivial) example of creating then applying the mask. Perhaps two examples, one with mask_particles and one with square brackets used to apply the mask.

"""
Evaluate a bound-only mask without applying it, returning the masks.

The returned :class:`~swiftgalaxy.masks.MaskCollection` contains masks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, not sure this quite conveys what's going on. A possible suggestion: 'The returned MaskCollection can be applied to the SWIFTGalaxy object to remove any unbound particles. Creating a SWIFTGalaxy with no mask then evaluating and applying this mask is equivalent to creating the same SWIFTGalaxy with extra_mask="bound_only".' Perhaps also worth mentioning that each HaloCatalogue has its own definition of what 'bound' means?

}
)

def get_bound_only_mask(self, sg: "SWIFTGalaxy") -> MaskCollection:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_bound_only_mask(self, sg: "SWIFTGalaxy") -> MaskCollection:
def get_bound_only_mask(self) -> MaskCollection:

Feels like the user shouldn't need to give the SWIFTGalaxy as a parameter. I think the method should be bound to the SWIFTGalaxy class. Then any properties of the HaloCatalogue are available as self.halo_catalogue.<...>. As a user I think that it feels more natural to write sg.get_bound_only_mask() than sg.halo_catalogue.get_bound_only_mask(sg), and I don't see any obvious technical barrier.

:class:`~swiftgalaxy.masks.MaskCollection`
A bound-only mask aligned to the current particle selection.
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions should eventually move outside of this function but they're fine here for now.

for particle_name in sg.metadata.present_group_names:
new_mask = getattr(bound_only_mask, particle_name)
if new_mask is None:
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
continue
applied_masks[particle_name] = None
continue

Just a slight preference to be explicit here.

Comment on lines +392 to +404
particle_metadata = getattr(
sg.metadata, f"{particle_name}_properties"
)
if sg._spatial_mask is None:
num_part = getattr(sg.metadata, f"n_{particle_metadata.group_name}")
else:
num_part = int(
np.sum(
sg._spatial_mask.get_masked_counts_offsets()[0][
particle_name
]
)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is duplicated from some other masking function, right? Should thus get split off into a helper function. You can refactor the original location to call the new helper.

return LazyMask(mask_function=_combined_mask)

bound_only_mask = self._generate_bound_only_mask(sg, mask_loaded=False)
applied_masks: dict[str, LazyMask] = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since they're not applied, how about bound_only_masks as a name?

reltol_nd = 1.0e-4


def _clone_catalogue(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the way to achieve what you're trying to do. Slightly better would be to implement __copy__ on the HaloCatalogue and inherited classes then make some kind of use of that. But much better is to use a pattern something like this in a pytest fixture (you can edit conftest.py as needed). https://docs.pytest.org/en/stable/how-to/fixtures.html#factories-as-fixtures

return hf.__class__(**init_args)


def _prepare_snapshot_for_hf(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this should be part of a fixture. I think that you want something close to the sg_hf fixture. If you need to generalize you can move things to helpers. You could look at the extra_part_type and write_extra_part_type fixtures in https://github.com/SWIFTSIM/swiftsimio/blob/master/tests/conftest.py for some general inspiration for how this kind of pattern can be structured. If you're not familiar with yield you need to get familiar with it for this situation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants