Improve typing consistency in codebase by ubdbra001 · Pull Request #1311 · AFM-SPM/TopoStats

ubdbra001 · 2026-03-05T15:05:10Z

This PR addresses the inconsistent typing in the codebase.
Given the size of the task, I'm going to do it incrementally in this PR. The initial stage is picking out the "easy wins", e.g. cases where the type hints are missing or incorrect and are easy/obvious to update.

At this stage all the tests still pass.

Before submitting a Pull Request please check the following.

Existing tests pass.
Documentation has been updated and builds. Remember to update as required...
- docs/configuration.md
- docs/usage.md
- docs/data_dictionary.md
- docs/advanced.md and new pages it should link to.
Pre-commit checks pass.

codecov · 2026-03-05T15:15:22Z

Codecov Report

❌ Patch coverage is 89.36170% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.97%. Comparing base (b72a0c2) to head (9328755).
⚠️ Report is 515 commits behind head on main.

Files with missing lines	Patch %	Lines
topostats/processing.py	33.33%	4 Missing ⚠️
topostats/classes.py	83.33%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1311      +/-   ##
==========================================
- Coverage   89.25%   87.97%   -1.29%     
==========================================
  Files          30       31       +1     
  Lines        5810     6047     +237     
==========================================
+ Hits         5186     5320     +134     
- Misses        624      727     +103

ns-rse

Good work tackling this behemoth of a task @ubdbra001 and apologies for it being so messy (typing only came onto my radar way after I'd started).

Some comments in-line from a quick scan, don't have time to investigate more deeply I'm afraid.

ns-rse · 2026-03-05T17:07:47Z

        Returns
        -------
-        np.array
+        npt.ArrayLike


I've not come across npt.ArrayLike before so had to look it up.

In this instance the function returns np.asarray(filtered_arr1), so whilst the result could be considered something that can be converted to an array, it is already an array. I wonder if npt.NDArray is more appropriate, as we (should) know what the dtype should be, although it's currently missing.

Disclaimer: typing is not one of my strong points!

Here's a post about it that seems to opine that npt.NDArray is marginally better.

npt.ArrayLike seems to be for objects that can be cast to arrays, including arrays.

npt.NDArray sounds more sensible, and I suspect I've used it elsewhere, so I'll stick to that one. Thanks!
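
For reference, a minimal sketch of the distinction discussed above, assuming NumPy 1.21 or later (where `numpy.typing.NDArray` is available). The function name `to_float_array` and the `float64` dtype are illustrative assumptions and are not taken from the TopoStats code: `npt.ArrayLike` suits inputs that can be converted to an array, whereas `npt.NDArray` describes a value that already is one.

```python
import numpy as np
import numpy.typing as npt


def to_float_array(values: npt.ArrayLike) -> npt.NDArray[np.float64]:
    """Accept anything array-like (lists, tuples, arrays); always return an ndarray."""
    # np.asarray converts lists/tuples and passes existing float64 arrays through.
    return np.asarray(values, dtype=np.float64)


from_list = to_float_array([1, 2, 3])
from_array = to_float_array(np.array([1.5, 2.5]))
```

Annotating the return as `npt.NDArray[...]` also leaves room to record the dtype once it is known.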

ns-rse · 2026-03-05T17:14:39Z

-        A dictionary with keys 'image', 'img_path' and 'pixel_to_nm_scaling' containing a file or frames' image, it's
-        path and it's pixel to namometre scaling value.
+    topostats_object : Topostats
+        TopoStats object - Needs further info


How about...

Suggested change

TopoStats object - Needs further info

An object of type ``TopoStats`` class.

That works, I was wondering if there should be a little more, e.g. run_curvature_stats has:

``TopoStats`` object post splining, all ``Molecules`` within the ``grain_crops`` attribute (a dictionary of ``GrainCrop`` should have ``splined_coords`` attributes populated.

Hmmm, typically these will be newly instantiated but there is nothing stopping an existing TopoStats (e.g. loaded from disk / .topostats file) object being passed in here

An object of type ``TopoStats`` class with a minimum of ``image_original``, ``filename`` and ``pixel_to_nm_scaling`` attributes which allow filtering to be run.

I think those are the bare minimum but could be wrong.

Okay, I'll add those, and we can update it in future if required.

ns-rse · 2026-03-05T17:17:22Z

            grain_stats_df.index.set_names(["grain_number", "class", "subgrain"], inplace=True)
        else:
            grain_stats_df = None
        return topostats_object.filename, topostats_object, grain_stats_df


Not for this work but I thought I'd removed returning of dataframes as they are instead pulled out of the topostats_objects and collated into a dictionary before converting to pd.DataFrame.

This was the only one I've spotted so far, happy to open an issue to be addressed later

Note it as an issue, it's at least then recorded and can be addressed if anyone has time/inclination.

SylviaWhittle

Just a thing I noticed skim-reading this.

SylviaWhittle · 2026-03-05T18:06:28Z

    Returns
    -------
-    tuple[str, pd.DataFrame]
+    tuple[str, pd.DataFrame] - Deprecated, needs updating


Alas, Ty sees that the filename property can be None, although (I think) a TopoStats object without a filename should not be possible?

So currently, according to the types specified in the TopoStats class, filename is optional, and so it can be None: filename: str | None = None
Which is why Ty complains.

I've had a go at changing this already, but if I remember correctly it caused a bunch of tests to fail, and I didn't want to deal with that quite yet. That'll be the next round of updates (when I have got as much of the low-hanging fruit as possible).
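
A minimal sketch of why a type checker objects here, using a hypothetical `Scan` dataclass as a stand-in for the real `TopoStats` class (only the `filename: str | None = None` annotation comes from the discussion above). A function that promises `str` must either narrow the optional attribute first or widen its own return annotation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Scan:
    """Hypothetical stand-in for an object with an optional filename."""

    filename: str | None = None


def filename_or_none(scan: Scan) -> str | None:
    """Widened annotation: honest about the attribute possibly being None."""
    return scan.filename


def filename_strict(scan: Scan) -> str:
    """Narrowed: after the check, a type checker treats the value as str."""
    if scan.filename is None:
        raise ValueError("filename has not been set")
    return scan.filename
```

Making the attribute itself non-optional would remove the need for either workaround, which is the larger change deferred above.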

ubdbra001 · 2026-03-05T19:14:26Z

@@ -214,7 +214,7 @@ def compile_images(
    @staticmethod
    def remove_common_values(
        ordered_array: npt.NDArray, common_value_check_array: npt.NDArray, retain: list = ()


On this:

retain is typed as list type but being assigned an empty tuple by default 🤔

Setting the default value to an empty list may cause issues (see here).

I'll probably set it to None and add a check so that it gets reset to an empty list if the arg is not supplied.
Any objections?
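
The issue alluded to is presumably Python evaluating default arguments once, at function definition time, so a mutable default is shared between calls. A minimal sketch with hypothetical helper names (`collect_shared`, `collect_fresh`); this is not the actual `remove_common_values` implementation:

```python
from __future__ import annotations


def collect_shared(value: int, retain: list[int] = []) -> list[int]:
    """Pitfall: the default list is created once and reused by every call."""
    retain.append(value)
    return retain


def collect_fresh(value: int, retain: list[int] | None = None) -> list[int]:
    """None sentinel: a new list is created whenever the argument is omitted."""
    if retain is None:
        retain = []
    retain.append(value)
    return retain


shared_first = collect_shared(1)
shared_second = collect_shared(2)  # same list object as shared_first
fresh_first = collect_fresh(1)
fresh_second = collect_fresh(2)  # independent list
```

The existing `()` default is immutable and therefore safe at runtime; the mismatch is only between that default and the `list` annotation.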

ubdbra001 · 2026-03-05T19:17:11Z

+    molecule_data : dict[int, Molecule], optional
        Dictionary of ``Molecule`` objects indexed by molecule number.
-    tracing_stats : dict | None
+    tracing_stats : dict, optional


I have been going around replacing | None with , optional in the docstrings. Probably doesn't have a real impact but I thought I'd make it all consistent.

Let me know if you'd prefer to retain | None

ubdbra001 · 2026-03-05T19:17:46Z

            Dictionary, indexed by molecule where the value is the molecules statistics for the given molecule.
        """
+        if self.molecule_data is None:
+            raise ValueError("No molecule data found")


Let me know if you'd like me to change this to something else
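
A minimal sketch of the guard pattern in the diff above, using a hypothetical `GrainSketch` class and method name rather than the real TopoStats classes. Only the `None` check and the error message are taken from the diff; the statistics structure is an assumption.

```python
from __future__ import annotations


class GrainSketch:
    """Hypothetical stand-in holding optional per-molecule data."""

    def __init__(self, molecule_data: dict[int, dict[str, float]] | None = None) -> None:
        self.molecule_data = molecule_data

    def collate_statistics(self) -> dict[int, dict[str, float]]:
        """Return a copy of the statistics keyed by molecule number."""
        if self.molecule_data is None:
            # Guard mirrors the diff above; past this point the attribute is a dict.
            raise ValueError("No molecule data found")
        return {number: dict(stats) for number, stats in self.molecule_data.items()}


populated = GrainSketch({0: {"contour_length": 12.5}})
collated = populated.collate_statistics()
```

Returning an empty dictionary instead of raising would be one alternative, at the cost of hiding the missing data from the caller.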

ns-rse

Some responses in-line...

ns-rse · 2026-03-06T11:18:18Z

-        A dictionary with keys 'image', 'img_path' and 'pixel_to_nm_scaling' containing a file or frames' image, it's
-        path and it's pixel to namometre scaling value.
+    topostats_object : Topostats
+        TopoStats object - Needs further info


Hmmm, typically these will be newly instantiated but there is nothing stopping an existing TopoStats (e.g. loaded from disk / .topostats file) object being passed in here

An object of type ``TopoStats`` class with a minimum of ``image_original``, ``filename`` and ``pixel_to_nm_scaling`` attributes which allow filtering to be run.

I think those are the bare minimum but could be wrong.

ns-rse · 2026-03-06T11:19:06Z

            grain_stats_df.index.set_names(["grain_number", "class", "subgrain"], inplace=True)
        else:
            grain_stats_df = None
        return topostats_object.filename, topostats_object, grain_stats_df


Note it as an issue, it's at least then recorded and can be addressed if anyone has time/inclination.

ns-rse · 2026-03-06T18:27:30Z

Just a thought but it might be worth adding commits to .git-blame-ignore-revs so that the "blame" resides with the original author rather than yourself @ubdbra001.

ubdbra001 · 2026-04-01T11:11:35Z

I think I've got all the low-hanging typing fruit; anything else will cause tests to fail, so I suspect it will require a bit more discussion.

ubdbra001 added 13 commits March 5, 2026 14:05

update doctring typehints for MatchedBranch

e2df726

update type hints for UnMatchedBranch

c94b52b

set molecule_data outside then inside if-statement

cb5c73d

resolves typing error for later "molecule_data[i] = ..." as NoneType is not sub-scriptable

correct return type hint for remove_common_values

0ffb99e

updating OrderedTrace doctsrings

b11a57e

add error to collate_molecule_statistics

54a3ba8

Raises ValueError if molecule_data is None

update str dunder for when images are None

2078f26

update type hints for GrainCrop image get & set

3db8286

update return type hint for calculate_region_connection_regions

d432d65

Set directories to Path types in process_scan

8481538

update param type and dir type in process_filters

6c7a8d9

update param type and dir type in process_grains

eff1c39

update return type and dir type in process_grainstats

9bd0003

ubdbra001 changed the title ~~I1299 improve typing consistency~~ Improve typing consistency in codebase Mar 5, 2026

[pre-commit.ci] Fixing issues with pre-commit

e6cbe72

ns-rse reviewed Mar 5, 2026

SylviaWhittle reviewed Mar 5, 2026

ubdbra001 and others added 3 commits March 5, 2026 18:49

Update run_nodestats_tracing docstring

bde9e43

Adds correct return typehint Co-authored-by: Neil Shephard <n.shephard@sheffield.ac.uk>

Update docstring for process_grainstats

d79829b

Add correct return type hint Co-authored-by: Sylvia Whittle <86117496+SylviaWhittle@users.noreply.github.com>

Update return description for process_grainstats docstring

063adba

ubdbra001 commented Mar 5, 2026

ubdbra001 added 2 commits March 5, 2026 19:54

Update return typehint for order_from_end

4c5670b

Correct to a tuple of str and dict

Update identify_writhes return type hints

bef3a39

Corrects the type hint and the docstring

ns-rse reviewed Mar 6, 2026

ns-rse mentioned this pull request Mar 12, 2026

[typing] Validate all calls using pydantic.validate_call() decorator #1314

Open

Update param type hints in Filters class and methods

97e9d94

pre-commit-ci bot and others added 9 commits March 17, 2026 12:49

[pre-commit.ci] Fixing issues with pre-commit

92b5205

update typehints in theme module

7f7094d

update return type hint for remove_scars

29547f8

Docstring for second value in return tuple needs to be updated

add inline typehints for image dict values

6c27815

These specify that the values in the dict are not None after processing

update typing for bound_padded_coordinates_to_image and check functions

24997f5

However, I don't see these being used anywhere, so it may be worth just deleting?

update type hints for threshold functions

20ef973

convert return val of triangle height to float

d1f69e3

To match return value type hint

update return type hints for min_max_feret and get_feret_from_mask

96fae95

Currently very broad value type, I couldn't parse what they could be from the code

[pre-commit.ci] Fixing issues with pre-commit

9328755

ubdbra001 marked this pull request as ready for review April 1, 2026 11:10

ubdbra001 requested review from SylviaWhittle, ns-rse and tobyallwood April 1, 2026 11:11
