Skip to content

TC hunt vol. 4#871

Open
mariusaurus wants to merge 127 commits into
NVIDIA:mainfrom
mariusaurus:mkoch/tc_hunt_4
Open

TC hunt vol. 4#871
mariusaurus wants to merge 127 commits into
NVIDIA:mainfrom
mariusaurus:mkoch/tc_hunt_4

Conversation

@mariusaurus
Copy link
Copy Markdown
Collaborator

Earth2Studio Pull Request

Description

This PR adds some plotting functionality for the TC pipeline.
Main components are two notebooks (as python cells with instructions how to convert) that produce field plots with tracks and comparison of track data with reference tracks.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.
  • An issue is linked to this pull request.
  • Assess and address Greptile feedback (AI code review bot for guidance; use discretion, addressing all feedback is not required).

Dependencies

mariusaurus and others added 30 commits December 1, 2025 07:52
@mariusaurus mariusaurus requested a review from NickGeneva May 20, 2026 15:36
@mariusaurus mariusaurus self-assigned this May 20, 2026
@mariusaurus mariusaurus marked this pull request as ready for review May 22, 2026 12:33
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 22, 2026

Greptile Summary

This PR delivers the TC tracking visualisation tooling that was previously marked "coming soon": two JupyText notebook scripts (tracks_slayground_notebook.py and plot_tracks_n_fields_notebook.py) plus three supporting library modules (data_handling.py, plotting_helpers.py, analyse_n_plot.py). It also widens great_circle_distance in tc_hunt_utils.py to accept array inputs and exports EARTH_RADIUS_M as a shared constant.

  • data_handling.py contains two logic issues: remove_trailing_nans crashes with IndexError when a merged track has no non-NaN rows, and rebase_by_lead_time hardcodes \"msl\" as the trim variable regardless of the variables argument.
  • plot_tracks_n_fields_notebook.py uses xr.Dataset.sel with a floating-point step for spatial sub-selection, which is fragile for scale > 1.
  • analyse_n_plot.py will raise NameError if called with an empty cases list because err_dict is only assigned inside the loop.

Confidence Score: 3/5

The new library modules work for the documented happy path but have unchecked crash paths reachable in ordinary use.

Two crash paths in data_handling.py are reachable in ordinary use: remove_trailing_nans raises IndexError on any track with no valid overlapping timesteps, and rebase_by_lead_time hardcodes msl as the trim variable, breaking when callers omit it from variables.

data_handling.py needs the most attention; plot_tracks_n_fields_notebook.py and analyse_n_plot.py each have one smaller issue worth fixing before wide use.

Important Files Changed

Filename Overview
recipes/tc_tracking/plotting/data_handling.py New library module for track ingestion and error computation; contains a crashable IndexError path and a hardcoded variable assumption in the lead-time rebasing logic.
recipes/tc_tracking/plotting/analyse_n_plot.py Batch analysis entry point; well-structured with a latent NameError if an empty case list is supplied.
recipes/tc_tracking/plotting/plotting_helpers.py New plotting helper library; mostly clean.
recipes/tc_tracking/plotting/plot_tracks_n_fields_notebook.py New JupyText notebook for animated field+track plots; uses xr.Dataset.sel for coarsening which is fragile for scale > 1.
recipes/tc_tracking/plotting/tracks_slayground_notebook.py New JupyText notebook for ensemble track analysis; no significant issues.
recipes/tc_tracking/src/tc_hunt_utils.py Exports EARTH_RADIUS_M constant and widens great_circle_distance to accept array inputs; clean, backward-compatible change.

Comments Outside Diff (4)

  1. recipes/tc_tracking/plotting/data_handling.py, line 909-914 (link)

    P1 IndexError when track has no valid overlapping data

    np.where(~either_nans)[0][-1] raises IndexError when every row has a NaN in at least one of var or var_tru — e.g. when a predicted track has no temporal overlap with the reference track after the left-join. This can realistically occur for short-lived predicted tracks or when the true track CSV is missing msl values for early timesteps.

  2. recipes/tc_tracking/plotting/data_handling.py, line 946 (link)

    P1 Hardcoded "msl" in rebase_by_lead_time breaks for custom variable lists

    remove_trailing_nans(merged_track, "msl") assumes "msl" is always a column in both the predicted track and tru_track, regardless of the variables argument passed by the caller. If a user calls compute_averages_of_errors_over_lead_time with variables=["dist"] (no "msl"), the merged frame won't have a "msl" column and a KeyError is raised. The trimming variable should be derived from variables instead of hard-wired.

  3. recipes/tc_tracking/plotting/plot_tracks_n_fields_notebook.py, line 1358-1362 (link)

    P2 xr.Dataset.sel breaks for scale > 1 due to floating-point step mismatch

    ds.sel(lat=list(np.arange(lat_min, lat_max, scale * 0.25)), ...) selects coordinates by exact label. For scale=1 the 0.25-degree step matches the grid, but np.arange with a float step accumulates floating-point rounding errors, and for larger scale values the step may not align with dataset coordinate values. Using ds.sel(..., method="nearest") or ds.isel(lat=slice(None, None, scale), lon=slice(None, None, scale)) after sub-setting the region would be more robust.

  4. recipes/tc_tracking/plotting/analyse_n_plot.py, line 504-506 (link)

    P2 NameError when cases is an empty list

    If cases=[] is passed, the loop body never runs and err_dict is never assigned. The subsequent storm_metrics["var"] = list(err_dict.keys()) line raises NameError: name 'err_dict' is not defined. A guard or pre-loop initialisation would prevent the confusing crash.

Reviews (1): Last reviewed commit: "Merge branch 'main' into mkoch/tc_hunt_4" | Re-trigger Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants