Martina by mzapponi · Pull Request #194 · esm-tools/pycmor

mzapponi · 2025-08-20T18:55:00Z

OpenIFS names its time dimension differently, so I added two lines where the input files are read (gather_inputs.py) to rename the data's time dimension when it does not match the expected name.

pgierz

Hi @mzapponi,

Great job, thanks a lot for this! I have two ideas that could make this more general. Can you check whether it works with what I have suggested below?

src/pymor/core/gather_inputs.py

src/pymor/core/pipeline.py

mzapponi · 2025-08-25T17:28:50Z

Hi @pgierz,
the approach you suggested did not detect the time dimension correctly (rule.get("time_dimname") always returned None). I found the following solution; I checked and it works:

if "time" not in mf_ds.dims:
    time_dim = [dim for dim in mf_ds.dims if 'time' in dim.lower()]
    if not time_dim:
        raise ValueError(f"Cannot detect time dimension in dataset dims: {mf_ds.dims}")
    mf_ds = mf_ds.rename({time_dim[0]: "time"})
return mf_ds

What do you think?

For the change in pipeline.py, it works!
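
The name-matching heuristic in the snippet above can be looked at in isolation from xarray. The following is only a minimal sketch, not code from the PR: the helper name find_time_dim and the example dimension names (such as "time_counter") are illustrative assumptions. It also raises when more than one dimension name contains "time", since taking the first match could silently pick the wrong one.

```python
def find_time_dim(dims, expected="time"):
    """Return the dimension name to treat as time (hypothetical helper).

    Mirrors the substring heuristic from the comment above, but refuses to
    guess when several dimension names contain "time".
    """
    dims = list(dims)
    if expected in dims:
        return expected
    candidates = [dim for dim in dims if "time" in str(dim).lower()]
    if not candidates:
        raise ValueError(f"Cannot detect time dimension in dataset dims: {dims}")
    if len(candidates) > 1:
        raise ValueError(f"Ambiguous time dimension, candidates: {candidates}")
    return candidates[0]


# Illustrative dimension names only; they are not taken from the PR.
print(find_time_dim(["time_counter", "lat", "lon"]))
print(find_time_dim(["time", "lat", "lon"]))
```

With an xarray dataset, the result could then be passed to rename, as in the snippet above.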

siligam · 2025-08-29T10:18:30Z

I can suggest an alternative workflow for this case. We can use the get_time_label function from pymor.core.time_utils to detect the time dimension in the dataset and rename it to "time" if it is not already named that. It detects the time dimension regardless of its label, e.g. "T", "Time", or "_time".

To standardize the time dimension label, the user can create a custom function called relabel_time_dimension_if_required and use it as one of the pipeline processing steps.

import xarray as xr
from pymor.core.time_utils import get_time_label

def relabel_time_dimension_if_required(
    obj: xr.Dataset | xr.DataArray,
    target: str = "time",
    raise_on_conflict: bool = True,
) -> xr.Dataset | xr.DataArray:
    """Renames the time dimension to a target label, if required.

    This function detects the time dimension using `get_time_label`. If the
    detected label differs from the `target`, it returns a new object with the
    dimension renamed. Otherwise, it returns the original object.

    Parameters
    ----------
    obj : xr.Dataset or xr.DataArray
        The object to process.
    target : str, optional
        The desired name for the time dimension, by default "time".
    raise_on_conflict : bool, optional
        If True, raise a ValueError if the target name already exists but is
        not the detected time dimension. If False, the function will be a no-op
        in case of a conflict, returning the original object.

    Returns
    -------
    xr.Dataset or xr.DataArray
        A new object with the time dimension renamed, or the original object
        if no changes were necessary.

    Raises
    ------
    ValueError
        If `raise_on_conflict` is True and a naming conflict occurs.
    """
    current = get_time_label(obj)

    # No-op condition 1: No time dim or it already has the target name.
    if current is None or current == target:
        return obj

    # Collision: the target name already exists (and, given the check above,
    # differs from the detected label). Raise, or no-op if raise_on_conflict is False.
    if target in obj.dims or target in obj.coords:
        if raise_on_conflict:
            raise ValueError(
                f"Target '{target}' exists and differs from detected time dim '{current}'."
            )
        return obj

    # A rename is required; create a new object.
    out = obj.rename({current: target})

    # Preserve attrs/encoding and set CF-ish hints.
    if target in out.coords:
        src_coord = obj.coords.get(current)
        tcoord = out.coords[target]
        attrs = dict(getattr(src_coord, "attrs", {}))
        attrs.setdefault("standard_name", "time")
        attrs.setdefault("axis", "T")
        tcoord.attrs = attrs
        if src_coord is not None and hasattr(src_coord, "encoding"):
            tcoord.encoding = getattr(src_coord, "encoding", {}).copy()

    return out
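
To make the idea of label-independent detection concrete, here is a small self-contained sketch. It does not use pymor: guess_time_label is a hypothetical stand-in that assumes the time coordinate can be recognized by a datetime64 dtype, which may differ from how get_time_label actually works (for example, cftime-based calendars would not be detected this way). The dataset and its variable names are made up for illustration.

```python
import numpy as np
import xarray as xr


def guess_time_label(ds):
    """Hypothetical stand-in for get_time_label.

    Returns the name of the first one-dimensional coordinate with a
    datetime64 dtype, or None if there is none.
    """
    for name, coord in ds.coords.items():
        if coord.ndim == 1 and np.issubdtype(coord.dtype, np.datetime64):
            return name
    return None


# Toy dataset whose time axis is labeled "T" instead of "time".
ds = xr.Dataset(
    {"tas": (("T", "lat"), np.zeros((3, 2)))},
    coords={
        "T": np.array(
            ["2000-01-01", "2000-01-02", "2000-01-03"], dtype="datetime64[ns]"
        ),
        "lat": [10.0, 20.0],
    },
)

label = guess_time_label(ds)
# Rename only when a time coordinate was found and it is not already "time".
renamed = ds.rename({label: "time"}) if label not in (None, "time") else ds
print(label, renamed["tas"].dims)
```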

pgierz · 2025-08-29T11:35:05Z

Hey @mzapponi, can you try out @siligam's suggestion using a script:// step in your pipeline? You'll probably need to update pymor, and if you get stuck about what I mean, I can help you.

We in the developer team are also curious: did you learn how to use Pymor mostly from the workshop (videos, reading, etc.) or from the handbook (which still needs improvement...)?

pgierz · 2025-08-29T13:29:10Z

So, you would need something like this:

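...  # relabel_time_dimension_if_required, as defined in Pavan's comment above

def use_time_renamer(data, rule):
    # The step receives (data, rule). The rule is not forwarded, because the
    # second positional argument of the helper is `target` (a string).
    new_data = relabel_time_dimension_if_required(data)
    return new_data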

mzapponi · 2025-08-29T14:51:34Z

Hi @siligam, I get the error below. (At first I could not locate the get_time_label function: for me it is in pymor.std_lib.dataset_helpers rather than in pymor.core.time_utils. I guess it is the same one you are referring to.)

16:37:02.631 | INFO | Task run 'load_mfdataset-f55' - Finished in state Completed()
16:37:02.669 | ERROR | Task run 'use_time_renamer-24a' - Task run failed with exception: NotImplementedError("'item' is not yet a valid method on dask arrays") - No retries configured for this task.
Traceback (most recent call last):
  File "/home/a/a270210/.conda/envs/pymor/lib/python3.10/site-packages/xarray/computation/ops.py", line 195, in _call_possibly_missing_method
    method = getattr(arg, name)
AttributeError: 'Array' object has no attribute 'item'

PS: Yes, I looked at some of the videos/slides from the workshop and then just started running some tests to figure out how the tool works. I think it is pretty intuitive, to be honest, but I didn't know about the handbook!
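
The log above does not show which call inside the step triggers the NotImplementedError, so the following is only a sketch of the general failure mode, not a diagnosis of pymor's code. It assumes dask is installed and uses a made-up array: calling .item() on a lazily loaded (dask-backed) xarray object is expected to fail in this way, whereas loading the value first with .compute() works.

```python
import numpy as np
import xarray as xr

# A dask-backed array, similar to what a lazily opened multi-file dataset
# yields. The dimension name is illustrative; .chunk() requires dask.
lazy = xr.DataArray(np.arange(4.0), dims="time_counter").chunk({"time_counter": 2})

first = lazy[0]  # still lazy: no data has been loaded yet

try:
    first.item()
    lazy_item_failed = False
except NotImplementedError as err:
    # Same kind of error as in the log above.
    lazy_item_failed = True
    print(err)

# Loading the value into memory first makes .item() available.
first_value = first.compute().item()
print(lazy_item_failed, first_value)
```

If the failing call sits inside a library helper rather than user code, one possible workaround under the same assumption is to run the detection on already loaded coordinate values instead of on the lazy object.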

pgierz · 2025-11-27T07:47:43Z

I think this one can be closed without merging, but I am not certain. @mzapponi and @siligam, can you give me an update here?

pgierz and others added 2 commits August 20, 2025 10:56

fix for missin cluster attribute

eb97e1c

modify time dimension name if not matching

c662b3d

mzapponi requested a review from pgierz August 20, 2025 18:55

mzapponi assigned pgierz Aug 20, 2025

pgierz requested changes Aug 21, 2025

src/pymor/core/gather_inputs.py (outdated)

src/pymor/core/pipeline.py (outdated)

pgierz added the question Further information is requested label Nov 27, 2025

siligam and others added 2 commits November 27, 2025 10:41

Update src/pymor/core/gather_inputs.py

19fd536

Co-authored-by: Paul Gierz <pgierz@awi.de>

Update src/pymor/core/pipeline.py

9f3ff53

Co-authored-by: Paul Gierz <pgierz@awi.de>

mzapponi wants to merge 4 commits into main from martina
