Operation "max_date" doesn't work when Group is specified #1308

@HajimeShimizu

CORE version: 0.12.0

When a group is specified, the max_date operation raises an error. Here is an example.

Operations:
  - id: $max_ds_date
    operator: max_date
    domain: DS
    name: DSSTDTC
    group:
      - USUBJID

The error is raised inside the pandas module:
TypeError: agg function failed [how->max, dtype->object]

I checked "cdisc_rules_engine/operations/max_date.py" and found that the expected data conversion (to datetime) is not performed when grouping information is present.

        if not self.params.grouping:
            # data is converted to datetime here
            data = pd.to_datetime(self.params.dataframe[self.params.target])
            max_date = data.max()
            if isinstance(max_date, pd._libs.tslibs.nattype.NaTType):
                result = ""
            else:
                result = max_date.isoformat()
        else:
            # no datetime conversion happens here
            result = self.params.dataframe.groupby(self.params.grouping).max()

I tried the following change, and as far as I can tell, it works.

        if not self.params.grouping:
            data = pd.to_datetime(self.params.dataframe[self.params.target])
            max_date = data.max()
            if isinstance(max_date, pd._libs.tslibs.nattype.NaTType):
                result = ""
            else:
                result = max_date.isoformat()
        else:
            # original code (no datetime conversion):
            # result = self.params.dataframe.groupby(self.params.grouping).max()

            # suggested code: convert the target column to datetime before grouping
            df = self.params.dataframe.copy()
            df[self.params.target] = pd.to_datetime(df[self.params.target])
            result = df.groupby(self.params.grouping)[self.params.target].max()
            result = result.reset_index()
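
For anyone who wants to try the grouped branch outside the engine, here is a minimal standalone sketch of the same idea. The helper name grouped_max_date and the sample data are hypothetical and are not part of cdisc_rules_engine. It also assumes every value in the target column is a complete date in a single format; partial dates may need extra handling.

```python
import pandas as pd


def grouped_max_date(df, target, grouping):
    """Return the latest date in `target` for each group in `grouping`.

    Hypothetical helper for illustration only; it mirrors the suggested
    change above but is not the engine's actual implementation.
    """
    # Work on a copy so the caller's dataframe is left untouched.
    work = df.copy()
    # Convert the date strings to datetime so max() compares dates.
    work[target] = pd.to_datetime(work[target])
    # Take the maximum of the target column within each group, then turn
    # the group keys back into ordinary columns.
    return work.groupby(grouping)[target].max().reset_index()


# Made-up sample data shaped like the DS example in this issue.
ds = pd.DataFrame(
    {
        "USUBJID": ["S1", "S1", "S2", "S2"],
        "DSSTDTC": ["2023-01-15", "2023-03-01", "2022-12-31", "2022-11-05"],
    }
)

result = grouped_max_date(ds, "DSSTDTC", ["USUBJID"])
print(result)
```

Note that this returns datetime values per group, whereas the non-grouped branch returns an ISO 8601 string via isoformat(). Whether the grouped result should also be converted back to strings is an open question for the maintainers.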
