feat(evaluation): Add Pareto-Optimal Evaluation #213
ankitlade12 wants to merge 6 commits into Nixtla:main
4328af7 to 00dbac1
nasaul left a comment
The core Pareto logic looks algorithmically sound, but the PR needs some changes, including tests for `ParetoFrontier`:
- Add `evaluate` and `ParetoFrontier` to `__all__` in `__init__.py`
- Use narwhals-native DataFrame filtering in `ParetoFrontier`
- Properly extract `evaluate()` model columns in `plot_pareto_2d`
- Add tests for `ParetoFrontier`
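For readers following the review, the dominance rule under discussion can be sketched in plain Python. This is only a minimal illustration, assuming every metric is minimized; `dominates` and `non_dominated` are hypothetical helpers written for this note, not the PR's `is_dominated` or `find_non_dominated`, and the scores are invented.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` when every metric is minimized."""
    # `a` must be no worse on every metric and strictly better on at least one.
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the model names whose score vectors no other model dominates."""
    return [
        name
        for name, vec in scores.items()
        if not any(
            dominates(other, vec)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]


# Hypothetical (rmse, mae) pairs, invented for illustration only.
scores = {
    "model_a": (1.0, 2.0),
    "model_b": (2.0, 1.0),
    "model_c": (2.5, 2.5),
}
frontier = non_dominated(scores)
```

A metric that should be maximized can be handled in this sketch by negating it before the comparison. Tests for the real class would presumably check the same properties: a dominated model is excluded, and a point never dominates an identical point.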
Thanks for the updates. The core Pareto dominance algorithm is correct and the narwhals-based approach in

Bugs

1. In the

    pareto_sorted = pareto_df.sort_values(metric_x)  # pandas only
    ax.scatter(pareto_df[metric_x], ...)  # pandas-only indexing
    for _, row in plot_df.iterrows():  # pandas only

The method signature accepts

2.

    plot_df["model"] = plot_df.index.astype(str)

This modifies the passed-in DataFrame in place when it's pandas. Use

Design Issues

3.

4. Confusing

In the

    else:
        models = metrics  # misleading: metrics is used as model names here

A user will naturally pass metric names like

Minor
Description
This pull request introduces multi-objective evaluation capabilities to `utilsforecast`. It adds a robust `ParetoFrontier` class directly within `evaluation.py`, providing a model-agnostic and dataframe-agnostic way to identify the best-performing models across conflicting metrics (e.g., minimizing RMSE while minimizing MAE, or minimizing latency while maximizing accuracy).

Since `utilsforecast` acts as the foundational evaluation layer for Nixtla's ecosystem, integrating Pareto selection natively enables downstream libraries (like `mlforecast` and `statsforecast`) to use multi-objective model benchmarking out of the box on the standard output of `evaluate()`.

Key Changes

- `ParetoFrontier` class in `evaluation.py`: Includes `is_dominated` mathematically validated bounding logic and exposed `find_non_dominated` routines.
- `plot_pareto_2d()` to visually inspect the trade-off frontier. Matplotlib is lazily imported and handled gracefully with an explicit `warnings.warn` if missing, addressing reviewer feedback to avoid raw `print` statements.
- `AnyDFType`: Ensures `pandas` and `polars` arrays are passed cleanly through the mathematical logic, addressing previous maintainer concerns surrounding hard `pandas` dependencies.
- `__init__.py`: Integrated `evaluate` and `ParetoFrontier` into `__all__` for easy top-level access (`from utilsforecast import ParetoFrontier`).

Example Usage