
[FEAT] Add Horizon for fitted values#586

Merged
nasaul merged 11 commits into Nixtla:main from janrth:feature/multi-step_training_prediction
Apr 1, 2026

Conversation

@janrth
Contributor

@janrth janrth commented Mar 7, 2026

Allows users to set the horizon for fitted values with the parameter h. The default is still h=1, but it can be changed:

fitted_h1 = fcst.forecast_fitted_values(h=1)
fitted_h12 = fcst.forecast_fitted_values(h=12)

Works as expected.

Solves #346
[screenshot: horizon_fitted]

Checklist:

  • This PR has a meaningful title and a clear description.
  • The tests pass.
  • All linting tasks pass.
  • The notebooks are clean.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 914d2cef46

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread mlforecast/forecast.py
Comment thread mlforecast/forecast.py
@codspeed-hq

codspeed-hq bot commented Mar 7, 2026

Merging this PR will not alter performance

✅ 12 untouched benchmarks


Comparing janrth:feature/multi-step_training_prediction (ea34e6c) with main (ed97ad0)

Open in CodSpeed

@janrth
Contributor Author

janrth commented Mar 7, 2026

@codex


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 33b2b1c8cf


Comment thread mlforecast/forecast.py Outdated
@nasaul nasaul linked an issue Mar 9, 2026 that may be closed by this pull request
@nasaul nasaul changed the title from "add horizon for fitted values" to "[FEAT] Add Horizon for fitted values" Mar 9, 2026
@nasaul
Contributor

nasaul commented Mar 9, 2026

Thanks for the contribution — this addresses a long-standing gap (issue #346) and the direct model path (max_horizon) is clean and correct. A few things worth addressing before merging:


Major

1. Quadratic complexity in _compute_recursive_fitted_values_on_demand

The loop calls self.predict(new_df=hist[:t]) for each valid origin per series. Each call creates a new TimeSeries, calls _fit() on it (O(t) work), then runs an h-step rollout. Summed across T origins this is O(T²) — severely slow for real datasets. The warning says "can be slow" but understates this significantly.

Suggested fix — vectorized batch rollout (Option 1):

Since global/group transforms are already rejected, lag updates are simple index arithmetic. The idea is to preprocess once and do the h-step rollout in numpy across all origins simultaneously:

1. Preprocess full training data once → feature matrix X of shape (T, n_features)
   Keep actual targets y of shape (T,)

2. For step s = 1..h (the autoregressive rollout):
   a. Batch predict: ŷ_s = model.predict(X)   # O(T), one model call per step total
   b. Update X for next step:
      For each lag k:
        - If k < s:  replace lag_k column with ŷ_{s-k}  (use predicted value)
        - If k >= s: lag_k stays as actual y             (already in X)

3. ŷ_h[t] is the h-step-ahead fitted value for origin t

Complexity: O(h · T · n_lags) — linear in T, and only h model calls total (not T×h). This aligns with how _compute_fitted_values already works (batch over a feature matrix). The update in step 2b can be a small _update_lag_features(X, preds, step) helper — for lags-only it is ~10 lines of numpy index manipulation.
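The lag update described above can be sketched in a few lines of numpy. This is an illustrative sketch only, not mlforecast code: `rollout_fitted_values`, `lag_cols`, and the assumption of consecutive lags 1..n (so that lag_k at the next step equals lag_{k-1} at the current step) are all hypothetical simplifications.

```python
import numpy as np

def rollout_fitted_values(model, X, lag_cols, h):
    """Batch h-step rollout over all T origins at once (illustrative sketch).

    X: (T, n_features) feature matrix built in one preprocessing pass.
    lag_cols: {lag: column index}; assumes consecutive lags 1..n.
    Returns the h-step-ahead prediction for every origin.
    """
    X = X.copy()
    lags = sorted(lag_cols)
    preds = None
    for _ in range(h):
        preds = model.predict(X)            # one batch model call per step
        # shift lag columns, deepest lag first, so values cascade one step
        for k in reversed(lags[1:]):
            X[:, lag_cols[k]] = X[:, lag_cols[k - 1]]
        X[:, lag_cols[lags[0]]] = preds     # lag_1 <- latest predictions
    return preds
```

With this formulation the rollout performs exactly h model evaluations regardless of T, matching the O(h · T · n_lags) bound above.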


2. Fragile self.ts state mutation in the loop

original_ts = self.ts
for uid, group in train_pd.groupby(...):
    for target_idx in ...:
        try:
            preds = self.predict(h=h, new_df=hist, X_df=X_df)
        finally:
            self.ts = original_ts

self.predict(new_df=...) mutates self.ts as a side effect, requiring manual restoration via finally. The finally restores the original reference, but if self.ts was mutated in-place (not replaced) before the exception, the restored object may be in a dirty state. The batch rollout approach in point 1 would eliminate this pattern entirely.
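If the per-origin loop is retained, the restore could at least be made robust to in-place mutation by snapshotting a deep copy instead of the reference. A generic sketch (`preserved_ts` is a hypothetical helper, not part of mlforecast):

```python
import copy
from contextlib import contextmanager

@contextmanager
def preserved_ts(fcst):
    """Snapshot fcst.ts before a mutating predict() call and restore the
    snapshot afterwards, even if ts was mutated in place (not replaced)."""
    snapshot = copy.deepcopy(fcst.ts)
    try:
        yield
    finally:
        fcst.ts = snapshot
```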


3. Inconsistent h column presence

For recursive h=1 there is no h column. For recursive h>1 and direct models there is. The test even asserts this explicitly:

assert "h" not in fitted_h1.columns
assert "h" in fitted_h3.columns

This makes the return type unpredictable for users and complicates downstream code. The h column should always be present (or consistently absent, with the value known from the argument).
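A small normalization helper would make the contract uniform across code paths; a sketch with a hypothetical `with_horizon_column` name:

```python
import pandas as pd

def with_horizon_column(fitted: pd.DataFrame, h: int) -> pd.DataFrame:
    """Ensure the fitted-values frame always carries an explicit `h` column,
    regardless of which code path (recursive h=1, h>1, direct) produced it."""
    if "h" not in fitted.columns:
        fitted = fitted.assign(h=h)
    return fitted
```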


Minor

4. Redundant validation in _compute_recursive_fitted_values_on_demand

The private method checks h <= 1 and global/group transforms at the top, but forecast_fitted_values already validates both before calling it — making those guards unreachable. Either remove them from the private method (since it is internal) or remove the duplicate check from forecast_fitted_values.

5. Shallow copy of _fitted_train_df_

self._fitted_train_df_ = ufp.copy_if_pandas(df, deep=False)

A shallow copy means if the caller mutates column values in-place after fit(), the cached training data is silently affected. Since this cache is used for on-demand computation later, a deep copy (or at minimum a docstring/comment noting the limitation) would be safer.
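The hazard is easy to demonstrate; pandas' `copy(deep=True)` decouples the cache from later caller mutations (the helper name below is hypothetical):

```python
import pandas as pd

def cache_training_df(df: pd.DataFrame) -> pd.DataFrame:
    """Deep-copy the training frame so later in-place mutations by the
    caller cannot silently alter the cached fitted-values input."""
    return df.copy(deep=True)

df = pd.DataFrame({"unique_id": [0, 0], "y": [1.0, 2.0]})
cached = cache_training_df(df)
df.loc[0, "y"] = 99.0   # caller mutates training data after fit()
# the deep-copied cache still holds the original values
```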

6. Weak assertion in test_recursive_forecast_fitted_values_on_demand_h

restored = fitted_h3.merge(df[["unique_id", "ds", "y"]], on=["unique_id", "ds"], suffixes=("_fit", "_orig"))
np.testing.assert_allclose(restored["y_fit"].values, restored["y_orig"].values)

y_fit here is the target column copied from the training data, not the model's prediction — this test only checks that actual target values were joined correctly, not that predictions are reasonable. The test should validate the model output column (e.g., LinearRegression) against something meaningful (e.g., that predictions are finite, or that h=3 predictions differ from h=1 predictions).
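A stronger assertion would target the model's output column directly. A sketch, assuming the frames carry a `LinearRegression` prediction column as in the existing tests:

```python
import numpy as np
import pandas as pd

def check_multistep_fitted(fitted_h1, fitted_h3, model_col="LinearRegression"):
    """Validate model predictions instead of the copied target column."""
    # predictions must at least be finite
    assert np.isfinite(fitted_h3[model_col].to_numpy()).all()
    joined = fitted_h1.merge(
        fitted_h3, on=["unique_id", "ds"], suffixes=("_h1", "_h3")
    )
    # h=3 predictions should generally differ from h=1 predictions
    assert not np.allclose(
        joined[f"{model_col}_h1"], joined[f"{model_col}_h3"]
    )
```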

7. PR checklist is incomplete

"The tests pass", "All linting tasks pass", "The notebooks are clean" are all unchecked. Please confirm these before merging.


Positive notes

  • The direct model path (max_horizon) is correct and clean.
  • Making h keyword-only (*) is good API design for future-proofing.
  • The polars↔pandas bridging in forecast_fitted_values is handled correctly.
  • The new tests cover the right scenarios: recursive, direct, error cases, and positional compatibility.

Contributor

@nasaul nasaul left a comment


PR Review — Round 2

Great progress on the previous round of comments — all 6 items have been addressed. Two blocking issues remain before this can merge.


Bug — Static features crash forecast_fitted_values(h>1) [blocking]

Location: mlforecast/forecast.py:617–636

_compute_recursive_fitted_values_on_demand strips static columns from hist before calling temp_ts._fit(), but then passes static_features=self.ts.static_features to _fit. Inside TimeSeries._fit, it tries to extract the static columns from hist, which no longer has them, causing:

ValueError: Feature names seen at fit time, yet now missing: static_0

This affects any user who passes static features to fit() — including the default static_features=None which auto-detects all non-time/non-target columns as static.

The failing test tests/test_forecast.py::test_recursive_forecast_fitted_values_on_demand_h_with_static_features already covers this path and currently reproduces the crash.

Recommended fix: tell _fit there are no static columns to extract, then copy the already-computed static_features_ from the original TimeSeries:

temp_ts._fit(hist, ..., static_features=[id_col], ...)
temp_ts.static_features_ = self.ts.static_features_

This works correctly because static features are still used during prediction. In TimeSeries.predict, the feature matrix is built by horizontally concatenating static_features_ with the lag/date features (see core.py:981). By copying self.ts.static_features_ after _fit, the correct static values — which are constants per series by definition — are present when temp_ts.predict(models=self.models_, ...) is called, so the models see exactly the same feature matrix they were trained on. The only thing we skip is redundantly re-extracting them from hist, which is what was crashing.


_fitted_train_df_ doubles memory footprint [blocking]

A deep copy of the full training DataFrame is stored in self._fitted_train_df_ indefinitely after fit(fitted=True). For large datasets this permanently doubles the memory footprint of the fitted object.

Required fix: accept the training data as an optional parameter to forecast_fitted_values() so users can opt out of the storage cost entirely:

def forecast_fitted_values(self, h=1, *, train_df=None): ...

When train_df is provided, use it directly without storing a copy on self. When it is None, fall back to the cached _fitted_train_df_ for convenience. This keeps the current zero-friction API for small datasets while giving memory-constrained users a way to avoid the overhead entirely. The fit and forecast_fitted_values docstrings should document this trade-off.
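The fallback logic is small; a sketch of the resolution step (function and parameter names are hypothetical):

```python
def resolve_train_df(cached_df, train_df=None):
    """Prefer the user-supplied training frame; fall back to the cache.

    Passing train_df lets memory-constrained users avoid having a deep
    copy of the training data stored on the fitted object at all.
    """
    if train_df is not None:
        return train_df
    if cached_df is None:
        raise ValueError(
            "No cached training data available; pass train_df or call "
            "fit(..., fitted=True) first."
        )
    return cached_df
```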


Nit — Document why the per-origin loop was chosen over full vectorization

In the previous review I suggested a fully vectorized approach: preprocess all T origins into a single feature matrix of shape (T, n_features) and then do exactly h batched model calls updating lag columns in-place, reducing model evaluations from O(T²) down to O(h) regardless of dataset size. The original implementation was O(T²); the current one improves this to O(T·h) — one temp_ts.predict(horizon=h) per origin, each doing a full h-step autoregressive rollout — but stops short of the fully vectorized O(h) path.

The current approach may be intentional, but this reasoning isn't captured anywhere. Please add a comment above _compute_recursive_fitted_values_on_demand explaining the trade-off so future maintainers understand why the vectorized path was not taken.


Nit — Silent no-op in direct model path

In forecast_fitted_values (forecast.py:874–886), the h argument is silently ignored if "h" not in res.columns. An assert "h" in res.columns would close this silent failure path.


Summary: Two blockers before merge: fix the static features bug (test already failing) and add a train_df parameter to forecast_fitted_values() to avoid the memory doubling. All previous comments have been resolved.

@janrth
Contributor Author

janrth commented Mar 28, 2026

@codex

@janrth
Contributor Author

janrth commented Mar 28, 2026

We kept the current recursive execution model intentionally. MLForecast already vectorizes recursive forecasting across series at each forecast step, but remains iterative across horizon steps. A fully origin-batched fitted-values implementation would be a separate rollout engine that batches across training origins as an additional axis.

That is a materially larger algorithmic change than this PR, because it would need to reproduce the exact current recursive semantics for lag updates, lag transforms, date features, static features, dynamic exogenous alignment, and target transforms. For this PR we prioritized correctness and parity with the existing predict path, and documented that trade-off rather than introducing a second, more specialized engine only for multi-step fitted values.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

on=[model.ts.id_col, model.ts.time_col],

P1: Join AutoML fitted outputs using horizon key

MLForecast.forecast_fitted_values now returns an h column, but AutoMLForecast.forecast_fitted_values still merges per-model frames only on id/time. With multiple models this creates duplicated/suffixed horizon columns (h_x/h_y or similar), so the combined output no longer has a single reliable horizon column and can break downstream consumers that expect one h field per row.
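One possible fix is to include the horizon column in the merge keys whenever it is present, so per-model frames align row-for-row and the combined output keeps a single `h` column. A sketch with hypothetical names:

```python
import pandas as pd

def merge_model_fitted(frames, id_col="unique_id", time_col="ds"):
    """Merge per-model fitted frames on id/time plus `h` when present,
    avoiding suffixed duplicates like h_x/h_y in the combined output."""
    keys = [id_col, time_col]
    if all("h" in f.columns for f in frames):
        keys.append("h")
    out = frames[0]
    for f in frames[1:]:
        out = out.merge(f, on=keys)
    return out
```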


Comment thread mlforecast/forecast.py Outdated
@janrth
Contributor Author

janrth commented Mar 28, 2026

@codex


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2326ef6769


Comment thread mlforecast/forecast.py
Comment thread mlforecast/forecast.py
Contributor

@nasaul nasaul left a comment


Great work Jan, I just added a better docstring.

@nasaul nasaul merged commit cb8b558 into Nixtla:main Apr 1, 2026
21 checks passed


Development

Successfully merging this pull request may close these issues.

Multi-Step Training Predictions

2 participants