Merged
36 commits
493eba5
Feat: add multi-quantile support for CatBoost
daidahao Jan 30, 2026
6eba2ea
Update docstring
daidahao Jan 30, 2026
fee446b
Fix typing errors
daidahao Jan 30, 2026
c4c4a6a
Fix doc rendering
daidahao Jan 30, 2026
ea3fc6e
Fix typing errors
daidahao Feb 27, 2026
3c188a0
Fix typos
daidahao Feb 27, 2026
4a46b81
Update todo note
daidahao Feb 27, 2026
28cbe39
Fix `TestSKLearnModels` tests
daidahao Feb 27, 2026
b353005
Revert `CatBoostClassifierModel` changes
daidahao Feb 27, 2026
40fb093
Remvoe freq warnings
daidahao Feb 27, 2026
1d8840f
Add CHANGELOG entry
daidahao Feb 27, 2026
3001662
Update `get_estimator` docstring
daidahao Feb 27, 2026
51b32a7
Add `get_estimator` warning for multi-quantile
daidahao Feb 27, 2026
76aaba4
Update docstring and notebook notes
daidahao Feb 27, 2026
ddc2cef
Update notebook note
daidahao Feb 27, 2026
310f514
Handle single quantile
daidahao Feb 27, 2026
989cdca
Update CHANGELOG
daidahao Feb 28, 2026
7f21c20
Merge branch 'master' into feature/catboost-multiquantile
daidahao Mar 7, 2026
f349d4b
Add MultiQuantileRegression and update CB logic
daidahao Mar 20, 2026
ad11d6b
Add stride to fit()
daidahao Mar 20, 2026
15c03a5
Fix a multiquantile bug and add unit tests
daidahao Mar 20, 2026
24c1fa8
Update catboost docs
daidahao Mar 20, 2026
95af1b7
Update quantiles for all sklearn models
daidahao Mar 20, 2026
dea5a48
Update CHANGELOG
daidahao Mar 20, 2026
2eedeaf
Add prob & likelihood tests
daidahao Mar 20, 2026
a41cf5c
Add model construction test
daidahao Mar 20, 2026
884b86b
Add test_get_estimator_multiquantile
daidahao Mar 20, 2026
49d5e46
Update notebook
daidahao Mar 20, 2026
60c93f4
Improve code cov.
daidahao Mar 20, 2026
15db0ed
Merge branch 'master' into feature/catboost-multiquantile
daidahao Mar 20, 2026
b61376f
Merge branch 'master' into feature/catboost-multiquantile
daidahao Mar 22, 2026
a6928d9
Merge branch 'master' into feature/catboost-multiquantile
daidahao Mar 25, 2026
286c91c
Update CHANGELOG after merge
daidahao Mar 25, 2026
8d32abc
Merge branch 'master' into feature/catboost-multiquantile
daidahao Mar 26, 2026
8bfb8d1
minor updates
dennisbader Mar 27, 2026
587da40
fix typos in notebook
dennisbader Mar 27, 2026
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,8 @@ but cannot always guarantee backwards compatibility. Changes that may **break co

 ### For users of the library:

+- Added native multi-quantile support for `CatBoostModel` by using CatBoost’s `MultiQuantile` loss for faster training and inference. Set `likelihood="multiquantile"` to enable this feature. [#3032](https://github.com/unit8co/darts/pull/3032) by [Zhihao Dai](https://github.com/daidahao)
+
 ### For developers of the library:

 ## [0.43.0](https://github.com/unit8co/darts/tree/0.43.0) (2026-03-23)
54 changes: 37 additions & 17 deletions darts/models/forecasting/catboost_model.py
@@ -39,7 +39,11 @@
 )
 from darts.typing import TimeSeriesLike
 from darts.utils.likelihood_models.base import LikelihoodType
-from darts.utils.likelihood_models.sklearn import QuantileRegression, _get_likelihood
+from darts.utils.likelihood_models.sklearn import (
+    MultiQuantileRegression,
+    QuantileRegression,
+    _get_likelihood,
+)

 logger = get_logger(__name__)

@@ -143,19 +147,20 @@ def encode_year(idx):
             To enable past and / or future encodings for any `SKLearnModel`, you must also define the
             corresponding covariates lags with `lags_past_covariates` and / or `lags_future_covariates`.
         likelihood
-            Can be set to 'quantile', 'poisson' or 'gaussian'. If set, the model will be probabilistic,
-            allowing sampling at prediction time. When set to 'gaussian', the model will use CatBoost's
-            'RMSEWithUncertainty' loss function. When using this loss function, CatBoost returns a mean
-            and variance couple, which capture data (aleatoric) uncertainty.
-            This will overwrite any `objective` parameter.
+            One of ``"multiquantile"``, ``"quantile"``, ``"poisson"``, or ``"gaussian"``. If set, the model
+            becomes probabilistic and supports sampling at prediction time. ``"multiquantile"`` uses CatBoost's
+            ``"MultiQuantile"`` loss, and ``"gaussian"`` uses ``"RMSEWithUncertainty"`` to predict mean and
+            variance (aleatoric uncertainty). This overrides any ``objective`` parameter. Default is ``None``.
         quantiles
-            Fit the model to these quantiles if the `likelihood` is set to `quantile`.
+            Fit the model to these quantiles if the ``likelihood`` is set to ``"quantile"`` or ``"multiquantile"``.
+            Default is ``None`` and will use :class:`~darts.utils.likelihood_models.sklearn.QuantileRegression`'s
+            default quantiles.
         random_state
             Controls the randomness for reproducible forecasting.
         multi_models
-            If True, a separate model will be trained for each future lag to predict. If False, a single model
-            is trained to predict all the steps in 'output_chunk_length' (features lags are shifted back by
-            `output_chunk_length - n` for each step `n`). Default: True.
+            If ``True``, a separate model will be trained for each future lag to predict. If ``False``, a single model
+            is trained to predict all the steps in ``output_chunk_length`` (features lags are shifted back by
+            ``output_chunk_length - n`` for each step `n`). Default: ``True``.
         use_static_covariates
             Whether the model should use static covariate information in case the input `series` passed to ``fit()``
             contain static covariates. If ``True``, and static covariates are available at fitting time, will enforce
@@ -164,7 +169,7 @@ def encode_year(idx):
             Optionally, component name or list of component names specifying the past covariates that should be treated
             as categorical by the underlying `CatBoostRegressor`. The components that are specified as categorical
             must be integer-encoded. For more information on how CatBoost handles categorical features,
-            visit: `Categorical feature support documentatio
+            visit: `Categorical feature support documentation
             <https://catboost.ai/docs/en/features/categorical-features>`__.
         categorical_future_covariates
             Optionally, component name or list of component names specifying the future covariates that should be
@@ -265,6 +270,7 @@ def _set_likelihood(
                 LikelihoodType.Gaussian,
                 LikelihoodType.Poisson,
                 LikelihoodType.Quantile,
+                LikelihoodType.MultiQuantile,
             ],
         )

@@ -276,7 +282,13 @@ def _set_likelihood(
             "poisson": "Poisson",
             "gaussian": "RMSEWithUncertainty",
         }
-        if likelihood == LikelihoodType.Quantile.value:
+        # set `loss_function` and `._model_container` as per the likelihood instance
+        if isinstance(self._likelihood, MultiQuantileRegression):
+            # this condition must come before `QuantileRegression` because
+            # `MultiQuantileRegression` is a subclass of `QuantileRegression`
+            quantiles_str = ", ".join(f"{q:.3f}" for q in self._likelihood.quantiles)
+            self.kwargs["loss_function"] = f"MultiQuantile:alpha={quantiles_str}"
+        elif isinstance(self._likelihood, QuantileRegression):
             self._model_container = _QuantileModelContainer()
         else:
             self.kwargs["loss_function"] = likelihood_map[likelihood]
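For reference, a minimal standalone sketch of how the `loss_function` string in the hunk above is assembled. The helper name `build_multiquantile_loss` is hypothetical and used only for illustration; the formatting expression itself is copied from the diff.

```python
def build_multiquantile_loss(quantiles):
    """Format quantile levels as a `MultiQuantile:alpha=...` loss string.

    Hypothetical helper: it mirrors the two lines added in `_set_likelihood`
    but is not part of the PR.
    """
    # Each level is rendered with three decimals, as in the diff.
    quantiles_str = ", ".join(f"{q:.3f}" for q in quantiles)
    return f"MultiQuantile:alpha={quantiles_str}"


if __name__ == "__main__":
    # prints: MultiQuantile:alpha=0.100, 0.500, 0.900
    print(build_multiquantile_loss([0.1, 0.5, 0.9]))
```

One consequence of the `.3f` format worth keeping in mind: levels that differ only beyond the third decimal (for example 0.1231 and 0.1234) are rendered identically in the resulting string.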
@@ -293,6 +305,7 @@ def fit(
         n_jobs_multioutput_wrapper: int | None = None,
         sample_weight: TimeSeriesLike | str | None = None,
         val_sample_weight: TimeSeriesLike | str | None = None,
+        stride: int = 1,
         verbose: int | bool | None = None,
         **kwargs,
     ):
@@ -335,14 +348,19 @@ def fit(
             are extracted from the end of the global weights. This gives a common time weighting across all series.
         val_sample_weight
             Same as for `sample_weight` but for the evaluation dataset.
+        stride
+            The number of time steps between consecutive samples, applied starting from the end of the series. The same
+            stride will be applied to both the training and evaluation set (if supplied and supported). This should be
+            used with caution as it might introduce bias in the forecasts.
         verbose
             An integer or a boolean that can be set to 1 to display catboost's default verbose output
         **kwargs
-            Additional kwargs passed to `catboost.CatboostRegressor.fit()`
+            Additional kwargs passed to `catboost.CatBoostRegressor.fit()`
         """
         verbose = verbose if verbose is not None else 0
         likelihood = self.likelihood
-        if isinstance(likelihood, QuantileRegression):
+        if type(likelihood) is QuantileRegression:
+            # must check for type `QuantileRegression` to not include subclass `MultiQuantileRegression`
             # empty model container in case of multiple calls to fit, e.g. when backtesting
             self._model_container.clear()
             for quantile in likelihood.quantiles:
@@ -361,6 +379,7 @@ def fit(
                     n_jobs_multioutput_wrapper=n_jobs_multioutput_wrapper,
                     sample_weight=sample_weight,
                     val_sample_weight=val_sample_weight,
+                    stride=stride,
                     verbose=verbose,
                     **kwargs,
                 )
@@ -379,6 +398,7 @@ def fit(
             n_jobs_multioutput_wrapper=n_jobs_multioutput_wrapper,
             sample_weight=sample_weight,
             val_sample_weight=val_sample_weight,
+            stride=stride,
             verbose=verbose,
             **kwargs,
         )
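The two type checks in this file (`isinstance` ordering in `_set_likelihood`, exact `type(...) is` in `fit`) both exist because `MultiQuantileRegression` subclasses `QuantileRegression`. A small sketch of that behaviour, using stand-in classes defined here rather than the real darts likelihood classes; `picks_branch` and `uses_per_quantile_models` are hypothetical names for illustration only.

```python
class QuantileRegression:
    """Stand-in for the darts class; only the inheritance matters here."""

    def __init__(self, quantiles):
        self.quantiles = list(quantiles)


class MultiQuantileRegression(QuantileRegression):
    """Stand-in subclass, mirroring the relationship stated in the diff."""


def picks_branch(likelihood):
    # The subclass must be tested first: an `isinstance` check against the
    # parent class would also match `MultiQuantileRegression` instances.
    if isinstance(likelihood, MultiQuantileRegression):
        return "multiquantile"
    elif isinstance(likelihood, QuantileRegression):
        return "quantile"
    return "other"


def uses_per_quantile_models(likelihood):
    # An exact type comparison excludes subclasses, so only the plain
    # quantile likelihood takes the one-model-per-quantile path.
    return type(likelihood) is QuantileRegression


if __name__ == "__main__":
    multi = MultiQuantileRegression([0.1, 0.5, 0.9])
    print(isinstance(multi, QuantileRegression))  # True
    print(uses_per_quantile_models(multi))        # False
```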
@@ -565,9 +585,9 @@ def encode_year(idx):
         random_state
             Controls the randomness for reproducible forecasting.
         multi_models
-            If True, a separate model will be trained for each future lag to predict. If False, a single model
-            is trained to predict all the steps in 'output_chunk_length' (features lags are shifted back by
-            `output_chunk_length - n` for each step `n`). Default: True.
+            If ``True``, a separate model will be trained for each future lag to predict. If ``False``, a single model
+            is trained to predict all the steps in ``output_chunk_length`` (features lags are shifted back by
+            ``output_chunk_length - n`` for each step `n`). Default: ``True``.
         use_static_covariates
             Whether the model should use static covariate information in case the input `series` passed to ``fit()``
             contain static covariates. If ``True``, and static covariates are available at fitting time, will enforce
23 changes: 16 additions & 7 deletions darts/models/forecasting/lgbm.py
@@ -137,13 +137,15 @@ def encode_year(idx):
             Can be set to `quantile` or `poisson`. If set, the model will be probabilistic, allowing sampling at
             prediction time. This will overwrite any `objective` parameter.
         quantiles
-            Fit the model to these quantiles if the `likelihood` is set to `quantile`.
+            Fit the model to these quantiles if the ``likelihood`` is set to ``"quantile"``.
+            Default is ``None`` and will use :class:`~darts.utils.likelihood_models.sklearn.QuantileRegression`'s
+            default quantiles.
         random_state
             Controls the randomness for reproducible forecasting.
         multi_models
-            If True, a separate model will be trained for each future lag to predict. If False, a single model
-            is trained to predict all the steps in 'output_chunk_length' (features lags are shifted back by
-            `output_chunk_length - n` for each step `n`). Default: True.
+            If ``True``, a separate model will be trained for each future lag to predict. If ``False``, a single model
+            is trained to predict all the steps in ``output_chunk_length`` (features lags are shifted back by
+            ``output_chunk_length - n`` for each step `n`). Default: ``True``.
         use_static_covariates
             Whether the model should use static covariate information in case the input `series` passed to ``fit()``
             contain static covariates. If ``True``, and static covariates are available at fitting time, will enforce
@@ -258,6 +260,7 @@ def fit(
         n_jobs_multioutput_wrapper: int | None = None,
         sample_weight: TimeSeriesLike | str | None = None,
         val_sample_weight: TimeSeriesLike | str | None = None,
+        stride: int = 1,
         verbose: bool | None = None,
         **kwargs,
     ):
@@ -300,6 +303,10 @@ def fit(
             are extracted from the end of the global weights. This gives a common time weighting across all series.
         val_sample_weight
             Same as for `sample_weight` but for the evaluation dataset.
+        stride
+            The number of time steps between consecutive samples, applied starting from the end of the series. The same
+            stride will be applied to both the training and evaluation set (if supplied and supported). This should be
+            used with caution as it might introduce bias in the forecasts.
         verbose
             Optionally, set the fit verbosity. Not effective for all models.
         **kwargs
@@ -324,6 +331,7 @@ def fit(
                     n_jobs_multioutput_wrapper=n_jobs_multioutput_wrapper,
                     sample_weight=sample_weight,
                     val_sample_weight=val_sample_weight,
+                    stride=stride,
                     verbose=verbose,
                     **kwargs,
                 )
@@ -342,6 +350,7 @@ def fit(
             n_jobs_multioutput_wrapper=n_jobs_multioutput_wrapper,
             sample_weight=sample_weight,
             val_sample_weight=val_sample_weight,
+            stride=stride,
             verbose=verbose,
             **kwargs,
         )
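The `stride` docstring added to both `fit()` methods says samples are spaced `stride` steps apart, "applied starting from the end of the series". One way to read that, sketched below with plain integer positions: the last sample is always kept, and earlier ones are picked by stepping backwards. `strided_sample_indices` is a hypothetical function written for this note; it is an interpretation of the docstring, not the library's tabularization code.

```python
def strided_sample_indices(n_samples, stride=1):
    """Return the positions of the samples kept for a given stride.

    Illustrative reading of the docstring: walk backwards from the last
    sample in steps of `stride`, then restore chronological order.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    kept = list(range(n_samples - 1, -1, -stride))
    kept.reverse()
    return kept


if __name__ == "__main__":
    print(strided_sample_indices(10, stride=1))  # [0, 1, ..., 9]
    print(strided_sample_indices(10, stride=3))  # [0, 3, 6, 9]
    print(strided_sample_indices(10, stride=4))  # [1, 5, 9]
```

Under this reading, a stride of 4 over 10 samples drops the earliest sample rather than the latest, which is the practical difference from counting forward from the start of the series.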
@@ -468,9 +477,9 @@ def encode_year(idx):
         random_state
             Controls the randomness for reproducible forecasting.
         multi_models
-            If True, a separate model will be trained for each future lag to predict. If False, a single model
-            is trained to predict all the steps in 'output_chunk_length' (features lags are shifted back by
-            `output_chunk_length - n` for each step `n`). Default: True.
+            If ``True``, a separate model will be trained for each future lag to predict. If ``False``, a single model
+            is trained to predict all the steps in ``output_chunk_length`` (features lags are shifted back by
+            ``output_chunk_length - n`` for each step `n`). Default: ``True``.
         use_static_covariates
             Whether the model should use static covariate information in case the input `series` passed to ``fit()``
             contain static covariates. If ``True``, and static covariates are available at fitting time, will enforce
10 changes: 6 additions & 4 deletions darts/models/forecasting/linear_regression_model.py
@@ -126,13 +126,15 @@ def encode_year(idx):
             prediction time. If set to `quantile`, the `sklearn.linear_model.QuantileRegressor` is used. Similarly, if
             set to `poisson`, the `sklearn.linear_model.PoissonRegressor` is used.
         quantiles
-            Fit the model to these quantiles if the `likelihood` is set to `quantile`.
+            Fit the model to these quantiles if the ``likelihood`` is set to ``"quantile"``.
+            Default is ``None`` and will use :class:`~darts.utils.likelihood_models.sklearn.QuantileRegression`'s
+            default quantiles.
         random_state
             Controls the randomness for reproducible forecasting.
         multi_models
-            If True, a separate model will be trained for each future lag to predict. If False, a single model
-            is trained to predict all the steps in 'output_chunk_length' (features lags are shifted back by
-            `output_chunk_length - n` for each step `n`). Default: True.
+            If ``True``, a separate model will be trained for each future lag to predict. If ``False``, a single model
+            is trained to predict all the steps in ``output_chunk_length`` (features lags are shifted back by
+            ``output_chunk_length - n`` for each step `n`). Default: ``True``.
         use_static_covariates
             Whether the model should use static covariate information in case the input `series` passed to ``fit()``
             contain static covariates. If ``True``, and static covariates are available at fitting time, will enforce
12 changes: 6 additions & 6 deletions darts/models/forecasting/random_forest.py
@@ -128,9 +128,9 @@ def encode_year(idx):
             The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all
             leaves contain less than min_samples_split samples.
         multi_models
-            If True, a separate model will be trained for each future lag to predict. If False, a single model
-            is trained to predict all the steps in 'output_chunk_length' (features lags are shifted back by
-            `output_chunk_length - n` for each step `n`). Default: True.
+            If ``True``, a separate model will be trained for each future lag to predict. If ``False``, a single model
+            is trained to predict all the steps in ``output_chunk_length`` (features lags are shifted back by
+            ``output_chunk_length - n`` for each step `n`). Default: ``True``.
         use_static_covariates
             Whether the model should use static covariate information in case the input `series` passed to ``fit()``
             contain static covariates. If ``True``, and static covariates are available at fitting time, will enforce
@@ -291,9 +291,9 @@ def encode_year(idx):
             The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all
             leaves contain less than min_samples_split samples.
         multi_models
-            If True, a separate model will be trained for each future lag to predict. If False, a single model
-            is trained to predict all the steps in 'output_chunk_length' (features lags are shifted back by
-            `output_chunk_length - n` for each step `n`). Default: True.
+            If ``True``, a separate model will be trained for each future lag to predict. If ``False``, a single model
+            is trained to predict all the steps in ``output_chunk_length`` (features lags are shifted back by
+            ``output_chunk_length - n`` for each step `n`). Default: ``True``.
         use_static_covariates
             Whether the model should use static covariate information in case the input `series` passed to ``fit()``
             contain static covariates. If ``True``, and static covariates are available at fitting time, will enforce