Merged
30 commits
f3df7f0
feat: experiment finetuning compatibility with peft
Kurokabe Dec 18, 2025
76158c9
fix: properly set the model device and dtype
Kurokabe Dec 19, 2025
94a71b4
fix: remove peft as a darts dependency
Kurokabe Jan 15, 2026
9057acb
wip: add transform callback for foundation model fine-tuning
Kurokabe Jan 16, 2026
34b08c1
fix: make lora finetuning work
Kurokabe Jan 16, 2026
eeb93b4
fix: improve callbacks to allow only saving the adapter
Kurokabe Jan 19, 2026
7fb10d6
Merge remote-tracking branch 'origin/master' into finetuning
Kurokabe Jan 30, 2026
76a8efa
feat: modify foundation model to integrate full and partial fine-tuning
Kurokabe Jan 30, 2026
8f7ef24
test: add unit tests for fine-tuning
Kurokabe Jan 30, 2026
260de97
documentation: update example notebook on finetuning
Kurokabe Jan 30, 2026
a20c943
documentation: update changelog
Kurokabe Jan 30, 2026
a568a3d
fix: update on_save_checkpoint of lora callback to avoid potential OO…
Kurokabe Jan 30, 2026
7c813c8
Merge branch 'master' into finetuning
dennisbader Feb 13, 2026
6fb2138
remove FoundationPLModule and enable finetuning for TorchForecastingM…
Kurokabe Feb 23, 2026
45ff4ae
Merge branch 'master' into finetuning
dennisbader Feb 23, 2026
01706ee
feat: use fnmatch instead of startswith for finetuning arguments
Kurokabe Feb 24, 2026
66fa1b2
doc: update example notebook to demonstrate finetuning for pytorch mo…
Kurokabe Feb 24, 2026
7b2c657
rename _setup_fine_tuning to _setup_finetuning and update notebook to…
Kurokabe Feb 24, 2026
b5a6c19
update unit test for foundation model with the new fine tuning logic
Kurokabe Feb 24, 2026
13fc4ab
Modify Chronos to use quantiles by default when fine-tuning
Kurokabe Feb 24, 2026
f38cfc8
Merge branch 'master' into finetuning
dennisbader Feb 27, 2026
9021e20
rename example notebook and add it to tests and docs
dennisbader Feb 27, 2026
9736f21
some refactoring
dennisbader Feb 27, 2026
0f77ac6
training specific quantile loss for foundation models and tests
dennisbader Feb 27, 2026
9987a02
update docs
dennisbader Feb 27, 2026
f93a217
update changelog entry
dennisbader Feb 27, 2026
70169ef
update notebook
dennisbader Feb 28, 2026
2891d0f
update notebook p2
dennisbader Feb 28, 2026
eb433c7
update notebook
dennisbader Mar 7, 2026
b68cebd
better test coverage
dennisbader Mar 7, 2026
2 changes: 1 addition & 1 deletion .github/workflows/merge.yml
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
example-name: [00-quickstart.ipynb, 01-multi-time-series-and-covariates.ipynb, 02-data-processing.ipynb, 03-FFT-examples.ipynb, 04-RNN-examples.ipynb, 05-TCN-examples.ipynb, 06-Transformer-examples.ipynb, 07-NBEATS-examples.ipynb, 08-DeepAR-examples.ipynb, 09-DeepTCN-examples.ipynb, 10-Kalman-filter-examples.ipynb, 11-GP-filter-examples.ipynb, 12-Dynamic-Time-Warping-example.ipynb, 13-TFT-examples.ipynb, 15-static-covariates.ipynb, 16-hierarchical-reconciliation.ipynb, 18-TiDE-examples.ipynb, 19-EnsembleModel-examples.ipynb, 20-SKLearnModel-examples.ipynb, 21-TSMixer-examples.ipynb, 22-anomaly-detection-examples.ipynb, 23-Conformal-Prediction-examples.ipynb, 24-SKLearnClassifierModel-examples.ipynb, 25-Chronos-2-examples.ipynb, 26-NeuralForecast-examples.ipynb]
example-name: [00-quickstart.ipynb, 01-multi-time-series-and-covariates.ipynb, 02-data-processing.ipynb, 03-FFT-examples.ipynb, 04-RNN-examples.ipynb, 05-TCN-examples.ipynb, 06-Transformer-examples.ipynb, 07-NBEATS-examples.ipynb, 08-DeepAR-examples.ipynb, 09-DeepTCN-examples.ipynb, 10-Kalman-filter-examples.ipynb, 11-GP-filter-examples.ipynb, 12-Dynamic-Time-Warping-example.ipynb, 13-TFT-examples.ipynb, 15-static-covariates.ipynb, 16-hierarchical-reconciliation.ipynb, 18-TiDE-examples.ipynb, 19-EnsembleModel-examples.ipynb, 20-SKLearnModel-examples.ipynb, 21-TSMixer-examples.ipynb, 22-anomaly-detection-examples.ipynb, 23-Conformal-Prediction-examples.ipynb, 24-SKLearnClassifierModel-examples.ipynb, 25-Chronos-2-examples.ipynb, 26-NeuralForecast-examples.ipynb, 27-Torch-and-Foundation-Model-Fine-Tuning-examples.ipynb]
steps:
- name: "Clone repository"
uses: actions/checkout@v4
Expand Down
2 changes: 2 additions & 0 deletions CHANGELOG.md
Expand Up @@ -13,6 +13,8 @@ but cannot always guarantee backwards compatibility. Changes that may **break co

- 🚀🚀 Added new forecasting model `NeuralForecastModel` to convert any of the 30+ NeuralForecast base model into a Darts `TorchForecastingModel`. This includes models such as NBEATSx, PatchTST, TimeXer, KAN, and many more. Like all Darts torch models, it supports univariate, multivariate, probabilistic forecasting, optimized backtesting and more. Depending on the base model, it also supports past, future, and static covariates. [#3002](https://github.com/unit8co/darts/pull/3002) by [Zhihao Dai](https://github.com/daidahao)
- Check out our new [NeuralForecastModel Notebook](https://unit8co.github.io/darts/examples/26-NeuralForecast-examples.html) for detailed examples. [#3026](https://github.com/unit8co/darts/pull/3026) by [Dennis Bader](https://github.com/dennisbader).
- 🚀🚀 Added support for fine-tuning to all `TorchForecastingModel` and `FoundationModel` (such as `Chronos2Model` and `TimesFM2p5Model`) via the new `enable_finetuning` parameter. Supports full training and partial fine-tuning by selectively freezing or unfreezing layers by name pattern. [#2964](https://github.com/unit8co/darts/issues/2964) by [Alain Gysi](https://github.com/Kurokabe).
- Check out our new [Fine-Tuning Notebook](https://unit8co.github.io/darts/examples/27-Torch-and-Foundation-Model-Fine-Tuning-examples.html) for detailed examples.
- Created `darts.typing` to collect typical type annotations in one place. Introduced `TimeIndex` & `TimeSeriesLike` type aliases for improved readability & maintainability of the code. Common type annotations can be added to this file in the future. [#3021](https://github.com/unit8co/darts/pull/3021) by [Michel Zeller](https://github.com/mizeller)
- More fine-grained control over Reversible Instance Normalization for all torch models. Apart from the boolean trigger, parameter `use_reversible_instance_norm` now also supports setting the `RINorm` hyperparameters as a dictionary. [#3029](https://github.com/unit8co/darts/pull/3029) by [Zhihao Dai](https://github.com/daidahao).

Expand Down
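The freeze/unfreeze dict semantics introduced by this changelog entry can be sketched in plain Python. This is an illustrative sketch of the pattern-matching idea only (the PR matches parameter names with `fnmatch`), not the actual darts implementation; `resolve_trainable` and the parameter names are hypothetical.

```python
# Illustrative sketch (not the darts implementation): how a
# {"unfreeze": [...]} / {"freeze": [...]} spec can be resolved into a
# per-parameter trainability map using fnmatch-style name patterns.
from fnmatch import fnmatch


def resolve_trainable(param_names, spec):
    """Return {name: requires_grad} for a freeze/unfreeze spec.

    `spec` must contain exactly one key: "unfreeze" (everything else stays
    frozen) or "freeze" (everything else stays trainable).
    """
    if len(spec) != 1:
        raise ValueError("spec must contain exactly one key-value pair")
    ((mode, patterns),) = spec.items()
    if mode not in ("freeze", "unfreeze"):
        raise ValueError(f"unknown mode: {mode}")
    matched_is_trainable = mode == "unfreeze"
    return {
        name: matched_is_trainable
        if any(fnmatch(name, p) for p in patterns)
        else not matched_is_trainable
        for name in param_names
    }


names = ["encoder.layer.0.weight", "encoder.layer.1.weight", "head.weight"]
# unfreeze only the head; all encoder parameters stay frozen
print(resolve_trainable(names, {"unfreeze": ["head.*"]}))
```

In the real API, the resulting map would drive `requires_grad` on the model's named parameters before `fit()` is called.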
4 changes: 1 addition & 3 deletions darts/models/components/huggingface_connector.py
Expand Up @@ -12,9 +12,7 @@
from safetensors.torch import load_file

from darts.logging import get_logger, raise_log
from darts.models.forecasting.pl_forecasting_module import (
PLForecastingModule,
)
from darts.models.forecasting.pl_forecasting_module import PLForecastingModule

logger = get_logger(__name__)

Expand Down
12 changes: 12 additions & 0 deletions darts/models/forecasting/block_rnn_model.py
Expand Up @@ -454,6 +454,18 @@ def encode_year(idx):
show_warnings
whether to show warnings raised from PyTorch Lightning. Useful to detect potential issues of
your forecasting use case. Default: ``False``.
enable_finetuning
Enables model fine-tuning. Only effective if not ``None``.
If a bool, specifies whether to perform full fine-tuning / training (all parameters are updated) or to keep
all parameters frozen. If a dict, specifies which parameters to fine-tune. The dict must contain exactly one
key-value pair. It can be used to:

- Unfreeze specific parameters, while keeping everything else frozen:
``{"unfreeze": ["param.name.patterns.*"]}``
- Freeze specific parameters, while keeping everything else unfrozen:
``{"freeze": ["param.name.patterns.*"]}``

Default: ``None``.

References
----------
Expand Down
71 changes: 54 additions & 17 deletions darts/models/forecasting/chronos2_model.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@

* `Chronos-2 Foundation Model Examples
<https://unit8co.github.io/darts/examples/25-Chronos-2-examples.html>`__
* `Fine-Tuning Examples
<https://unit8co.github.io/darts/examples/27-Torch-and-Foundation-Model-Fine-Tuning-examples.html>`__
"""

import math
Expand All @@ -23,14 +25,11 @@
_Patch,
_ResidualBlock,
)
from darts.models.components.huggingface_connector import (
HuggingFaceConnector,
)
from darts.models.forecasting.foundation_model import (
FoundationModel,
)
from darts.models.components.huggingface_connector import HuggingFaceConnector
from darts.models.forecasting.foundation_model import FoundationModel
from darts.models.forecasting.pl_forecasting_module import (
PLForecastingModule,
io_processor,
)
from darts.utils.data.torch_datasets.utils import PLModuleInput, TorchTrainingSample
from darts.utils.likelihood_models.torch import QuantileRegression
Expand Down Expand Up @@ -99,7 +98,8 @@ def __init__(
all parameters required for :class:`darts.models.forecasting.pl_forecasting_module.PLForecastingModule`
base class.
"""

# for fine-tuning, model should be trained on pre-trained quantiles
enable_finetuning = kwargs.pop("enable_finetuning", False)
super().__init__(**kwargs)
self.d_model = d_model
self.d_kv = d_kv
Expand Down Expand Up @@ -192,14 +192,23 @@ def __init__(
quantiles_tensor = torch.tensor(quantiles)
self.register_buffer("quantiles", quantiles_tensor, persistent=False)

# gather indices of user-specified quantiles
# gather indices of user-specified quantiles (used at prediction time)
user_quantiles: list[float] = (
self.likelihood.quantiles
if isinstance(self.likelihood, QuantileRegression)
else [0.5]
)
self.user_quantile_indices = [quantiles.index(q) for q in user_quantiles]

# during fine-tuning, train on ALL pre-trained quantiles to preserve the
# full distribution; prediction uses only user-specified quantiles
if enable_finetuning:
self._finetuning_likelihood = QuantileRegression(quantiles)
self._finetuning_quantile_indices = list(range(self.num_quantiles))
else:
self._finetuning_likelihood = None
self._finetuning_quantile_indices = None

self.output_patch_embedding = _ResidualBlock(
in_dim=self.d_model,
h_dim=self.d_ff,
Expand Down Expand Up @@ -461,6 +470,7 @@ def _forward(
# 3. Chronos-2 uses normalized values for loss computation, while Darts uses denormalized values.
# We need to think about how best to implement Chronos-2 `RINorm` in `io_processor()` without
# breaking existing behavior, while also allowing fine-tuning with normalized loss.
@io_processor
def forward(self, x_in: PLModuleInput, *args, **kwargs) -> Any:
"""Chronos-2 model forward pass.

Expand Down Expand Up @@ -549,17 +559,26 @@ def forward(self, x_in: PLModuleInput, *args, **kwargs) -> Any:
# select only target variables
quantile_preds = quantile_preds[:, :, : self.n_targets, :]

# select only user-specified quantiles or median if deterministic
quantile_preds = quantile_preds[:, :, :, self.user_quantile_indices]
# during training (fine-tuning), output all pre-trained quantiles for loss;
# during prediction, output only user-specified quantiles
if self.training:
quantile_preds = quantile_preds[:, :, :, self._finetuning_quantile_indices]
else:
quantile_preds = quantile_preds[:, :, :, self.user_quantile_indices]

return quantile_preds

def _compute_loss(self, output, target, criterion, sample_weight):
if self.training:
# compute loss on pre-trained quantiles
return self._finetuning_likelihood.compute_loss(
output, target, sample_weight
)
else:
return super()._compute_loss(output, target, criterion, sample_weight)

class Chronos2Model(FoundationModel):
# Fine-tuning is turned off for now pending proper fine-tuning support
# and configuration.
_allows_finetuning = False

class Chronos2Model(FoundationModel):
def __init__(
self,
input_chunk_length: int,
Expand Down Expand Up @@ -607,6 +626,11 @@ def __init__(
below for details. It is recommended to call :func:`predict()` with ``predict_likelihood_parameters=True``
or ``num_samples >> 1`` to get meaningful results.

.. tip::
You can perform full or partial fine-tuning of the model by setting the ``enable_finetuning`` parameter.
Read more in the parameter description below and in the `Fine-Tuning Examples
<https://unit8co.github.io/darts/examples/27-Torch-and-Foundation-Model-Fine-Tuning-examples.html>`__.

Parameters
----------
input_chunk_length
Expand Down Expand Up @@ -635,6 +659,9 @@ def __init__(
[0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9,
0.95, 0.99].
Default: ``None``, which will make Chronos-2 deterministic (median quantile only).
When fine-tuning is enabled, the training loss is always computed on all pre-trained quantiles to
preserve the full distribution, regardless of the ``likelihood`` setting. The ``likelihood`` parameter
only affects prediction output.
hub_model_name
The model ID on HuggingFace Hub. Default: ``"amazon/chronos-2"``. Other available variants include
``"autogluon/chronos-2-small"`` and ``"autogluon/chronos-2-synth"``.
Expand Down Expand Up @@ -770,6 +797,18 @@ def encode_year(idx):
show_warnings
whether to show warnings raised from PyTorch Lightning. Useful to detect potential issues of
your forecasting use case. Default: ``False``.
enable_finetuning
Enables model fine-tuning. Only effective if not ``None``.
If a bool, specifies whether to perform full fine-tuning / training (all parameters are updated) or to keep
all parameters frozen. If a dict, specifies which parameters to fine-tune. The dict must contain exactly one
key-value pair. It can be used to:

- Unfreeze specific parameters, while keeping everything else frozen:
``{"unfreeze": ["param.name.patterns.*"]}``
- Freeze specific parameters, while keeping everything else unfrozen:
``{"freeze": ["param.name.patterns.*"]}``

Default: ``None``.

References
----------
Expand Down Expand Up @@ -810,8 +849,6 @@ def encode_year(idx):
[[1005.6928 ]]
[[1005.69617]]]

.. note::
Fine-tuning of Chronos-2 is not supported at the moment.
.. note::
Chronos-2 is licensed under the `Apache-2.0 License <https://github.com/amazon-science/chronos-forecasting/blob/main/LICENSE>`_,
copyright Amazon.com, Inc. or its affiliates. By using this model, you agree to the terms and conditions of
Expand Down Expand Up @@ -878,7 +915,7 @@ def encode_year(idx):
)

self.hf_connector = hf_connector
super().__init__(enable_finetuning=False, **kwargs)
super().__init__(**kwargs)

def _create_model(self, train_sample: TorchTrainingSample) -> PLForecastingModule:
pl_module_params = self.pl_module_params or {}
Expand Down
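The quantile handling in the diff above — fine-tuning trains on all pre-trained quantiles, while prediction selects only the user-requested ones by index — can be illustrated with a small stand-alone sketch of the pinball (quantile) loss. This is not the Chronos-2 code; the quantile levels and forecast values below are made up for illustration.

```python
# Illustrative sketch of the training/prediction quantile split above:
# fine-tuning computes the pinball (quantile) loss over ALL pre-trained
# quantiles, while prediction selects only user-requested quantiles by index.


def pinball_loss(preds, target, quantiles):
    """Mean pinball loss; preds[i] is the forecast for quantiles[i]."""
    total = 0.0
    for pred, q in zip(preds, quantiles):
        err = target - pred
        # under-prediction is weighted by q, over-prediction by (1 - q)
        total += max(q * err, (q - 1) * err)
    return total / len(quantiles)


pretrained_quantiles = [0.1, 0.5, 0.9]  # hypothetical pre-trained quantiles
preds = [0.8, 1.0, 1.3]                 # hypothetical quantile forecasts
train_loss = pinball_loss(preds, 1.1, pretrained_quantiles)

# prediction-time selection, analogous to `user_quantile_indices` above
user_quantiles = [0.5]
indices = [pretrained_quantiles.index(q) for q in user_quantiles]
user_preds = [preds[i] for i in indices]
print(user_preds)  # only the median forecast is returned
```

Training on the full quantile grid keeps the pre-trained distribution intact, which is the point of the `_finetuning_likelihood` / `user_quantile_indices` split in the diff.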
12 changes: 12 additions & 0 deletions darts/models/forecasting/dlinear.py
Expand Up @@ -411,6 +411,18 @@ def encode_year(idx):
show_warnings
whether to show warnings raised from PyTorch Lightning. Useful to detect potential issues of
your forecasting use case. Default: ``False``.
enable_finetuning
Enables model fine-tuning. Only effective if not ``None``.
If a bool, specifies whether to perform full fine-tuning / training (all parameters are updated) or to keep
all parameters frozen. If a dict, specifies which parameters to fine-tune. The dict must contain exactly one
key-value pair. It can be used to:

- Unfreeze specific parameters, while keeping everything else frozen:
``{"unfreeze": ["param.name.patterns.*"]}``
- Freeze specific parameters, while keeping everything else unfrozen:
``{"freeze": ["param.name.patterns.*"]}``

Default: ``None``.

References
----------
Expand Down
58 changes: 29 additions & 29 deletions darts/models/forecasting/foundation_model.py
Review comment (Collaborator):
For fine-tuning the foundation models, we should make sure that during training we use a QuantileRegression(quantiles) with all quantiles that the original weights were trained on.

The user should still be able to specify different quantiles when creating the model with likelihood=QuantileRegression(other_quantiles). These quantiles will only be used for prediction.

Expand Up @@ -10,22 +10,14 @@

from abc import ABC

from darts.logging import get_logger, raise_log
from darts.models.forecasting.torch_forecasting_model import (
MixedCovariatesTorchModel,
)
from darts.logging import get_logger
from darts.models.forecasting.torch_forecasting_model import MixedCovariatesTorchModel

logger = get_logger(__name__)


class FoundationModel(MixedCovariatesTorchModel, ABC):
_allows_finetuning: bool = False

def __init__(
self,
enable_finetuning: bool = False,
**kwargs,
):
def __init__(self, **kwargs):
"""Foundation Forecasting Model with PyTorch Lightning backend.

This class is meant to be inherited to create a new foundation forecasting model.
Expand All @@ -46,11 +38,14 @@ def __init__(
instantiate a :class:`HuggingFaceConnector` and use its methods to load the model configuration
inside :func:`__init__()` and to load the model weights inside :func:`_create_model()`.


.. tip::
You can perform full or partial fine-tuning of the model by setting the ``enable_finetuning`` parameter.
Read more in the parameter description below and in the `Fine-Tuning Examples
<https://unit8co.github.io/darts/examples/27-Torch-and-Foundation-Model-Fine-Tuning-examples.html>`__.

Parameters
----------
enable_finetuning
Whether to enable fine-tuning of the foundation model. If set to ``True``, calling :func:`fit()` will
update the model weights. Default: ``False``.
batch_size
Number of time series (input and output sequences) used in each fine-tuning pass. Default: ``32``.
n_epochs
Expand Down Expand Up @@ -156,22 +151,33 @@ def encode_year(idx):
show_warnings
whether to show warnings raised from PyTorch Lightning. Useful to detect potential issues of
your forecasting use case. Default: ``False``.
enable_finetuning
Enables model fine-tuning. Only effective if not ``None``.
If a bool, specifies whether to perform full fine-tuning / training (all parameters are updated) or to keep
all parameters frozen. If a dict, specifies which parameters to fine-tune. The dict must contain exactly one
key-value pair. It can be used to:

- Unfreeze specific parameters, while keeping everything else frozen:
``{"unfreeze": ["param.name.patterns.*"]}``
- Freeze specific parameters, while keeping everything else unfrozen:
``{"freeze": ["param.name.patterns.*"]}``

Default: ``None``.
"""
# Set default fine-tuning to False for foundation models
if "enable_finetuning" not in self.model_params:
self.model_params["enable_finetuning"] = False

# initialize `TorchForecastingModel` base class
super().__init__(**self._extract_torch_model_params(**self.model_params))

# extract pytorch lightning module kwargs
self.pl_module_params = self._extract_pl_module_params(**self.model_params)

# validate and set fine-tuning flag
if enable_finetuning and not self._allows_finetuning:
raise_log(
ValueError(
f"Fine-tuning is not supported for {self.__class__.__name__}."
" Please set `enable_finetuning=False`."
),
logger,
)
# pass fine-tuning flag to the PLModule so it can set up training-specific
# quantile handling (separate from prediction-time likelihood)
if self.enable_finetuning:
self.pl_module_params["enable_finetuning"] = True

use_reversible_instance_norm: bool | dict = self.pl_module_params.get(
"use_reversible_instance_norm", False
Expand All @@ -193,9 +199,3 @@ def encode_year(idx):
self.pl_module_params["use_reversible_instance_norm"] = (
use_reversible_instance_norm
)

self._enable_finetuning = enable_finetuning

@property
def _requires_training(self) -> bool:
return self._enable_finetuning
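The default-injection step in the `FoundationModel` diff above — fall back to `enable_finetuning=False` when the user passes nothing, and forward a truthy value to the Lightning module's parameters — boils down to a `setdefault` plus a flag hand-off. A minimal stand-alone sketch with a hypothetical helper name:

```python
# Minimal sketch of the default handling shown above: foundation models
# default `enable_finetuning` to False (weights stay frozen), and any
# truthy value (bool True or a freeze/unfreeze dict) is forwarded to the
# PLModule params so it can set up training-specific quantile handling.


def prepare_params(model_params):
    """Hypothetical helper mirroring the __init__ logic in the diff."""
    params = dict(model_params)  # do not mutate the caller's dict
    params.setdefault("enable_finetuning", False)
    pl_module_params = {}
    if params["enable_finetuning"]:
        pl_module_params["enable_finetuning"] = True
    return params, pl_module_params


print(prepare_params({}))  # fine-tuning off by default
print(prepare_params({"enable_finetuning": {"unfreeze": ["head.*"]}}))
```

Note how this replaces the removed `_requires_training` property: zero-shot use (the default) skips training entirely, while any non-falsy `enable_finetuning` value turns `fit()` into an actual weight update.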
12 changes: 12 additions & 0 deletions darts/models/forecasting/nbeats.py
Expand Up @@ -745,6 +745,18 @@ def encode_year(idx):
show_warnings
whether to show warnings raised from PyTorch Lightning. Useful to detect potential issues of
your forecasting use case. Default: ``False``.
enable_finetuning
Enables model fine-tuning. Only effective if not ``None``.
If a bool, specifies whether to perform full fine-tuning / training (all parameters are updated) or to keep
all parameters frozen. If a dict, specifies which parameters to fine-tune. The dict must contain exactly one
key-value pair. It can be used to:

- Unfreeze specific parameters, while keeping everything else frozen:
``{"unfreeze": ["param.name.patterns.*"]}``
- Freeze specific parameters, while keeping everything else unfrozen:
``{"freeze": ["param.name.patterns.*"]}``

Default: ``None``.

References
----------
Expand Down
12 changes: 12 additions & 0 deletions darts/models/forecasting/nf_model.py
Expand Up @@ -571,6 +571,18 @@ def encode_year(idx):
show_warnings
whether to show warnings raised from PyTorch Lightning. Useful to detect potential issues of
your forecasting use case. Default: ``False``.
enable_finetuning
Enables model fine-tuning. Only effective if not ``None``.
If a bool, specifies whether to perform full fine-tuning / training (all parameters are updated) or to keep
all parameters frozen. If a dict, specifies which parameters to fine-tune. The dict must contain exactly one
key-value pair. It can be used to:

- Unfreeze specific parameters, while keeping everything else frozen:
``{"unfreeze": ["param.name.patterns.*"]}``
- Freeze specific parameters, while keeping everything else unfrozen:
``{"freeze": ["param.name.patterns.*"]}``

Default: ``None``.

References
----------
Expand Down