unit8co · dennisbader · Feb 27, 2026 · Feb 23, 2026 · Feb 23, 2026 · Feb 24, 2026
@@ -14,10 +14,12 @@ but cannot always guarantee backwards compatibility. Changes that may **break co
 - 🚀🚀 Added new forecasting model `NeuralForecastModel` to convert any of the 30+ NeuralForecast base models into a Darts `TorchForecastingModel`. This includes models such as NBEATSx, PatchTST, TimeXer, KAN, and many more. Like all Darts torch models, it supports univariate, multivariate, probabilistic forecasting, optimized backtesting and more. Depending on the base model, it also supports past, future, and static covariates. [#3002](https://github.com/unit8co/darts/pull/3002) by [Zhihao Dai](https://github.com/daidahao)
   - Check out our new [NeuralForecastModel Notebook](https://unit8co.github.io/darts/examples/26-NeuralForecast-examples.html) for detailed examples. [#3026](https://github.com/unit8co/darts/pull/3026) by [Dennis Bader](https://github.com/dennisbader).
 - Created `darts.typing` to collect typical type annotations in one place. Introduced `TimeIndex` & `TimeSeriesLike` type aliases for improved readability & maintainability of the code. Common type annotations can be added to this file in the future. [#3021](https://github.com/unit8co/darts/pull/3021) by [Michel Zeller](https://github.com/mizeller)
+- More fine-grained control over Reversible Instance Normalization for all torch models. Apart from the boolean trigger, parameter `use_reversible_instance_norm` now also supports setting the `RINorm` hyperparameters as a dictionary. [#3029](https://github.com/unit8co/darts/pull/3029) by [Zhihao Dai](https://github.com/daidahao).
 
 **Fixed**
 
 - Updated the restrictive type hint for the timezone parameter `tz` to `Any`. This allows the use of more timezone definitions supported by Pandas [tz_convert](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DatetimeIndex.tz_convert.html). [#3015](https://github.com/unit8co/darts/pull/3015) by [Moritz Waldleben](https://github.com/mwaldleben).
+- Disallowed affine `RINorm` for foundation models: `use_reversible_instance_norm=True` or a dictionary that does not set `"affine": False` is now overridden to use `"affine": False` (with a warning) to prevent checkpoint loading errors due to incompatible weights. [#3029](https://github.com/unit8co/darts/pull/3029) by [Zhihao Dai](https://github.com/daidahao).
 
 - Fixed all instances of [invalid-parameter-default](https://docs.astral.sh/ty/reference/rules/#invalid-parameter-default) errors. Improves type checker's ability to accurately reason about the code. [#3027](https://github.com/unit8co/darts/pull/3027) by [Michel Zeller](https://github.com/mizeller)
 

@@ -348,7 +348,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [1]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -667,9 +667,6 @@ def __init__(
             to using a constant learning rate. Default: ``None``.
         lr_scheduler_kwargs
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
-        use_reversible_instance_norm
-            Whether to use reversible instance normalization `RINorm` against distribution shift. Ignored by
-            Chronos-2 as it has its own `RINorm` implementation.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -305,7 +305,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -173,6 +173,27 @@ def encode_year(idx):
                 logger,
             )
 
+        use_reversible_instance_norm: bool | dict = self.pl_module_params.get(
+            "use_reversible_instance_norm", False
+        )
+        if use_reversible_instance_norm is True or (
+            isinstance(use_reversible_instance_norm, dict)
+            and use_reversible_instance_norm.get("affine", True)
+        ):
+            if use_reversible_instance_norm is True:
+                use_reversible_instance_norm = dict(affine=False)
+            else:
+                use_reversible_instance_norm["affine"] = False
+            logger.warning(
+                f"By default, Reversible Instance Normalization (RINorm) in Darts inserts affine transformation "
+                f"weights, which do not exist in foundation model checkpoints. To prevent incompatible model "
+                f"weights when loading checkpoints, `use_reversible_instance_norm` is overridden to "
+                f"`{use_reversible_instance_norm}`."
+            )
+            self.pl_module_params["use_reversible_instance_norm"] = (
+                use_reversible_instance_norm
+            )
+
         self._enable_finetuning = enable_finetuning
 
     @property
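
The override rule added in this hunk can be sketched as a small standalone function. This is a minimal illustration, not the Darts implementation: `resolve_foundation_rin` is a hypothetical name, and unlike the hunk above (which updates the dictionary in place) the sketch copies the user's dictionary before changing it.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_foundation_rin(setting):
    """Return (resolved_setting, overridden) for a foundation model.

    Hypothetical helper that mirrors the override rule in the hunk above.
    """
    # affine weights are requested by `True` or by a dict that does not
    # explicitly set `affine=False`
    wants_affine = setting is True or (
        isinstance(setting, dict) and setting.get("affine", True)
    )
    if not wants_affine:
        # `False`, or a dict that already disables affine: keep unchanged
        return setting, False

    # copy so the caller's dictionary is not mutated (a choice of this sketch)
    resolved = {} if setting is True else dict(setting)
    resolved["affine"] = False
    logger.warning(
        "`use_reversible_instance_norm` is overridden to `%s`.", resolved
    )
    return resolved, True


print(resolve_foundation_rin(True))  # ({'affine': False}, True)
```

Under these assumptions, the results match the `user_rin` / `expected_rin` pairs used in `test_rinorm` further down in this PR.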

@@ -639,7 +639,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -465,7 +465,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -573,7 +573,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -273,7 +273,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -94,7 +94,7 @@ def __init__(
         optimizer_kwargs: dict | None = None,
         lr_scheduler_cls: torch.optim.lr_scheduler._LRScheduler | None = None,
         lr_scheduler_kwargs: dict | None = None,
-        use_reversible_instance_norm: bool = False,
+        use_reversible_instance_norm: bool | dict = False,
     ) -> None:
         """
         PyTorch Lightning-based Forecasting Module.
@@ -150,7 +150,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [1]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
 
         References
         ----------
@@ -206,8 +208,10 @@ def __init__(
 
         # reversible instance norm
         self.use_reversible_instance_norm = use_reversible_instance_norm
-        if use_reversible_instance_norm:
+        if use_reversible_instance_norm is True:
             self.rin = RINorm(input_dim=self.n_targets)
+        elif isinstance(use_reversible_instance_norm, dict):
+            self.rin = RINorm(input_dim=self.n_targets, **use_reversible_instance_norm)
         else:
             self.rin = None
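
A note on the dispatch above: it now branches on type (`is True`, `isinstance(..., dict)`) rather than on truthiness. The sketch below illustrates the consequences with a stand-in class. `RINormStub` and `build_rin` are hypothetical names used only for illustration; the `eps=1e-5` default is an assumption, while `affine=True` as the default follows the warning message added elsewhere in this PR.

```python
from dataclasses import dataclass


@dataclass
class RINormStub:
    """Hypothetical stand-in for RINorm that only records its hyperparameters."""

    input_dim: int
    eps: float = 1e-5  # assumed default, not confirmed by this diff
    affine: bool = True  # default implied by the warning message in this PR


def build_rin(setting, n_targets):
    """Mirror the `True` / dict / otherwise dispatch from the hunk above."""
    if setting is True:
        # boolean trigger: default hyperparameters
        return RINormStub(input_dim=n_targets)
    if isinstance(setting, dict):
        # dictionary: forwarded as keyword arguments
        return RINormStub(input_dim=n_targets, **setting)
    # anything else (including `False`) disables normalization
    return None


print(build_rin({"affine": False}, n_targets=2))
print(build_rin({}, n_targets=2))  # empty dict still builds a layer with defaults
print(build_rin(False, n_targets=2))  # None
```

In this sketch, an empty dictionary enables normalization with default hyperparameters even though it is falsy, and a key that the constructor does not accept raises a `TypeError`. Whether the real `RINorm` rejects unknown keys the same way depends on its signature, which this diff does not show.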
 

@@ -338,7 +338,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -783,8 +783,10 @@ def __init__(
         lr_scheduler_kwargs
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
-            Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [3]_.
-            It is only applied to the features of the target series and not the covariates.
+            Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -478,7 +478,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -399,9 +399,6 @@ def __init__(
             to using a constant learning rate. Default: ``None``.
         lr_scheduler_kwargs
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
-        use_reversible_instance_norm
-            Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [3]_.
-            It is only applied to the features of the target series and not the covariates.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs
@@ -512,8 +509,6 @@ def encode_year(idx):
                 arXiv https://arxiv.org/abs/2310.10688.
         .. [2] "A decoder-only foundation model for time-series forecasting", 2024. Google Research.
                 https://research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting/
-        .. [3] T. Kim et al. "Reversible Instance Normalization for Accurate Time-Series Forecasting against
-                Distribution Shift", https://openreview.net/forum?id=cGDAkQo1C0p
 
         Examples
         --------

@@ -426,7 +426,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [3]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -615,7 +615,9 @@ def __init__(
             Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: ``None``.
         use_reversible_instance_norm
             Whether to use reversible instance normalization `RINorm` against distribution shift as shown in [2]_.
-            It is only applied to the features of the target series and not the covariates.
+            It is only applied to the features of the target series and not the covariates. If ``True``,
+            applies ``RINorm`` with default hyperparameters. If a dictionary, defines the hyperparameters to construct
+            the ``RINorm``. Supported parameters are ``{"affine": bool, "eps": float}``. Default: ``False``.
         batch_size
             Number of time series (input and output sequences) used in each training pass. Default: ``32``.
         n_epochs

@@ -101,6 +101,52 @@ def test_invalid_params(self, mock_method):
                 **tfm_kwargs,
             )
 
+    @patch(
+        "darts.models.components.huggingface_connector.hf_hub_download",
+        side_effect=mock_download,
+    )
+    @pytest.mark.parametrize(
+        "user_rin, expected_rin",
+        [
+            (True, {"affine": False}),
+            ({"eps": 1e-7}, {"affine": False, "eps": 1e-7}),
+            ({"affine": True}, {"affine": False}),
+            ({"eps": 1e-9, "affine": True}, {"affine": False, "eps": 1e-9}),
+            ({"affine": False}, {"affine": False}),
+            ({"eps": 1e-8, "affine": False}, {"eps": 1e-8, "affine": False}),
+            (False, False),
+        ],
+    )
+    def test_rinorm(self, mock_method, caplog, user_rin, expected_rin):
+        """Checks that RINorm works, and that affine=True is overridden to affine=False."""
+        # `affine=True` is overridden to `affine=False`
+        affine_override = False
+        if user_rin is True or (
+            isinstance(user_rin, dict) and user_rin.get("affine", True)
+        ):
+            affine_override = True
+
+        # `use_reversible_instance_norm` is overridden to `use_reversible_instance_norm={"affine": False}`
+        with caplog.at_level(logging.WARNING):
+            model = Chronos2Model(
+                input_chunk_length=12,
+                output_chunk_length=6,
+                use_reversible_instance_norm=user_rin,
+                **tfm_kwargs,
+            )
+
+        assert (
+            "`use_reversible_instance_norm` is overridden to" in caplog.text
+        ) is affine_override
+        # RINorm affine transformation is disabled
+        assert model.pl_module_params["use_reversible_instance_norm"] == expected_rin
+        model.fit(series=self.series)
+
+        if user_rin:
+            assert model.model.rin.affine is False
+        else:
+            assert model.model.rin is None
+
     @patch(
         "darts.models.components.huggingface_connector.hf_hub_download",
         side_effect=mock_download,