Merged
21 commits
cfd83dd
Bump codecov/codecov-action from 5 to 6
dependabot[bot] Mar 30, 2026
916a0e3
Bump pypa/gh-action-pypi-publish from 1.13.0 to 1.14.0
dependabot[bot] Apr 13, 2026
85afeb7
Bump astral-sh/setup-uv from 7.6.0 to 8.1.0
dependabot[bot] Apr 21, 2026
b205380
[pre-commit.ci] pre-commit autoupdate
pre-commit-ci[bot] May 11, 2026
f692a02
Upgrading libraries, ensuring tests pass and any new mypy issues are …
emersodb May 13, 2026
54fe843
Monkey patch and bug fixes for the smoke tests
emersodb May 13, 2026
822a7c3
Merge branch 'dbe/upgrade_libraries' into pre-commit-ci-update-config
emersodb May 13, 2026
00f85dc
Merge pull request #531 from VectorInstitute/pre-commit-ci-update-config
emersodb May 13, 2026
8ee8fc1
Merge branch 'dbe/upgrade_libraries' into dependabot/github_actions/c…
emersodb May 13, 2026
2f70a52
Merge pull request #534 from VectorInstitute/dependabot/github_action…
emersodb May 13, 2026
a91a7ac
Merge branch 'dbe/upgrade_libraries' into dependabot/github_actions/p…
emersodb May 13, 2026
1908cf6
Merge pull request #539 from VectorInstitute/dependabot/github_action…
emersodb May 13, 2026
2bf7b60
Merge branch 'dbe/upgrade_libraries' into dependabot/github_actions/a…
emersodb May 13, 2026
be8686b
Merge pull request #542 from VectorInstitute/dependabot/github_action…
emersodb May 13, 2026
2ab191a
More upgrades for vulnerabilities
emersodb May 13, 2026
6cfcb41
Not upgrading transformers yet, as their typing seems really messed up
emersodb May 13, 2026
e72a0e6
Fixing the APFL errors but investigating why they are different
emersodb May 13, 2026
984b08a
Some apfl smoketest fixes
emersodb May 14, 2026
280e246
Fixing a few of the GPFL metrics too
emersodb May 14, 2026
bc06c19
Fixing a few of the GPFL metrics too
emersodb May 14, 2026
8bc82ee
Fixing a few of the GPFL metrics too
emersodb May 14, 2026
2 changes: 1 addition & 1 deletion .github/workflows/docs.yml
@@ -47,7 +47,7 @@ jobs:
   uses: actions/checkout@v6

 - name: Install uv
-  uses: astral-sh/setup-uv@v7.6.0
+  uses: astral-sh/setup-uv@v8.1.0
   with:
     version: "0.9.11"
     enable-cache: true
4 changes: 2 additions & 2 deletions .github/workflows/nnunet_smoke_tests.yaml
@@ -23,7 +23,7 @@ jobs:
   uses: actions/checkout@v6

 - name: Install uv
-  uses: astral-sh/setup-uv@v7.6.0
+  uses: astral-sh/setup-uv@v8.1.0
   with:
     version: "0.9.11"
     enable-cache: true
@@ -53,7 +53,7 @@ jobs:
   run: uv run pytest --test-group-count=4 --test-group=${{ matrix.group }} -v --cov fl4health --cov-report=xml tests/smoke_tests/test_nnunet_smoke_tests.py

 - name: Upload coverage to Codecov
-  uses: codecov/codecov-action@v5
+  uses: codecov/codecov-action@v6
   with:
     token: ${{ secrets.CODECOV_TOKEN }}
     slug: VectorInstitute/FL4Health
4 changes: 2 additions & 2 deletions .github/workflows/publish_and_release.yml
@@ -19,7 +19,7 @@ jobs:
   uses: actions/checkout@v6

 - name: Install uv
-  uses: astral-sh/setup-uv@v7.6.0
+  uses: astral-sh/setup-uv@v8.1.0
   with:
     version: "0.9.11"
     enable-cache: true
@@ -33,7 +33,7 @@ jobs:
   run: uv build

 - name: Publish package
-  uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e
+  uses: pypa/gh-action-pypi-publish@cef221092ed1bacb1cc03d23a2d87d1d172e277b
   with:
     user: __token__
     password: ${{ secrets.PYPI_API_TOKEN }}
4 changes: 2 additions & 2 deletions .github/workflows/standard_smoke_tests.yaml
@@ -23,7 +23,7 @@ jobs:
   uses: actions/checkout@v6

 - name: Install uv
-  uses: astral-sh/setup-uv@v7.6.0
+  uses: astral-sh/setup-uv@v8.1.0
   with:
     version: "0.9.11"
     enable-cache: true
@@ -53,7 +53,7 @@ jobs:
   run: uv run pytest --test-group-count=4 --test-group=${{ matrix.group }} -v --cov fl4health --cov-report=xml tests/smoke_tests/test_standard_smoke_tests.py

 - name: Upload coverage to Codecov
-  uses: codecov/codecov-action@v5
+  uses: codecov/codecov-action@v6
   with:
     token: ${{ secrets.CODECOV_TOKEN }}
     slug: VectorInstitute/FL4Health
32 changes: 8 additions & 24 deletions .github/workflows/static_code_checks.yaml
@@ -20,7 +20,7 @@ jobs:
   uses: actions/checkout@v6

 - name: Install uv
-  uses: astral-sh/setup-uv@v7.6.0
+  uses: astral-sh/setup-uv@v8.1.0
   with:
     version: "0.9.11"
     enable-cache: true
@@ -47,31 +47,15 @@ jobs:
   uses: pypa/gh-action-pip-audit@v1.1.0
   with:
     virtual-environment: .venv/
-    # GHSA-3749-ghw9-m3mg and GHSA-887c-mr87-cxwp are pytorch vulnerabilities that require 2.7 and 2.8 but we're
-    # pinning to 2.6.0 for now.
-    # CVE-2025-53000 NBConvert issue, no fix yet.
-    # CVE-2026-21851 is a MonAI issue with no fix at the moment.
-    # CVE-2024-55459, CVE-2025-9906, CVE-2025-12058, CVE-2025-12060 are keras vulnerabilities that require
-    # keras>=3.12.0 which needs tensorflow>=2.16, but we're pinning to tensorflow 2.15 due to tensorflow-io
-    # compatibility constraints.
-    # CVE-2026-0994 is a protobuff vulnerability without a fix yet.
-    # CVE-2026-26007 is a cryptography vulnerability that requires cryptography>=46.0.5, but flwr (flower)
-    # requires cryptography<45.0.0, blocking the upgrade.
-    # GHSA-rf74-v2fm-23pw, CVE-2026-33230, CVE-2026-33231: NLTK vulnerabilities without a fix yet.
+    # CVE-2026-26007, CVE-2026-34073 are cryptography vulnerabilities that require cryptography>=46.0.5, but
+    # flwr (flower) requires cryptography<45.0.0, blocking the upgrade.
+    # CVE-2026-1839 is a vulnerability in HF transformers, but they've messed up their typing in 5+ so we're
+    # deferring this.
     ignore-vulns: |
-      GHSA-3749-ghw9-m3mg
-      GHSA-887c-mr87-cxwp
-      CVE-2025-53000
-      CVE-2026-21851
-      CVE-2024-55459
-      CVE-2025-9906
-      CVE-2025-12058
-      CVE-2025-12060
-      CVE-2026-0994
       CVE-2026-26007
-      GHSA-rf74-v2fm-23pw
-      CVE-2026-33230
-      CVE-2026-33231
+      CVE-2026-34073

Collaborator

That's good to see! 👏

+      CVE-2026-1839


# Deleting some temporary files and useless folders to free up space
# Deleting /usr/share/dotnet should clear ~4GB of space.
4 changes: 2 additions & 2 deletions .github/workflows/tests.yaml
@@ -20,7 +20,7 @@ jobs:
   uses: actions/checkout@v6

 - name: Install uv
-  uses: astral-sh/setup-uv@v7.6.0
+  uses: astral-sh/setup-uv@v8.1.0
   with:
     version: "0.9.11"
     enable-cache: true
@@ -48,7 +48,7 @@ jobs:
   run: uv run pytest -m "not smoketest" -v --cov fl4health --cov-report=xml tests

 - name: Upload coverage to Codecov
-  uses: codecov/codecov-action@v5
+  uses: codecov/codecov-action@v6
   with:
     token: ${{ secrets.CODECOV_TOKEN }}
     slug: VectorInstitute/FL4Health
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -1,6 +1,6 @@
 repos:
   - repo: https://github.com/astral-sh/uv-pre-commit
-    rev: 0.10.12
+    rev: 0.11.13
     hooks:
       - id: uv-lock

@@ -26,7 +26,7 @@ repos:
       - id: check-toml

   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: 'v0.15.7'
+    rev: 'v0.15.12'
     hooks:
       - id: ruff-check
         args: [--fix, --exit-non-zero-on-fix]
@@ -35,7 +35,7 @@ repos:
         types_or: [python, jupyter]

   - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.19.1
+    rev: v2.0.0
     hooks:
       - id: mypy
         name: mypy
13 changes: 10 additions & 3 deletions fl4health/clients/flexible/nnunet.py
@@ -34,6 +34,7 @@
 from fl4health.utils.nnunet_utils import (
     NNUNET_DEFAULT_NP,
     NNUNET_N_SPATIAL_DIMS,
+    LocalPolyLRScheduler,
     Module2LossWrapper,
     NnunetConfig,
     NnUNetDataLoaderWrapper,
@@ -56,6 +57,7 @@
     # silences a bunch of deprecation warnings related to scipy.ndimage
     # Raised an issue with nnunet. https://github.com/MIC-DKFZ/nnUNet/issues/2370
     warnings.filterwarnings("ignore", category=DeprecationWarning)
+    import nnunetv2.training.nnUNetTrainer.nnUNetTrainer as nnUNetTrainerModule
     from batchgenerators.utilities.file_and_folder_operations import (
         load_json,
         save_json,
@@ -72,7 +74,6 @@
     from nnunetv2.training.dataloading.utils import unpack_dataset
     from nnunetv2.training.loss.deep_supervision import DeepSupervisionWrapper
     from nnunetv2.training.lr_scheduler.polylr import PolyLRScheduler
-    from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer
     from nnunetv2.utilities.dataset_name_id_conversion import convert_id_to_dataset_name

 # grpcio currently has a log spamming bug that seems to be triggered by multithreading/multiprocessing
@@ -100,7 +101,7 @@ def __init__(
         checkpoint_and_state_module: ClientCheckpointAndStateModule | None = None,
         reporters: Sequence[BaseReporter] | None = None,
         client_name: str | None = None,
-        nnunet_trainer_class: type[nnUNetTrainer] = nnUNetTrainer,
+        nnunet_trainer_class: type[nnUNetTrainerModule.nnUNetTrainer] = nnUNetTrainerModule.nnUNetTrainer,
         nnunet_trainer_class_kwargs: dict[str, Any] | None = None,
     ) -> None:
         """
@@ -208,7 +209,7 @@ def __init__(
         # nnunet specific attributes to be initialized in setup_client
         self.nnunet_trainer_class = nnunet_trainer_class
         self.nnunet_trainer_class_kwargs = nnunet_trainer_class_kwargs or {}
-        self.nnunet_trainer: nnUNetTrainer
+        self.nnunet_trainer: nnUNetTrainerModule.nnUNetTrainer
         self.nnunet_config: NnunetConfig
         self.plans: dict[str, Any] | None = None
         self.steps_per_round: int # N steps per server round
@@ -620,6 +621,12 @@ def setup_client(self, config: Config) -> None:
                 device=self.device,
                 **self.nnunet_trainer_class_kwargs,
             )
+
+            # NOTE: Monkey patch to force the nnunet_trainer to use our version of the PolyLRScheduler instead of the
+            # nnUNet version. nnUNet hasn't updated its scheduler to the new torch signature and does not appear to
+            # intend to do so. The fix is minimal, so we patch it here.
+            nnUNetTrainerModule.PolyLRScheduler = LocalPolyLRScheduler
+
             # nnunet_trainer initialization
             self.nnunet_trainer.initialize()
             # This is done by nnunet_trainer in self.on_train_start, we
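For reviewers wondering why the patch assigns to an attribute of the imported module, and why the `nnUNetTrainer` import was switched to a module alias, here is a minimal sketch of the mechanism. Everything in it is a hypothetical stand-in (`vendor`, `Scheduler`, `build`, `LocalScheduler`); it is not nnUNet or FL4Health code, and it only assumes that the trainer resolves `PolyLRScheduler` from its own module globals when it runs.

```python
import types

# Hypothetical stand-in for a third-party module (not nnUNet itself).
vendor = types.ModuleType("vendor")
exec(
    "class Scheduler:\n"
    "    name = 'vendor'\n"
    "\n"
    "def build():\n"
    "    # 'Scheduler' is looked up in the module's globals at call time.\n"
    "    return Scheduler()\n",
    vendor.__dict__,
)


class LocalScheduler:
    """Hypothetical replacement with a corrected signature."""

    name = "local"


# Equivalent of `from vendor import Scheduler`: a separate reference to the original class.
imported_directly = vendor.Scheduler

# The monkey patch: rebind the name inside the module's namespace.
vendor.Scheduler = LocalScheduler

patched = vendor.build()
print(patched.name)            # local
print(imported_directly.name)  # vendor
```

Code inside the patched module picks up the replacement, while names bound earlier with `from ... import ...` keep pointing at the original class. Note that a patch like this is process-wide, so it affects every caller of the module, not just this client.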
13 changes: 10 additions & 3 deletions fl4health/clients/nnunet_client.py
@@ -33,6 +33,7 @@
 from fl4health.utils.nnunet_utils import (
     NNUNET_DEFAULT_NP,
     NNUNET_N_SPATIAL_DIMS,
+    LocalPolyLRScheduler,
     Module2LossWrapper,
     NnunetConfig,
     NnUNetDataLoaderWrapper,
@@ -50,6 +51,7 @@
     # silences a bunch of deprecation warnings related to scipy.ndimage
     # Raised an issue with nnunet. https://github.com/MIC-DKFZ/nnUNet/issues/2370
     warnings.filterwarnings("ignore", category=DeprecationWarning)
+    import nnunetv2.training.nnUNetTrainer.nnUNetTrainer as nnUNetTrainerModule
     from batchgenerators.utilities.file_and_folder_operations import load_json, save_json
     from nnunetv2.experiment_planning.experiment_planners.default_experiment_planner import ExperimentPlanner
     from nnunetv2.experiment_planning.plan_and_preprocess_api import extract_fingerprints, preprocess_dataset
@@ -58,7 +60,6 @@
     from nnunetv2.training.dataloading.utils import unpack_dataset
     from nnunetv2.training.loss.deep_supervision import DeepSupervisionWrapper
     from nnunetv2.training.lr_scheduler.polylr import PolyLRScheduler
-    from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer
     from nnunetv2.utilities.dataset_name_id_conversion import convert_id_to_dataset_name

 # grpcio currently has a log spamming bug that seems to be triggered by multithreading/multiprocessing
@@ -86,7 +87,7 @@ def __init__(
         checkpoint_and_state_module: ClientCheckpointAndStateModule | None = None,
         reporters: Sequence[BaseReporter] | None = None,
         client_name: str | None = None,
-        nnunet_trainer_class: type[nnUNetTrainer] = nnUNetTrainer,
+        nnunet_trainer_class: type[nnUNetTrainerModule.nnUNetTrainer] = nnUNetTrainerModule.nnUNetTrainer,
         nnunet_trainer_class_kwargs: dict[str, Any] | None = None,
     ) -> None:
         """
@@ -191,7 +192,7 @@ def __init__(
         self.nnunet_trainer_class_kwargs = (
             nnunet_trainer_class_kwargs if nnunet_trainer_class_kwargs is not None else {}
         )
-        self.nnunet_trainer: nnUNetTrainer
+        self.nnunet_trainer: nnUNetTrainerModule.nnUNetTrainer
         self.nnunet_config: NnunetConfig
         self.plans: dict[str, Any] | None = None
         self.steps_per_round: int # N steps per server round
@@ -585,6 +586,12 @@ def setup_client(self, config: Config) -> None:
                 device=self.device,
                 **self.nnunet_trainer_class_kwargs,
             )
+
+            # NOTE: Monkey patch to force the nnunet_trainer to use our version of the PolyLRScheduler instead of the
+            # nnUNet version. nnUNet hasn't updated its scheduler to the new torch signature and does not appear to
+            # intend to do so. The fix is minimal, so we patch it here.
+            nnUNetTrainerModule.PolyLRScheduler = LocalPolyLRScheduler
+
             # nnunet_trainer initialization
             self.nnunet_trainer.initialize()
             # This is done by nnunet_trainer in self.on_train_start, we
7 changes: 7 additions & 0 deletions fl4health/mixins/personalized/ditto.py
@@ -124,6 +124,13 @@ def _copy_optimizer_with_new_params(self: DittoPersonalizedProtocol, original_op

         optimizer_kwargs = {k: v for k, v in param_group.items() if k not in ("params", "initial_lr")}
         assert self.global_model is not None
+
+        # NOTE: This is a small workaround for torch back-compatibility in AdamW. Torch injects a key (that isn't part
+        # of the class signature) into the param groups called "decoupled_weight_decay" which causes an error in the
+        # kwargs below. See: https://github.com/pytorch/pytorch/blob/v2.11.0/torch/optim/adamw.py#L57
+        if optim_class == torch.optim.AdamW:
+            optimizer_kwargs.pop("decoupled_weight_decay")
+
Collaborator

Weird that we need a workaround for their workaround. What is the case where this happens?

Collaborator Author

Basically, they got rid of this in the constructor, but if you follow the link, you'll see they "inject" it into the dictionary. To make this mixin work, Andrei was reconstructing the optimizer arguments to create a new one from the old one. The injection means that there is a key corresponding to the argument that used to be called decoupled_weight_decay. So it gets into the kwargs which are passed to the constructor which no longer wants decoupled_weight_decay.

The way Andrei was doing this is fairly brittle, but I don't want to mess with it much in this PR.

Collaborator

Yeah, sounds a little hacky, but I agree with you about not messing with it here.


         global_optimizer = optim_class(self.global_model.parameters(), **optimizer_kwargs)

         # maintain initial_lr for schedulers
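Since the thread above calls the kwargs reconstruction brittle, here is one possible, more defensive shape for it, offered only as a sketch and not as what this PR does. Instead of special-casing one optimizer, it keeps only the param-group entries that the target constructor accepts by name. `ToyAdam`, `ToyAdamW`, and `constructor_kwargs` are hypothetical names that merely imitate the situation described in the discussion; no torch classes are used.

```python
import inspect
from typing import Any


class ToyAdam:
    """Hypothetical optimizer whose param groups record every constructor default."""

    def __init__(self, params, lr=1e-3, weight_decay=0.0, decoupled_weight_decay=False):
        self.param_groups = [
            {
                "params": list(params),
                "lr": lr,
                "weight_decay": weight_decay,
                "decoupled_weight_decay": decoupled_weight_decay,
            }
        ]


class ToyAdamW(ToyAdam):
    """Hypothetical subclass that no longer accepts decoupled_weight_decay itself."""

    def __init__(self, params, lr=1e-3, weight_decay=1e-2):
        super().__init__(params, lr=lr, weight_decay=weight_decay, decoupled_weight_decay=True)


def constructor_kwargs(optim_class: type, param_group: dict[str, Any]) -> dict[str, Any]:
    """Keep only the param-group entries that the class constructor accepts by name."""
    accepted = set(inspect.signature(optim_class).parameters) - {"params"}
    return {key: value for key, value in param_group.items() if key in accepted}


original = ToyAdamW([1.0, 2.0], lr=0.1)
kwargs = constructor_kwargs(ToyAdamW, original.param_groups[0])
rebuilt = ToyAdamW([3.0], **kwargs)  # no TypeError: the injected key was filtered out
print(kwargs)  # {'lr': 0.1, 'weight_decay': 0.01}
```

A caveat with this approach: a constructor that takes `**kwargs` would hide its real parameters from `inspect.signature`, so the filter would drop everything. A smaller change in the same spirit would be to give the existing `pop` a default so it does not raise when the key is absent.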
14 changes: 8 additions & 6 deletions fl4health/model_bases/masked_layers/masked_conv.py
@@ -1,5 +1,7 @@
 from __future__ import annotations

+from typing import Literal
+
 import torch
 import torch.nn.functional as F
 from torch import Tensor, nn
@@ -21,7 +23,7 @@ def __init__(
         dilation: _size_1_t = 1,
         groups: int = 1,
         bias: bool = True,
-        padding_mode: str = "zeros",
+        padding_mode: Literal["zeros", "reflect", "replicate", "circular"] = "zeros",
         device: torch.device | None = None,
         dtype: torch.dtype | None = None,
     ) -> None:
@@ -150,7 +152,7 @@ def __init__(
         dilation: _size_2_t = 1,
         groups: int = 1,
         bias: bool = True,
-        padding_mode: str = "zeros",
+        padding_mode: Literal["zeros", "reflect", "replicate", "circular"] = "zeros",
         device: torch.device | None = None,
         dtype: torch.dtype | None = None,
     ) -> None:
@@ -276,7 +278,7 @@ def __init__(
         dilation: _size_3_t = 1,
         groups: int = 1,
         bias: bool = True,
-        padding_mode: str = "zeros",
+        padding_mode: Literal["zeros", "reflect", "replicate", "circular"] = "zeros",
         device: torch.device | None = None,
         dtype: torch.dtype | None = None,
     ) -> None:
@@ -403,7 +405,7 @@ def __init__(
         groups: int = 1,
         bias: bool = True,
         dilation: _size_1_t = 1,
-        padding_mode: str = "zeros",
+        padding_mode: Literal["zeros", "reflect", "replicate", "circular"] = "zeros",
         device: torch.device | None = None,
         dtype: torch.dtype | None = None,
     ) -> None:
@@ -566,7 +568,7 @@ def __init__(
         groups: int = 1,
         bias: bool = True,
         dilation: _size_2_t = 1,
-        padding_mode: str = "zeros",
+        padding_mode: Literal["zeros", "reflect", "replicate", "circular"] = "zeros",
         device: torch.device | None = None,
         dtype: torch.dtype | None = None,
     ) -> None:
@@ -727,7 +729,7 @@ def __init__(
         groups: int = 1,
         bias: bool = True,
         dilation: _size_3_t = 1,
-        padding_mode: str = "zeros",
+        padding_mode: Literal["zeros", "reflect", "replicate", "circular"] = "zeros",
         device: torch.device | None = None,
         dtype: torch.dtype | None = None,
     ) -> None:
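One practical consequence of narrowing `padding_mode` from `str` to a `Literal` is that callers holding a plain string (for example, a value read from a config file) will presumably no longer type-check without narrowing it first. A minimal sketch of one way to do that follows; the `PaddingMode` alias and the `as_padding_mode` helper are hypothetical and not part of this PR.

```python
from typing import Literal, cast, get_args

# Hypothetical alias mirroring the annotation used in the diff above.
PaddingMode = Literal["zeros", "reflect", "replicate", "circular"]


def as_padding_mode(value: str) -> PaddingMode:
    """Validate an untyped string at runtime and narrow it to the Literal type."""
    if value not in get_args(PaddingMode):
        raise ValueError(f"Unsupported padding_mode: {value!r}")
    return cast(PaddingMode, value)


mode = as_padding_mode("reflect")
print(mode)  # reflect
```

`get_args` returns the allowed values from the alias itself, so the runtime check and the static annotation stay in sync if the set of modes ever changes.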