-
Notifications
You must be signed in to change notification settings - Fork 20
Upgrading FL4Health Libraries #558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
cfd83dd
916a0e3
85afeb7
b205380
f692a02
54fe843
822a7c3
00f85dc
8ee8fc1
2f70a52
a91a7ac
1908cf6
2bf7b60
be8686b
2ab191a
6cfcb41
e72a0e6
984b08a
280e246
bc06c19
8bc82ee
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -124,6 +124,13 @@ def _copy_optimizer_with_new_params(self: DittoPersonalizedProtocol, original_op | |
|
|
||
| optimizer_kwargs = {k: v for k, v in param_group.items() if k not in ("params", "initial_lr")} | ||
| assert self.global_model is not None | ||
|
|
||
| # NOTE: This is a small workaround for torch back-compatibility in AdamW. Torch injects a key (that isn't part | ||
| # of the class signature) into the param groups called "decoupled_weight_decay" which causes an error in the | ||
| # kwargs below. See: https://github.com/pytorch/pytorch/blob/v2.11.0/torch/optim/adamw.py#L57 | ||
| if optim_class == torch.optim.AdamW: | ||
| optimizer_kwargs.pop("decoupled_weight_decay") | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Weird that we need a workaround for their workaround. What is the case where this happens?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Basically, they got rid of this in the constructor, but if you follow the link, you'll see they "inject" it into the dictionary. To make this mixin work, Andrei was reconstructing the optimizer arguments to create a new one from the old one. The injection means that there is a key corresponding to the argument that used to be called The way Andrei was doing this is fairly brittle, but I don't want to mess with it much in this PR.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, sounds a little hacky, but I agree with you about not messing with it here. |
||
|
|
||
| global_optimizer = optim_class(self.global_model.parameters(), **optimizer_kwargs) | ||
|
|
||
| # maintain initial_lr for schedulers | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good to see! 👏