feat(aggregation): Add IMTL-L by ppraneth · Pull Request #725 · SimplexLab/TorchJD

ppraneth · 2026-06-09T03:01:28Z

Adds `IMTL`, the loss-balancing variant (IMTL-L) of Impartial Multi-Task Learning, from "Towards Impartial Multi-Task Learning" (ICLR 2021). It's a stateful, trainable `Scalarizer`.

`IMTL`

Each value $L_i$ (typically a per-task loss) is assigned a learnable scale $s_i$, and the values are combined as:

$$\sum_i \left( e^{s_i} L_i - s_i \right)$$

This is the loss-balance objective (eq. 6 in the paper, with the default $a=e, b=1$), and it matches the loss-balancing part of the LibMTL implementation (`loss_scale.exp() * losses - loss_scale`).

The factor $e^{s_i}$ rescales each loss so the scaled losses stay at a comparable magnitude across tasks, and the $-s_i$ term is a regularizer that prevents the trivial solution $s_i \to -\infty$. The $s_i$ are stored as an `nn.Parameter`, so the scalarizer's parameters must be passed to the optimizer to be learned jointly with the model.
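To make the formula concrete, here is a minimal framework-free sketch of the objective. The function name `imtl_l` and the sample numbers are illustrative assumptions, not part of the PR; the actual class works on tensors with an `nn.Parameter`. It also checks the "comparable magnitude" claim: for a fixed positive loss, the per-task term is minimized at $s_i = -\log L_i$, where the scaled loss equals 1.

```python
import math


def imtl_l(values, scales):
    """Loss-balancing objective: sum_i (exp(s_i) * L_i - s_i)."""
    if len(values) != len(scales):
        raise ValueError("values and scales must have the same length")
    return sum(math.exp(s) * v - s for v, s in zip(values, scales))


losses = [4.0, 0.25, 1.0]  # illustrative per-task losses (assumed values)

# With all scales at 0, exp(0) = 1, so the objective is the plain sum.
at_init = imtl_l(losses, [0.0, 0.0, 0.0])

# For a fixed positive L_i, d/ds_i (exp(s_i) * L_i - s_i) = exp(s_i) * L_i - 1,
# which vanishes at s_i = -log(L_i). There, every scaled loss equals 1.
best_scales = [-math.log(v) for v in losses]
scaled = [math.exp(s) * v for v, s in zip(losses, best_scales)]
```

In real training the losses change at every step, so the scales only track this balance point rather than sit on it.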

Design notes:

- `shape` is given at construction (`IMTL(3)` or `IMTL((2, 3))`), since the parameter has to exist before the optimizer is built. The shape is validated against the input at call time, like `Constant` and `UW`.
- Scales are initialized to 0, so at the start of training the scalarization reduces to the plain sum of the values ($e^0 = 1$).
- Implements `reset()` (from `Stateful`), which zeros the scales.
- No positivity precondition: IMTL-L is designed for positive losses but the forward is well-defined for any input, so it isn't enforced.
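The design notes above can be sketched without any framework. Everything in this block (`IMTLSketch`, `shape_of`, `flatten`, and the use of nested lists in place of tensors) is a hypothetical stand-in for illustration and is not the TorchJD class; in the PR the scales are an `nn.Parameter` updated by an optimizer, which the sketch only imitates with a manual assignment.

```python
import math


def shape_of(x):
    """Shape of a nested list of floats (a bare float has shape ())."""
    if not isinstance(x, list):
        return ()
    if not x:
        return (0,)
    return (len(x),) + shape_of(x[0])


def flatten(x):
    """All leaves of a nested list, in row-major order."""
    if not isinstance(x, list):
        return [x]
    return [leaf for item in x for leaf in flatten(item)]


class IMTLSketch:
    """Hypothetical, framework-free stand-in; not the TorchJD class."""

    def __init__(self, shape):
        # Accept IMTLSketch(3) and IMTLSketch((3,)) alike.
        self.shape = (shape,) if isinstance(shape, int) else tuple(shape)
        self.reset()

    def reset(self):
        # One scale per value, all zero, so exp(0) = 1 at the start.
        self.scales = [0.0] * math.prod(self.shape)

    def __call__(self, values):
        # Shape is checked at call time; the sign of the values is not.
        if shape_of(values) != self.shape:
            raise ValueError(f"expected shape {self.shape}, got {shape_of(values)}")
        return sum(math.exp(s) * v - s for s, v in zip(self.scales, flatten(values)))


scalarizer = IMTLSketch((2, 3))
total = scalarizer([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # plain sum at init
scalarizer.scales[0] = 0.5  # stands in for an optimizer update
scalarizer.reset()  # scales are back to zero
```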

Relationship to `UW` (almost equivalent)

IMTL-L is almost equivalent to UW: it equals UW up to a constant factor of two and the sign of the learned parameter, namely

$$\mathrm{IMTL}(s) = 2 \cdot \mathrm{UW}(-s)$$

(the paper notes this in Appendix C.4, where UW's regression form is written as $\tfrac{1}{2}(e^s L - s)$). They derive from different principles (UW from Gaussian/Laplace likelihoods, IMTL-L without any distribution assumption) but share the same per-task weighting and the same optima. `IMTL` is kept as its own discoverable class with its own direct formula; the docstring states the UW relationship, and a test locks it in. The complementary gradient-balancing variant (IMTL-G) is already available as the `IMTLG` aggregator.
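The identity can be checked numerically in a few lines. This sketch assumes that UW has the form $\sum_i \tfrac{1}{2}(e^{-s_i} L_i + s_i)$, which is the form implied by the stated relationship and by the $\tfrac{1}{2}(e^s L - s)$ expression quoted from the paper; the function names and sample values are illustrative and are not taken from the library.

```python
import math


def imtl_l(values, s):
    """IMTL-L: sum_i (exp(s_i) * L_i - s_i)."""
    return sum(math.exp(si) * v - si for v, si in zip(values, s))


def uw_assumed(values, s):
    # ASSUMPTION: UW(s) = sum_i 0.5 * (exp(-s_i) * L_i + s_i).
    # This is inferred from the stated identity, not read from the library.
    return sum(0.5 * (math.exp(-si) * v + si) for v, si in zip(values, s))


values = [0.7, 3.0, 12.5]  # arbitrary positive losses
s = [0.3, -1.2, 2.0]  # arbitrary scales

lhs = imtl_l(values, s)
rhs = 2.0 * uw_assumed(values, [-si for si in s])
```

Under that assumption, `lhs` and `rhs` agree up to floating-point rounding for any choice of `values` and `s`.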

Tests

`tests/unit/scalarization/test_imtl.py` covers the value at init (reduces to `sum(values)`), int-vs-tuple shape equivalence, scalar output and gradient flow over all input shapes (0-dim, vector, matrix, higher-dim), gradient flow to `log_scale`, shape validation, `reset()`, that negative inputs are allowed, trainability via an optimizer step, the representations, and that `IMTL(s) == 2 * UW(-s)`.

Signed-off-by: ppraneth <pranethparuchuri@gmail.com>

PierreQuinton · 2026-06-09T06:50:33Z

If they are the same, shouldn't we merge them and write a note? I'm pretty sure the factor two will just double the effective LR of s but nothing else.

I think we could in principle name it after the first of the two (does either cite the other?). I'm not so sure that adding duplicated methods is a good idea, as it adds noise to the library; it will also cost compute for people doing benchmarks on all methods.

Not sure what we should do.

ppraneth · 2026-06-09T06:53:42Z

@PierreQuinton I added docstrings saying that, but kept the code separate, since both are different methods, right? (I mean, different papers.)

ValerianRey · 2026-06-09T09:54:28Z

> If they are the same, shouldn't we merge them and write a note? I'm pretty sure the factor two will just double the effective LR of s but nothing else.
>
> I think we could in principle name it as the first implementation of the two (does any cite another?) I'm not so sure that adding duplicated methods is a good idea as it contributes noise to the library, it will also cost compute to people doing benchmarks on all methods.
>
> Not sure what we should do.

The reason why I wanted to have two separate classes is so that it's easy for people of the field to find the method they want to benchmark against. If they're implementing the IMTL paper, they know they need IMTL-G + IMTL-L. They will never know that they can replace IMTL-L by UW.

Also, these methods are not exactly the same, even if the difference is extremely minimal. So I guess it's ok to include this. It's not like this will happen very often I think. It's more of a reviewer's mistake to let them claim IMTL-L as novel.

ppraneth · 2026-06-09T13:36:24Z

@ValerianRey I have made the changes

ValerianRey

LGTM. @PierreQuinton are you ok with merging this? See my comment for an answer to your concern.

PierreQuinton · 2026-06-10T05:34:30Z

Yes, the solution to my concern is an improved onboarding, which is independent from this. Thanks a lot @ppraneth !

ppraneth · 2026-06-10T05:58:31Z

@PierreQuinton How about we work on docs once I am done with the whole scalarization package?
We can make more readable docs (ideally not too technical, but just enough to get a simple user onboard quickly).

ValerianRey · 2026-06-10T09:31:16Z

> @PierreQuinton How about we work on docs once I am done with the whole scalarization package? We can make more readable docs (ideally not too technical, but just enough to get a simple user onboard quickly).

I agree with that. I think our README is outdated and we're missing a simple getting-started tutorial. Also, we need to put much more emphasis on scalarization when the package becomes more complete.

Instead of spending a lot of time explaining what Jacobian descent is, I would rather say that we can either combine the losses into a scalar loss and do gradient descent, or compute every gradient and combine them into a single gradient, which is Jacobian descent. Then explain a bit about the pros and cons.
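The two routes described in this comment can be illustrated on a toy problem. This is a sketch under stated assumptions: the two losses, the point `w`, and the choice of a plain sum as the combination rule are all made up for illustration, and no TorchJD API is used.

```python
# Two toy losses of a parameter vector w = (x, y) (assumed for illustration):
#   L1(w) = (x - 1)^2 + y^2
#   L2(w) = (x + 2)^2 + 3y


def jacobian(w):
    """One row per loss: the gradient of that loss with respect to (x, y)."""
    x, y = w
    return [
        [2.0 * (x - 1.0), 2.0 * y],  # gradient of L1
        [2.0 * (x + 2.0), 3.0],  # gradient of L2
    ]


w = [0.5, -1.0]
x, y = w

# Route 1 (scalarization): form the scalar loss L1 + L2, then take its gradient.
grad_of_sum = [2.0 * (x - 1.0) + 2.0 * (x + 2.0), 2.0 * y + 3.0]

# Route 2 (combining gradients): compute every gradient, then combine the rows.
# The combination rule here is a plain column-wise sum.
rows = jacobian(w)
combined = [sum(column) for column in zip(*rows)]
```

With a plain sum the two routes give the same update direction, `[4.0, 1.0]` at this point. The routes can differ once the combination rule depends on the gradients themselves rather than on fixed weights.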

add IMTL-L

f8bbb4d

Signed-off-by: ppraneth <pranethparuchuri@gmail.com>

ppraneth requested a review from a team as a code owner June 9, 2026 03:01

ppraneth added 2 commits June 9, 2026 08:31

add test cases

43f4794

Signed-off-by: ppraneth <pranethparuchuri@gmail.com>

Merge branch 'main' into scalarization-5

2dacfc9

ValerianRey added the `cc: feat` and `package: aggregation` labels Jun 9, 2026

github-actions Bot changed the title ~~feat(scalarization)!:add IMTL-L~~ feat(aggregation): Feat(scalarization)!:add IMTL-L Jun 9, 2026

ppraneth changed the title ~~feat(aggregation): Feat(scalarization)!:add IMTL-L~~ feat(aggregation): add IMTL-L Jun 9, 2026

github-actions Bot changed the title ~~feat(aggregation): add IMTL-L~~ feat(aggregation): Add IMTL-L Jun 9, 2026

ValerianRey requested changes Jun 9, 2026


Comment thread docs/source/docs/scalarization/imtl.rst Outdated

Comment thread src/torchjd/scalarization/_imtl.py Outdated

Comment thread CHANGELOG.md Outdated

ValerianRey mentioned this pull request Jun 9, 2026

Scalarizer Tracker #667

Open

ValerianRey and others added 4 commits June 9, 2026 12:12

Merge branch 'main' into scalarization-5

24beb59

Update CHANGELOG.md

5086575

Co-authored-by: Valérian Rey <31951177+ValerianRey@users.noreply.github.com>

Update src/torchjd/scalarization/_imtl.py

c38bf4e

Co-authored-by: Valérian Rey <31951177+ValerianRey@users.noreply.github.com>

minor changes

9ce4456

Signed-off-by: ppraneth <pranethparuchuri@gmail.com>

ppraneth requested a review from ValerianRey June 9, 2026 13:36

ValerianRey approved these changes Jun 9, 2026


ValerianRey merged commit d759aed into SimplexLab:main Jun 10, 2026
17 checks passed

ValerianRey mentioned this pull request Jun 10, 2026

chore: Prepare v0.14.0 release #729

Merged

ValerianRey merged 7 commits into SimplexLab:main from ppraneth:scalarization-5

3 participants
