From dfe5420a174749af9a8e5a8146009ac79ca357a1 Mon Sep 17 00:00:00 2001
From: Kartavya Dikshit
Date: Sun, 21 Jun 2026 03:10:26 +0200
Subject: [PATCH 1/2] Add candidate aggregators summary

---
 docs/candidate_aggregators.md | 7 +++++++
 1 file changed, 7 insertions(+)
 create mode 100644 docs/candidate_aggregators.md

diff --git a/docs/candidate_aggregators.md b/docs/candidate_aggregators.md
new file mode 100644
index 00000000..9fc66a47
--- /dev/null
+++ b/docs/candidate_aggregators.md
@@ -0,0 +1,7 @@
+# Candidate Aggregators for Multi-Task Learning
+
+## GradNorm
+GradNorm (Gradient Normalization) is a dynamic gradient-based approach that automatically balances task training by adjusting the weight coefficients of each task's loss. It aims to equalize the gradient norms of different tasks by minimizing an auxiliary loss. This auxiliary loss penalizes the difference between the actual task gradient norm and a target norm derived from the task's relative training rate. The main goal is to ensure that no single task dominates the model updates, thus preventing overfitting and enabling more effective learning across multiple tasks. It focuses primarily on dynamic gradient magnitude tuning.
+
+## DB-MTL (Dual-Balancing Multi-Task Learning)
+DB-MTL is a method that handles imbalances at both the loss level and the gradient level simultaneously through a "dual-balancing" strategy. For loss-scale balancing, it applies a parameter-free logarithm transformation on each task's loss to bring them to a similar scale. For gradient-magnitude balancing, it employs a training-free maximum-norm normalization strategy, which rescales all task gradients to have the same magnitude as the maximum gradient norm among the tasks. Unlike GradNorm, which uses an auxiliary loss and dynamic tuning, DB-MTL is computationally efficient (training-free) and effectively equalizes both loss and gradient scales.
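
Review note on the DB-MTL paragraph added by this patch: the two balancing steps it describes can be sketched roughly as below. This is a minimal, dependency-free illustration and not TorchJD code. The function names `log_transform`, `max_norm_rescale`, and `db_mtl_direction`, the `eps` guard, and the final summation of the rescaled gradients are assumptions made for the example. Losses are assumed to be strictly positive, and any smoothing of gradients over training steps is left out.

```python
import math


def log_transform(losses, eps=1e-8):
    """Loss-scale balancing: map each positive loss l to log(l + eps)."""
    return [math.log(l + eps) for l in losses]


def max_norm_rescale(grads, eps=1e-8):
    """Gradient-magnitude balancing: give every task gradient the largest norm."""
    norms = [math.sqrt(sum(x * x for x in g)) for g in grads]
    target = max(norms)
    # Each gradient keeps its direction; only its length changes.
    return [[x * target / (n + eps) for x in g] for g, n in zip(grads, norms)]


def db_mtl_direction(losses, grads, eps=1e-8):
    """Combine both steps for raw task losses and their raw gradients.

    By the chain rule, the gradient of log(l) is (gradient of l) / l,
    so the log transformation shows up as a division by the loss value.
    """
    log_grads = [[x / (l + eps) for x in g] for l, g in zip(losses, grads)]
    balanced = max_norm_rescale(log_grads, eps)
    # Assumption: the balanced gradients are simply summed into one update.
    return [sum(column) for column in zip(*balanced)]


if __name__ == "__main__":
    # Two hypothetical tasks whose losses differ by four orders of magnitude.
    losses = [100.0, 0.01]
    grads = [[30.0, 40.0], [0.0, 0.005]]
    print(db_mtl_direction(losses, grads))
```

In the example at the bottom, the raw gradient norms differ by a factor of 10,000, yet after both steps the two tasks contribute gradients of (almost exactly) equal length, which is the behaviour the paragraph attributes to dual balancing.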
From f0a0bebe7260c741409804cbbaa290e0dce3c1e7 Mon Sep 17 00:00:00 2001
From: Kartavya Dikshit
Date: Sun, 21 Jun 2026 07:26:52 +0200
Subject: [PATCH 2/2] docs: move candidate_aggregators.md to source and add to toctree

---
 docs/{ => source}/candidate_aggregators.md | 0
 docs/source/index.rst                      | 1 +
 2 files changed, 1 insertion(+)
 rename docs/{ => source}/candidate_aggregators.md (100%)

diff --git a/docs/candidate_aggregators.md b/docs/source/candidate_aggregators.md
similarity index 100%
rename from docs/candidate_aggregators.md
rename to docs/source/candidate_aggregators.md
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 20d0b6db..ae75d440 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -66,6 +66,7 @@ TorchJD is open-source, under MIT License. The source code is available on
 
    installation.md
    examples/index.rst
+   candidate_aggregators.md
 
 .. toctree::
    :caption: API Reference