[wip] Fix loss reduction #909

justinvyu · 2026-01-21T01:31:54Z

No description provided.

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

justinvyu · 2026-01-23T02:30:10Z

skyrl-train/skyrl_train/workers/worker.py

            self._micro_batches_accumulated += 1
+
+            # Track total tokens accumulated to compute average loss per token during optim_step
+            self._total_tokens_accumulated += micro_batch.attention_mask.sum().item()


This should actually use `loss_mask` instead of `attention_mask`.
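
To illustrate why the choice of mask matters for the per-token average, here is a minimal pure-Python sketch. It is not the code in `worker.py`: the mask values, the per-token losses, and the helper names `count_tokens` and `masked_loss_sum` are all made up for illustration. It also assumes that `loss_mask` marks only the tokens that contribute to the loss, while `attention_mask` marks every non-padding token.

```python
# Hypothetical micro-batch of two sequences (all values are illustrative).
# attention_mask: 1 = real token, 0 = padding.
attention_mask = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
]
# loss_mask (assumed semantics): 1 = token contributes to the loss, 0 = ignored.
loss_mask = [
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
]
per_token_loss = [
    [0.9, 0.8, 0.5, 0.4, 0.3, 0.0],
    [0.7, 0.6, 0.5, 0.2, 0.0, 0.0],
]


def count_tokens(mask):
    """Count the positions selected by a 0/1 mask."""
    return sum(sum(row) for row in mask)


def masked_loss_sum(loss, mask):
    """Sum the per-token loss over the positions selected by the mask."""
    return sum(
        value * keep
        for loss_row, mask_row in zip(loss, mask)
        for value, keep in zip(loss_row, mask_row)
    )


attn_tokens = count_tokens(attention_mask)
loss_tokens = count_tokens(loss_mask)
loss_sum = masked_loss_sum(per_token_loss, loss_mask)

# Dividing by the attention-mask count also counts tokens that never
# contributed to the loss, so the average comes out too small.
mean_over_loss_tokens = loss_sum / loss_tokens
mean_over_attn_tokens = loss_sum / attn_tokens

print(f"tokens in attention_mask: {attn_tokens}, tokens in loss_mask: {loss_tokens}")
print(f"mean over loss tokens: {mean_over_loss_tokens:.4f}")
print(f"mean over attended tokens: {mean_over_attn_tokens:.4f}")
```

In this toy case the denominator drops from 9 to 4 tokens when `loss_mask` is used, so the accumulated token count would differ by more than a factor of two.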

justinvyu · 2026-01-23T02:34:34Z

Closing in favor of #925.

justinvyu added 3 commits January 20, 2026 16:20

add token_sum reduction

a55616e

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

add hacky token_mean impl

02c4125

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

add token_sum vs token_mean_v2

edc91f3

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

justinvyu mentioned this pull request Jan 23, 2026

[skyrl-train] Fix loss reduction by moving normalization to the advantage computation #925

Draft

justinvyu closed this Jan 23, 2026
