
Makes grpo.py support checkpointing, and adds a fix for mason.py #1666

Merged
finbarrtimbers merged 10 commits into main from finbarr/mason-replace-or-append-flag on May 11, 2026

finbarrtimbers (Collaborator) commented May 8, 2026

Summary

  • Add a replace_or_append_flag(command, flag, value) helper in mason.py so the auto-overrides for --output_dir and --checkpoint_state_dir replace any existing occurrence of the flag in the user's command instead of blindly appending. Previously, passing the flag explicitly produced a duplicated entry, and argparse picked one of the two values inconsistently.
  • Collapse repeated occurrences of the same flag down to a single one with the new value, including the edge case of a trailing flag with no value.
  • Make build_command_without_args flag-aware: a value-bearing flag now consumes the following token only if that token doesn't itself start with --. replace_or_append_flag is a one-line delegation on top of it.
  • Add open_instruct/grpo.py to OPEN_INSTRUCT_COMMANDS and OPEN_INSTRUCT_RESUMABLES so mason recognizes it the same way it does grpo_fast.py, and wire OLMo-core checkpoint save/resume into grpo.py (CheckpointerCallback + DataPreparationActorCheckpointCallback + LoadStrategy.if_available) so resumable Beaker jobs actually resume instead of silently restarting from step 0.
  • Add parameterized unit tests for the helper covering the empty, absent, present-once, present-twice, trailing-flag, adjacent-flags, and flag-followed-by-flag cases, plus two new cases for build_command_without_args covering the same edge cases.
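To make the flag-aware rule concrete, here is a minimal sketch of how the two helpers could fit together. It is not the code from mason.py: the function names and the `{flag: True}` argument shape come from this description, while the reading of `True` as "this flag takes a value", the choice to append the new flag at the end of the command, and the lack of `--flag=value` handling are all assumptions.

```python
def build_command_without_args(command: list[str], args: dict[str, bool]) -> list[str]:
    """Return a copy of `command` with every flag listed in `args` removed.

    Assumption: `args` maps a flag name to True when the flag takes a value.
    """
    result = []
    i = 0
    while i < len(command):
        token = command[i]
        if token in args:
            # Consume the next token as the value only if it exists and
            # does not itself look like another flag.
            has_value = (
                args[token]
                and i + 1 < len(command)
                and not command[i + 1].startswith("--")
            )
            i += 2 if has_value else 1
            continue
        result.append(token)
        i += 1
    return result


def replace_or_append_flag(command: list[str], flag: str, value: str) -> list[str]:
    """Drop every occurrence of `flag`, then add it back once with `value`.

    Assumption: the replacement is appended at the end of the command.
    """
    return build_command_without_args(command, {flag: True}) + [flag, value]
```

Because every existing occurrence is stripped before the flag is re-added, calling the helper twice with the same arguments yields the same command, which is what makes the override idempotent.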

Carved out of #1642 to keep the change small and reviewable.

Example fixed by the build_command_without_args change

Given a command where --checkpoint_state_dir is immediately followed by another flag (a flag with no value, e.g. an interrupted/edited launch line):

build_command_without_args(
    ["python", "grpo.py", "--checkpoint_state_dir", "--with_tracking", "--output", "out"],
    {"--checkpoint_state_dir": True},
)
  • Before: ["python", "grpo.py", "--output", "out"]. --with_tracking was silently consumed as if it were the checkpoint dir's value.
  • After: ["python", "grpo.py", "--with_tracking", "--output", "out"]. --with_tracking is preserved.

The same fix means replace_or_append_flag(["--output_dir", "--output_dir", "/tmp/z"], "--output_dir", "/weka/x") now returns ["--output_dir", "/weka/x"] instead of leaking the orphaned /tmp/z token.
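For contrast, the failure mode can be reproduced with a naive stripper that always treats the token after a value-bearing flag as its value. This is a guess at the pre-fix logic, reconstructed only from the "Before" output above; `strip_flags_naive` is a hypothetical name and does not appear in mason.py.

```python
def strip_flags_naive(command: list[str], args: dict[str, bool]) -> list[str]:
    """Remove flags in `args`, unconditionally consuming the next token
    for any flag marked as value-bearing (hypothetical pre-fix behavior)."""
    result = []
    i = 0
    while i < len(command):
        token = command[i]
        if token in args:
            # No check on what the next token is: a following flag gets eaten.
            i += 2 if args[token] else 1
            continue
        result.append(token)
        i += 1
    return result


# The adjacent-flag case: --with_tracking is swallowed as a "value".
lost_flag = strip_flags_naive(
    ["python", "grpo.py", "--checkpoint_state_dir", "--with_tracking", "--output", "out"],
    {"--checkpoint_state_dir": True},
)

# The doubled-flag case: the second --output_dir is eaten as the first one's
# value, so /tmp/z is left behind as an orphaned positional token.
orphaned = strip_flags_naive(
    ["--output_dir", "--output_dir", "/tmp/z"], {"--output_dir": True}
) + ["--output_dir", "/weka/x"]
```

Under this guessed logic, `lost_flag` no longer contains `--with_tracking`, and `orphaned` still carries `/tmp/z` ahead of the new `--output_dir /weka/x` pair.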

Test plan

  • uv run pytest test_mason.py
  • make style && make quality
  • End-to-end checkpoint-resume integration test on Beaker (scripts/train/debug/grpo_checkpoint_integration_test.sh): run 1 trains 6 steps and writes checkpoints; run 2 resumes from step 6 and trains to step 12. The run 2 logs confirm Loading checkpoint from '.../step6' and Will resume training from step 6, epoch 1.

Runs:

  1. Run 1 (train 6 steps + checkpoint): Beaker
  2. Run 2 (resume from step 6 → step 12): Beaker
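The two-run test exercises a "load if available" pattern: start from the newest checkpoint when one exists, otherwise start from step 0. The sketch below illustrates that pattern with the standard library only. It does not use OLMo-core, and `latest_checkpoint`, `train`, and the `save_every` parameter are hypothetical names; only the `stepN` directory naming is taken from the log line quoted above.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional


def latest_checkpoint(checkpoint_dir: Path) -> Optional[tuple[int, Path]]:
    """Return (step, path) for the highest-numbered stepN subdirectory, or None."""
    if not checkpoint_dir.is_dir():
        return None
    best = None
    for child in checkpoint_dir.iterdir():
        match = re.fullmatch(r"step(\d+)", child.name)
        if match and child.is_dir():
            step = int(match.group(1))
            if best is None or step > best[0]:
                best = (step, child)
    return best


def train(checkpoint_dir: Path, max_steps: int, save_every: int) -> list[int]:
    """Run up to `max_steps`, resuming after the latest checkpoint if present.

    Returns the list of step numbers actually executed in this run.
    """
    found = latest_checkpoint(checkpoint_dir)
    start = found[0] if found else 0
    executed = []
    for step in range(start + 1, max_steps + 1):
        executed.append(step)  # stand-in for one real training step
        if step % save_every == 0:
            # Stand-in for writing model and optimizer state.
            (checkpoint_dir / f"step{step}").mkdir(parents=True, exist_ok=True)
    return executed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = Path(tmp) / "checkpoints"
        print(train(ckpt, max_steps=6, save_every=3))   # first run starts at step 1
        print(train(ckpt, max_steps=12, save_every=3))  # second run resumes after step 6
```

Without the lookup in `train`, the second run would begin again at step 1, which is the silent restart this PR is meant to prevent.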

GPU_TESTS=bypass

…t; add grpo.py to OPEN_INSTRUCT_COMMANDS Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
finbarrtimbers force-pushed the finbarr/mason-replace-or-append-flag branch from 758e360 to e78d601 May 8, 2026 16:43
chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 758e36034f


Comment thread mason.py
gemini-code-assist (Bot, Contributor) left a comment


Code Review

This pull request adds open_instruct/grpo.py to the supported commands and resumables and introduces a replace_or_append_flag utility to handle idempotent flag overrides for --output_dir and --checkpoint_state_dir. Review feedback points out a logic bug in the new utility when dealing with adjacent flags and suggests a more robust implementation by reusing existing command-building logic. It also recommends expanding the test suite to cover these edge cases.

Comment thread mason.py Outdated
Comment thread test_mason.py
…; wire OLMo-core checkpointing into grpo.py Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
finbarrtimbers changed the title from "Make mason.py output_dir/checkpoint_state_dir overrides idempotent" to "Makes grpo.py support checkpointing, and adds a fix for mason.py" May 8, 2026
…d_flag to delegate Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nd at call sites Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…and_without_args Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… for grpo.py Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…of nonexistent restore_state Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
finbarrtimbers enabled auto-merge May 8, 2026 19:29
finbarrtimbers requested a review from hamishivi May 8, 2026 19:29
Comment thread open_instruct/grpo_callbacks.py
finbarrtimbers added this pull request to the merge queue May 8, 2026
hamishivi removed this pull request from the merge queue due to a manual request May 8, 2026
finbarrtimbers enabled auto-merge May 11, 2026 15:29
finbarrtimbers added this pull request to the merge queue May 11, 2026
Merged via the queue into main with commit 308c411 May 11, 2026
7 checks passed
finbarrtimbers deleted the finbarr/mason-replace-or-append-flag branch May 11, 2026 15:41

2 participants