Verify HF export at startup; rewrite save_state_dict_as_hf#1671
Conversation
There was a problem hiding this comment.
Code Review
This pull request replaces the olmo-core HF saving path with a direct conversion flow, adds startup verification for HF exports, and introduces a pruning mechanism for permanent checkpoints. Feedback identifies a critical AttributeError due to incorrect StateType usage, notes that the checkpoint pruning parameter is not yet propagated through the call stack, and highlights redundant configuration saving logic and an unused parameter.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 905a12faf8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…t_state_to_hf; prune permanent checkpoints Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…noreply@anthropic.com>
6bfecf1 to
370c2f1
Compare
…verrides shim now that upstream handles llama/qwen3/gemma3 norm mappings Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… config save (PR #1671 review) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… Opus 4.7 <noreply@anthropic.com>
…ude Opus 4.7 <noreply@anthropic.com>
…o f-string Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
farhatkevin
left a comment
There was a problem hiding this comment.
This looks good to me once the merge conflicts are resolved and the stale PruningCheckpointerCallback mention is removed from the changelog.
…ne guards Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…thored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ield ordering Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…i#1672) * Move maybe_evaluate to grpo_utils; dedupe calculate_token_counts Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Verify HF export at startup; rewrite save_state_dict_as_hf via convert_state_to_hf; prune permanent checkpoints Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Update CHANGELOG with PR allenai#1671 link Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * GRPO OLMo-core feature parity: EvalCallback, setup_eval, checkpointer, scheduler types Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Update CHANGELOG with PR allenai#1672 link Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Drop pre-norm Qwen3/Llama OLMo-core->HF override shim; upstream olmo-core now provides these mappings Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Minimize diff: drop PruningCheckpointerCallback, ty:ignore, and unrelated script tweaks Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Unify checkpoint_state_freq disable sentinel: <=0 disables on both grpo_fast and olmo_core paths Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update grpo_callbacks.py * Use keyword args in EvalCallback.post_step's maybe_evaluate call Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Move save_freq/checkpoint_state_freq divergence warning into GRPOExperimentConfig.__post_init__ Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Replace olmo-core's
save_hf_modelpath with a directconvert_state_to_hf→AutoModelForCausalLM.save_pretrainedflow, so HF export reuses the same converter the verifier checks against.Add
verify_can_save_as_hfwhich builds the olmo-core model and the target HF model on meta, runs the converter, and asserts the produced state-dict keys exactly match the HF model.