Skip to content

Move maybe_evaluate to grpo_utils; dedupe calculate_token_counts#1669

Merged
finbarrtimbers merged 6 commits into
mainfrom
finbarr/extract-maybe-evaluate
May 13, 2026
Merged

Move maybe_evaluate to grpo_utils; dedupe calculate_token_counts#1669
finbarrtimbers merged 6 commits into
mainfrom
finbarr/extract-maybe-evaluate

Conversation

@finbarrtimbers
Copy link
Copy Markdown
Collaborator

@finbarrtimbers finbarrtimbers commented May 8, 2026

Pure extraction, no behavior change. Sets up OLMo-core GRPO (grpo.py) to share the same eval flow as grpo_fast.py.

@finbarrtimbers finbarrtimbers force-pushed the finbarr/extract-maybe-evaluate branch from 452cc4a to b6a2af8 Compare May 8, 2026 16:51
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the GRPO implementation by moving the maybe_evaluate function and the calculate_token_counts logic from grpo_fast.py to grpo_utils.py to eliminate code duplication. It also updates the relevant imports and call sites across the codebase. The review feedback identifies typos in type-ignore comments within the moved code that should be corrected to ensure compatibility with static analysis tools.

Comment thread open_instruct/grpo_utils.py
Comment thread open_instruct/grpo_utils.py
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 452cc4a4ed

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread open_instruct/test_grpo_fast_eval.py
…uthored-By: Claude Opus 4.7 <noreply@anthropic.com>
@finbarrtimbers finbarrtimbers force-pushed the finbarr/extract-maybe-evaluate branch from b6a2af8 to 46576fd Compare May 11, 2026 18:31
import wandb
from datasets import Dataset

from open_instruct import data_loader as data_loader_lib
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect grpo_utils importers to always have vllm installed? This PR makes grpo_utils import data_loader at module load time, which pulls in vllm; I think that changes behavior for non-vllm paths that only use shared helpers like some of the tests. Could a lazy import inside maybe_evaluate avoid that?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to do a lazy import as they can make it hard to reason about the code and I think they're a bit messy; my goal is to always have all imports at the top of the file when possible.

I think that this is fine as it doesn't cause any of the CPU tests to fail, which they would as they don't have vllm installed. You're right to flag this, though! We should be careful about the imports.

@finbarrtimbers finbarrtimbers added this pull request to the merge queue May 13, 2026
Merged via the queue into main with commit b774f1a May 13, 2026
7 checks passed
@finbarrtimbers finbarrtimbers deleted the finbarr/extract-maybe-evaluate branch May 13, 2026 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants