Integrate eval protocol to rllm trainer #292

Merged
1stprinciple merged 8 commits into main from 2048
Oct 27, 2025
Collaborator

@1stprinciple 1stprinciple commented Oct 25, 2025

Note

Guard signal handler registration to main thread and move LiteLLM policy initialization outside server start for consistent retries and setup.

  • Pytest MCP server manager (default_mcp_gym_rollout_processor.py):
    • Guard SIGINT/SIGTERM registration to the main thread using threading.current_thread() checks.
    • Add threading import.
  • MCP gym rollout processor:
    • Refactor LiteLLMPolicy creation to occur after server start (and on retries), rather than inside the server start block.
    • Relax retry precondition to require existing server only (no policy check).
    • Keep environment creation and rollout execution unchanged aside from using the newly created policy.
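
The main-thread guard described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `register_cleanup_handlers` and its return value are hypothetical names, and only the standard-library `signal` and `threading` calls are real. Python raises `ValueError` if `signal.signal()` is called from a non-main thread, which is why the check matters when the server manager is constructed inside a worker thread.

```python
import signal
import threading


def register_cleanup_handlers(handler):
    """Register SIGINT/SIGTERM handlers only when running on the main thread.

    Hypothetical helper: returns True if handlers were installed, False if
    registration was skipped because the caller is a worker thread.
    """
    if threading.current_thread() is not threading.main_thread():
        # signal.signal() would raise ValueError here, so skip registration.
        return False
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)
    return True
```

When registration is skipped, cleanup has to be triggered some other way (for example, by the caller that owns the worker thread), since the process-level signal handlers are not installed.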

Written by Cursor Bugbot for commit 339e8ce.

@1stprinciple 1stprinciple requested a review from xzrderek October 27, 2025 18:09
@1stprinciple 1stprinciple marked this pull request as ready for review October 27, 2025 18:09
cursor[bot]

This comment was marked as outdated.

Contributor

@xzrderek xzrderek left a comment

thanks!

max_tokens=max_tokens,
**extra_body,
**other_params,
)

Bug: Server Cleanup Fails on Policy Creation Error

Policy initialization code was moved outside the exception handling block. When start_server=False (retry scenario), if policy creation fails (lines 242-260), there is no exception handler to clean up the server that was started in a previous call. This can leave the MCP server running indefinitely without cleanup, causing resource leaks. The exception handler at lines 228-233 only catches exceptions during server.start() when start_server=True, not during policy creation when start_server=False.
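
One way to address the leak described in this comment is to keep policy creation inside the same guarded region as the server start, so that a failure in either step stops the server. The sketch below only illustrates the pattern; `FakeServer`, `create_policy`, and `prepare_rollout` are hypothetical stand-ins and do not reflect the actual names or signatures in the rollout processor.

```python
class FakeServer:
    """Hypothetical stand-in for the MCP server manager."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


def create_policy(model_id):
    """Hypothetical stand-in for policy construction; fails on empty input."""
    if not model_id:
        raise ValueError("model_id is required")
    return {"model_id": model_id}


def prepare_rollout(server, model_id, start_server=True):
    """Start the server if requested, then build the policy.

    Any failure, including one during policy creation on a retry
    (start_server=False), stops the server before re-raising.
    """
    try:
        if start_server:
            server.start()
        policy = create_policy(model_id)
    except Exception:
        server.stop()
        raise
    return policy
```

With this shape, a retry that reuses an already running server still cleans it up if policy creation raises, which is the case the comment flags as unhandled.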

@1stprinciple 1stprinciple merged commit 680e719 into main Oct 27, 2025
8 of 9 checks passed
@1stprinciple 1stprinciple deleted the 2048 branch October 27, 2025 18:41

2 participants