Update RL training skill #633

Merged
Kovbo merged 2 commits into main from fix/update-train-skills
Mar 27, 2026

Kovbo (Collaborator) commented Mar 27, 2026

Summary

  • rewrite the RL training skill into a shorter interactive wizard that still collects required choices one question at a time
  • update RL guidance to use batch-scaled max_exceptions, explicit validation/checkpoint guidance, openai/gpt-5.4 as the default RULER judge, and neutral base-model prompting
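For readers unfamiliar with the idea, "batch-scaled max_exceptions" can be read as an exception budget that grows with the batch size rather than a fixed constant. The sketch below is only an illustration of that reading: the helper name `scaled_max_exceptions` and the `tolerated_fraction` parameter are hypothetical and are not taken from the skill or from any library.

```python
import math


def scaled_max_exceptions(batch_size: int, tolerated_fraction: float = 0.1) -> int:
    """Hypothetical helper: budget of failed rollouts tolerated per batch.

    Assumption (not from the PR): the budget is a fixed fraction of the
    batch size, rounded up, with a floor of one exception.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not 0.0 <= tolerated_fraction <= 1.0:
        raise ValueError("tolerated_fraction must be between 0 and 1")
    return max(1, math.ceil(batch_size * tolerated_fraction))


if __name__ == "__main__":
    # Larger batches get a proportionally larger exception budget.
    for size in (8, 20, 64):
        print(size, scaled_max_exceptions(size))
```

The floor of one keeps very small batches from aborting on a single transient failure; whether the actual skill uses a fraction, a floor, or some other rule is not stated in this PR.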

Testing

  • not run (skill/documentation changes only)

Kovbo changed the title from "Update RL and SFT training skills" to "Update RL training skill" on Mar 27, 2026
Kovbo merged commit 1905677 into main on Mar 27, 2026
5 checks passed