the-laughing-monkey/agent-rl

agent-rl Training Scripts

Welcome to the agent-rl repository! This repo contains training scripts for reinforcement learning with GRPO (Group Relative Policy Optimization), built on the ms-swift framework (v3) together with bleeding-edge versions of Hugging Face Transformers and vLLM.

Requirements

See the guides below for the supported RL frameworks (ms-swift or EasyR1) and model families (Qwen 2.5 VL or Qwen 2.5).

Setup and Installation

For detailed instructions on setting up a RunPod instance with 1000 GB of storage, the SWIFT framework, and support for Qwen 2.5 VL models, please refer to: How to RunPod with Qwen 2.5 VL Models

For instructions on setting up and running EasyR1 (a reinforcement learning framework for LLMs) on a RunPod instance with Qwen 2.5 models, please refer to: How to Run EasyR1 with Qwen 2.5 Models on RunPod

Usage

Refer to the training scripts in the scripts_train/ directory for various configurations:

  • Full Training with vLLM
  • LoRA Training with/without vLLM

Each script sets critical training parameters such as batch sizes, number of generations, and reward functions. Check the comments within each script for further details.
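To make the role of "number of generations" and "reward functions" concrete, here is a minimal sketch of the group-relative scoring idea behind GRPO. It is not taken from the scripts in this repo or from ms-swift: `format_reward` and `group_relative_advantages` are hypothetical names, the `<answer>` tag convention is an assumption, and the use of the population standard deviation is one possible choice among several.

```python
import re
from statistics import mean, pstdev


def format_reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if the answer is wrapped in <answer> tags, else 0.0."""
    return 1.0 if re.search(r"<answer>.*?</answer>", completion, re.DOTALL) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and spread of its own group.

    Assumption: population standard deviation; real implementations may differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, four sampled completions (i.e. number of generations = 4).
completions = [
    "<answer>4</answer>",
    "The answer is 4",
    "<answer>5</answer>",
    "no tags here",
]
rewards = [format_reward(c) for c in completions]
advantages = group_relative_advantages(rewards)

for completion, reward, advantage in zip(completions, rewards, advantages):
    print(f"{reward:.1f}  {advantage:+.3f}  {completion}")
```

Completions that score above their group's average get a positive advantage and the rest get a negative one, which is why the number of generations per prompt matters: with too few samples, or with rewards that are identical across the group, the advantages carry little or no signal.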

About

Scripts for training Qwen 2.5 VL with ms-swift and GRPO
