Search, understand, reproduce, and improve an idea with ease
Updated Apr 18, 2026 - Python
Build RL environments for LLM training
Build, Evaluate, and Deploy GUI Agents — online RL training, standardized benchmarks, and real-device deployment in one framework.
PyTorch implementation of the paper "A Deep Reinforced Model for Abstractive Summarization" and of a pointer-generator network
An easy-to-use Python package for quick, basic QA evaluation. It includes standardized and semantic QA evaluation metrics: black-box and open-source large language model prompting and evaluation, exact match, F1 score, PEDANT semantic match, and transformer match. The package also supports prompting through the OpenAI and Anthropic APIs.
Simulator for training and evaluation of Recommender Systems
Uses GRPO to train on long-form QA and instructions with a long-form reward model
MAS-Orchestra: train, inspect, and vibe-code multi-agent systems with RL-learned orchestration and MASBench
The official PyTorch implementation of "Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence"
Deep Reinforced Model for Abstractive Summarization
A reinforcement learning environment for AI coding agents built from real Git history. RepoGym automatically extracts bug-fix tasks from open-source repositories, runs agent patches in Docker sandboxes, and returns reward signals based on test suite delta.