# ma-rlhf

Here is 1 public repository matching this topic...

ernie-research / MA-RLHF

[ICLR'25] MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

Topics: ppo, rlhf, llm-training, ma-rlhf, posttrain

Language: Python. Updated Jun 6, 2025.
