Support multiple dataloader for grpo #1603
Closed
Labels: data module, enhancement (New feature or request), research (Tag for research team's issues)
Description
yuki-97
opened on Dec 5, 2025
- Support multiple dataloaders for multiple datasets so that we can control how much to load from each dataset.
- Provide an interface and a simplified example of how to control the ratio of each dataset
- E.g. at one training step, we can load 2 sub-batches from dataloader1 and 3 sub-batches from dataloader2. With equally sized sub-batches, the task ratio in the final training batch will then be 2:3.
- The implementation will be similar to custom-parallel-plan: write the custom logic in a file, then point to that file in the config.
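
To make the 2:3 example concrete, here is a minimal sketch of what the mixing logic could look like. It is not an existing API: `mix_dataloaders` and `subbatches_per_step` are hypothetical names, the dataloaders are assumed to be plain iterables that yield list-like sub-batches, and stopping when any dataloader runs out is just one possible policy. Loading the function from a path given in the config is not shown.

```python
from typing import Dict, Iterable, Iterator, List


def mix_dataloaders(
    dataloaders: Dict[str, Iterable[List]],
    subbatches_per_step: Dict[str, int],
) -> Iterator[List]:
    """Yield one training batch per step by concatenating sub-batches.

    Hypothetical helper: at each step, pull `subbatches_per_step[name]`
    sub-batches from each named dataloader and merge them into one batch.
    """
    iterators = {name: iter(dl) for name, dl in dataloaders.items()}
    while True:
        step_batch: List = []
        for name, count in subbatches_per_step.items():
            for _ in range(count):
                try:
                    subbatch = next(iterators[name])
                except StopIteration:
                    # Assumed policy: end training data as soon as any
                    # dataloader is exhausted (partial batch is dropped).
                    return
                step_batch.extend(subbatch)
        yield step_batch


if __name__ == "__main__":
    # Toy stand-ins for real dataloaders: lists of sub-batches of size 2.
    dataloader1 = [[("math", i), ("math", i)] for i in range(4)]
    dataloader2 = [[("code", i), ("code", i)] for i in range(6)]
    ratio = {"dataloader1": 2, "dataloader2": 3}
    loaders = {"dataloader1": dataloader1, "dataloader2": dataloader2}
    for step, batch in enumerate(mix_dataloaders(loaders, ratio)):
        n_math = sum(1 for task, _ in batch if task == "math")
        n_code = sum(1 for task, _ in batch if task == "code")
        print(step, n_math, n_code)
```

Keeping the ratio as a per-dataloader sub-batch count (rather than a sampling probability) makes the per-step mix deterministic, which matches the 2:3 wording above. A probabilistic sampler would be an alternative if exact counts per step are not required.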