grpo #10

@cangyi071

Description

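Are there more details available about the GRPO training, such as how the data was synthesized and how the reward function was designed?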
