RL training #8

Open

Labels

enhancement, exploratory

justinchiu-cohere

opened

Longer term, we need to run RL experiments on minimization. This will require two things:

1. a good base policy
2. good RL code

We have to wait a bit for both.
