Very creative and interesting work. Could you open-source the training code for the reward model?