Hello~
Dear authors,
I'm interested in your pioneering work and wonder when you will release checkpoints of the pretrained models, including the world model and the base policy model, to support reproduction by other researchers.
As things stand, there is no way to reproduce your results without the pretrained models, right? Or is there a way to reproduce the experiments that I have overlooked? If so, could you kindly point me to it?
One more minor question: your RFT is based on the Base(15w) model, right? Will you release the weights of the Base(3w) model?
Best regards.