Hi, thank you for the great work and code release.
I noticed that in your code, the agent is pretrained on the walker-walk task for 10 million steps. However, judging by the evaluation rewards, the agent appears to converge long before the 10M-step mark.
So I tried fine-tuning after pretraining for only 500k steps (roughly as in the sketch below), but the downstream performance was significantly worse.
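To be concrete, this is a minimal sketch of the setup I mean; the `Agent` class, checkpoint path, and dimensions are placeholders I made up for illustration, not your actual classes or files:

```python
import torch
import torch.nn as nn

# Placeholder path to a checkpoint saved after 500k pretraining steps
# (hypothetical; your repo's checkpoint naming and format will differ).
PRETRAIN_CKPT = "checkpoints/walker_walk_500k.pt"

class Agent(nn.Module):
    """Stand-in for the repo's agent: a shared encoder plus an actor head."""
    def __init__(self, obs_dim: int = 24, action_dim: int = 6, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.actor = nn.Linear(hidden, action_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.actor(self.encoder(obs)))

# Load the 500k-step pretrained weights and fine-tune *all* parameters
# on the downstream task; nothing is frozen in my attempt.
agent = Agent()
agent.load_state_dict(torch.load(PRETRAIN_CKPT, map_location="cpu"))
optimizer = torch.optim.Adam(agent.parameters(), lr=1e-4)
```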
Is there a reason why fine-tuning works better after the full 10M-step pretraining?
Is it feasible to use a model pretrained for fewer steps (e.g., 1M or 2M) for fine-tuning without a significant drop in downstream performance? Or is the full pretraining necessary for good transferability?
Also, would it be possible to get access to the pretrained model weights?
Thanks in advance!
