Add Cosine LR Scheduler for Fine-Tuning #273
Conversation
src/together/cli/api/finetune.py
```diff
- help="Number of cycles for cosine learning rate scheduler.",
+ help="Number of cycles for the cosine learning rate scheduler.",
```
nit: Maybe also add what fractional values mean.
Updated to "Number or fraction of cycles".
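For reference, the fractional behavior falls out of the cosine formula itself. A minimal sketch of the HF-style multiplier (warmup omitted; the function name is illustrative, not the SDK's):

```python
import math

def cosine_lr_multiplier(step: int, total_steps: int, num_cycles: float) -> float:
    """Multiplier applied to the peak LR at a given step (warmup omitted)."""
    progress = step / max(1, total_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(2.0 * math.pi * num_cycles * progress)))

# num_cycles=0.5 is a single half-cosine: the LR decays from peak to zero.
print(cosine_lr_multiplier(100, 100, 0.5))   # -> 0.0
# num_cycles=0.25 stops a quarter of the way through a period, ending at half peak.
print(cosine_lr_multiplier(100, 100, 0.25))  # -> 0.5
# num_cycles=1.0 completes a full period, returning to the peak LR at the end.
print(cosine_lr_multiplier(100, 100, 1.0))   # -> 1.0
```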
src/together/resources/finetune.py
nit: maybe also add some meaningful upper bound?
Hard to know what a reasonable upper bound would be without knowing the number of steps. I think it makes sense to follow the HF implementation and let the cosine alias if the user inputs something unreasonably large for their job.
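To illustrate the aliasing point: sampled only at integer steps, any num_cycles above total_steps / 2 is indistinguishable from a slower schedule, so oversized values fold back into the representable range rather than misbehave. A toy check (names illustrative):

```python
import math

def multiplier(step: int, total_steps: int, num_cycles: float) -> float:
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * num_cycles * step / total_steps))

# With 100 steps the Nyquist limit is 50 cycles; 60 cycles aliases to 40,
# since cos(2*pi*0.6*n) == cos(2*pi*0.4*n) for every integer n.
assert all(
    abs(multiplier(n, 100, 60) - multiplier(n, 100, 40)) < 1e-9
    for n in range(101)
)
```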
src/together/resources/finetune.py
Let's call it scheduler_num_cycles for clarity?
The backend expects the field in the request to be num_cycles, nested inside the scheduler args. Would changing just the CLI arg be OK?
Changed the CLI arg to scheduler_num_cycles. Worth noting that FinetuneCosineLRSchedulerArgs still has the field as num_cycles.
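For readers following along, a minimal sketch of the resulting mapping; the payload shape and helper are assumptions, and only num_cycles, scheduler_num_cycles, and FinetuneCosineLRSchedulerArgs come from the thread:

```python
from dataclasses import dataclass

@dataclass
class FinetuneCosineLRSchedulerArgs:
    # Keeps the short name: it is already namespaced under the scheduler
    # args in the request, so no prefix is needed at this level.
    num_cycles: float = 0.5

def build_scheduler_args(scheduler_num_cycles: float) -> dict:
    # The user-facing name carries the scheduler_ prefix for clarity;
    # it maps onto the nested num_cycles field the backend expects.
    args = FinetuneCosineLRSchedulerArgs(num_cycles=scheduler_num_cycles)
    return {
        "lr_scheduler": {
            "lr_scheduler_type": "cosine",  # assumed discriminator field
            "lr_scheduler_args": {"num_cycles": args.num_cycles},
        }
    }

print(build_scheduler_args(0.5))
```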
Have you read the Contributing Guidelines?
Yes
Issue # ENG-23270
Adds support for a cosine LR scheduler in the fine-tuning API and simplifies future support for additional schedulers (https://github.com/togethercomputer/together-finetune/pull/1349).
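A hedged usage sketch from the SDK side; the file ID and model are placeholders, and the keyword names are assumed to follow the naming discussed in this PR:

```python
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

# lr_scheduler_type / scheduler_num_cycles are assumed keyword arguments
# here, mirroring the CLI option agreed on above.
job = client.fine_tuning.create(
    training_file="file-abc123",         # placeholder file ID
    model="meta-llama/Meta-Llama-3-8B",  # placeholder model name
    lr_scheduler_type="cosine",
    scheduler_num_cycles=0.5,            # single half-cosine decay to zero
)
print(job.id)
```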