Skip to content

Conversation

@azahed98
Copy link
Contributor

Have you read the Contributing Guidelines?
Yes

Issue # ENG-23270

Adds support for Cosine LR Scheduler in the fine-tuning api as well as simplifies future support of additional schedulers (https://github.com/togethercomputer/together-finetune/pull/1349)

@mryab mryab requested review from VProv and removed request for mryab March 18, 2025 21:16
"--num-cycles",
type=float,
default=0.5,
help="Number of cycles for cosine learning rate scheduler.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
help="Number of cycles for cosine learning rate scheduler.",
help="Number of cycles for the cosine learning rate scheduler.",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe also add what fractional values mean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to "Number or fraction of cycles".

lr_scheduler_args=FinetuneLinearLRSchedulerArgs(min_lr_ratio=min_lr_ratio),
)
if lr_scheduler_type == "cosine":
if num_cycles <= 0.0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe also add some meaningful upperbound?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard to know what a reasonable upperbound would be without knowing the number of steps afaik. I think it makes sense to follow the hf implementation and let the cosine alias if the user inputs something unreasonably large for their job.

learning_rate: float | None = 0.00001,
lr_scheduler_type: Literal["linear", "cosine"] = "linear",
min_lr_ratio: float = 0.0,
num_cycles: float = 0.5,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call it scheduler_num_cycles for clarity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backend expects the field in the request to be num_cycles, but in the scheduler args. Would just changing the CLI arg be ok?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed CLI arg to scheduler_num_cycles. Worth noting FinetuneCosineLRSChedulerArgs still has the field as num_cycles

@azahed98 azahed98 merged commit 10ad24d into main Mar 25, 2025
9 checks passed
@azahed98 azahed98 deleted the arsh/ft-cosine-lr branch March 25, 2025 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants