add Temp Scheduler #1624
base: main
Cursor Bugbot has reviewed your changes and found 1 potential issue.
hey @hallerite! nice PR, excited to be supporting temperature scheduling in the orchestrator. some quick comments:
Though with async I guess we can have multiple sampling temperatures within a batch for a run with
    self._responses_since_restart += 1
    if self._responses_since_restart >= 10 and self._restart_count > 0:
        logger.debug(f"Worker '{self.worker_name}' stable after {self._responses_since_restart} responses, resetting restart count")
        logger.debug(
not sure what is up with your ruff, did you use the pre-commit hook?
I did now
Note
Introduces scheduled sampling temperature and ensures it flows end-to-end from rollout to training.
- Adds `sampling.temperature_schedule` (constant/linear/cosine) with validation; documents it in `CHANGELOG.md`
- Computes the scheduled temperature (`compute_temperature`) and injects it into `get_sampling_args`; logs `sampling/temperature`
- `EnvWorker` now receives `sampling_args` and `temperature`; responses tag rollouts with `temperature`; `Scheduler.set_sampling_args(...)` updates the args and temperature used for requests
- `TrainingSample` gains an optional `temperature`; `interleave_rollout`/`branch_rollout` attach it
- `prepare_batch` now takes a `temperatures` list; packing sorts and groups microbatches by temperature to avoid mixing; Single/Multi packers propagate and group by temperature
- Uses `temperature` when computing logprobs/entropy

Written by Cursor Bugbot for commit 9ffb15e. This will update automatically on new commits.
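For readers skimming the summary, here is a rough sketch of what a constant/linear/cosine schedule and temperature-grouped batching could look like. It is not the PR's code. The signature of `compute_temperature`, the `start`/`end`/`max_steps` parameters, and the helper `group_by_temperature` are all assumptions made for illustration.

```python
import math
from itertools import groupby


def compute_temperature(step, max_steps, start, end, schedule="constant"):
    """Hypothetical signature; the PR's actual compute_temperature may differ."""
    if schedule == "constant":
        return start
    # Fraction of the run completed, clamped to [0, 1].
    progress = min(max(step / max(max_steps, 1), 0.0), 1.0)
    if schedule == "linear":
        return start + (end - start) * progress
    if schedule == "cosine":
        # Half-cosine interpolation: start at progress=0, end at progress=1.
        return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * progress))
    raise ValueError(f"unknown schedule: {schedule!r}")


def group_by_temperature(samples, temperatures):
    """Hypothetical helper: sort by temperature, then group, so that no
    group (microbatch) mixes samples drawn at different temperatures."""
    paired = sorted(zip(temperatures, samples), key=lambda p: p[0])
    return [
        (temp, [sample for _, sample in group])
        for temp, group in groupby(paired, key=lambda p: p[0])
    ]
```

Grouping matters because logprobs and entropy depend on the temperature the tokens were sampled at, so a microbatch that mixes temperatures cannot be rescaled with a single value.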