Add checkpoint_name_prefix to RL training Config#192
pourion wants to merge 1 commit into thinking-machines-lab:main
Conversation
Pull request overview
This PR adds an optional checkpoint_name_prefix field to the RL training Config class to improve checkpoint identification when running multiple experiments. The prefix is prepended to the batch number when saving checkpoints, making it easier to match checkpoints with their corresponding experiments.
- Added `checkpoint_name_prefix` field to the `Config` dataclass with a default value of `None`
- Updated checkpoint saving logic to use the prefix when constructing checkpoint names
- Updated all call sites to pass the prefix parameter through the call chain
tinker_cookbook/rl/train.py
Outdated
```diff
@@ -770,7 +776,7 @@ async def compute_full_batch_metrics_and_get_sampling_client(
     # Get a sampling client using the new weights
     sampling_client, checkpoint_metrics = await save_checkpoint_and_get_sampling_client(
-        training_client, i_batch, log_path, save_every
+        training_client, i_batch, log_path, save_every, checkpoint_name_prefix=checkpoint_name_prefix
```
Inconsistent parameter passing style. This call uses a keyword argument for checkpoint_name_prefix, while other calls to save_checkpoint_and_get_sampling_client in this PR use positional arguments (see lines 332, 911, 959, 982). For consistency and clarity, consider using the same style throughout. Either use positional arguments or keyword arguments consistently for the new parameter.
```diff
-        training_client, i_batch, log_path, save_every, checkpoint_name_prefix=checkpoint_name_prefix
+        training_client, i_batch, log_path, save_every, checkpoint_name_prefix
```
Force-pushed from ad82e0b to 14e71dc.
Adds an optional `checkpoint_name_prefix` field to the Config class that
prefixes all checkpoint names. This makes it easier to identify checkpoints
on the Tinker platform by experiment/run name.
Example usage:

```python
Config(
    checkpoint_name_prefix="my_experiment_dec19",
    ...
)
```
Results in checkpoints named like:
- my_experiment_dec19_000020
- my_experiment_dec19_000040
- my_experiment_dec19_final
Instead of:
- 000020
- 000040
- final
Force-pushed from 14e71dc to 778ffd4.
Summary

Adds an optional `checkpoint_name_prefix` field to the RL training Config that prefixes all saved checkpoint names.

Motivation

When running multiple experiments, checkpoints saved to Tinker are difficult to identify because they're named only by batch number (e.g., `000042`). This change allows prefixing with an experiment identifier (e.g., `my_exp_dec19_000042`), making it easy to match checkpoints to WandB runs.

Changes

- Added `checkpoint_name_prefix: str | None = None` to the `Config` class
- Updated `save_checkpoint_and_get_sampling_client` to accept and use the prefix

Usage
```python
Config(
    checkpoint_name_prefix="exp_rl_dec19",
    wandb_name="exp_rl_dec19",
    ...
)
```
Backward Compatibility
Fully backward compatible: the field defaults to `None`, preserving existing behavior.