1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -37,4 +37,5 @@ Documenting changes which affect configuration usage patterns (added/moved/remov
 - **`orchestrator.env.log`**: Removed. Use `orchestrator.log` for env worker logging instead (2026-01-15)
 - **`orchestrator.eval.retry.reraise`**: Changed default from `True` to `False`. When `False`, raises `tenacity.RetryError` after retries are exhausted instead of the original exception, allowing failed eval environments to be skipped with a warning (#1586, 2026-01-14)
 - **`model.ep`**: Expert parallelism now supported (with auto/custom impl only); changed from the old behaviour, where `ep>1` was a no-op, to proper parallelization of the MoE layers (#1595, 2026-01-15)
+- **`orchestrator.reload_weights_on_start`**: Added flag to control resetting inference weights to the base model when starting from scratch (default: `True`) (2026-01-21)
 - **`orchestrator.client.elastic`**: Added elastic inference pool with DNS-based service discovery. Supports dynamic server scaling via any DNS hostname with multiple A records (Kubernetes headless services, Consul, Route53, etc.). Automatically syncs LoRA adapters on new servers and only exposes ready servers to workers (#1617, 2026-01-19)
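The DNS-based discovery in the `orchestrator.client.elastic` entry boils down to resolving every A record behind one hostname and treating each address as a server. A minimal sketch of that idea, assuming an injectable `resolve` hook and an `http://ip:port` URL scheme (both hypothetical; the real elastic pool's API may differ):

```python
import socket
from typing import Callable, Optional


def discover_servers(
    hostname: str,
    port: int,
    resolve: Optional[Callable[[str], list[str]]] = None,
) -> list[str]:
    """Resolve all A records for `hostname` and build one URL per address.

    Hypothetical sketch of DNS-based service discovery; `resolve` is
    injectable so the lookup can be stubbed without touching the network.
    """
    if resolve is None:
        def resolve(host: str) -> list[str]:
            # getaddrinfo yields (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IPv4 address. Deduplicate and sort for
            # a stable server ordering across polls.
            infos = socket.getaddrinfo(host, None, family=socket.AF_INET)
            return sorted({info[4][0] for info in infos})
    return [f"http://{ip}:{port}" for ip in resolve(hostname)]
```

With a Kubernetes headless service, each pod contributes one A record, so re-resolving the hostname is enough to notice servers joining or leaving the pool.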
8 changes: 8 additions & 0 deletions src/prime_rl/orchestrator/config.py
@@ -647,6 +647,14 @@ class OrchestratorConfig(BaseSettings):
     # The checkpoint configuration
     ckpt: CheckpointConfig | None = None

+    # Whether to reset inference weights to base model when starting from scratch
+    reload_weights_on_start: Annotated[
+        bool,
+        Field(
+            description="Whether to reset inference weights to the base model when starting from scratch."
+        ),
+    ] = True
+
     # The validation configuration
     val: ValConfig | None = None

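The `Annotated[bool, Field(...)]` pattern above rides the description along as type metadata, which pydantic collects at class-definition time. A stdlib-only sketch of the same mechanism, with a hypothetical `Desc` marker standing in for pydantic's `Field`:

```python
from typing import Annotated, get_type_hints


class Desc:
    """Hypothetical stand-in for pydantic's Field metadata object."""

    def __init__(self, description: str) -> None:
        self.description = description


class ConfigSketch:
    # The default lives on the class; the metadata rides inside Annotated.
    reload_weights_on_start: Annotated[
        bool, Desc("Reset inference weights to the base model on a fresh start.")
    ] = True


# include_extras=True keeps the Annotated wrapper, so the metadata
# objects are reachable via __metadata__.
hints = get_type_hints(ConfigSketch, include_extras=True)
meta = hints["reload_weights_on_start"].__metadata__[0]
```

This is why the field's description survives introspection (e.g. for generated docs or `--help` output) without affecting the runtime value, which stays a plain `bool`.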
11 changes: 8 additions & 3 deletions src/prime_rl/orchestrator/orchestrator.py
@@ -234,9 +234,14 @@ async def orchestrate(config: OrchestratorConfig):
         lora_name = config.model.lora.name if config.model.lora else None
         await inference_pool.update_weights(weights_path, lora_name=lora_name, step=scheduler.ckpt_step)
     else:
-        logger.info("Training from scratch. Resetting weights to base model")
-        if config.model.lora is None:
-            await reload_weights(admin_clients)
+        if config.reload_weights_on_start:
+            if config.model.lora is None:
+                logger.info("Training from scratch. Resetting weights to base model")
+                await reload_weights(admin_clients)
+            else:
+                logger.info("Training from scratch. Skipping base weight reload because LoRA is enabled")
+        else:
+            logger.info("Training from scratch. Skipping base weight reload")

# Iterate over dataset in batches
max_steps = config.max_steps or int(1e9)
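The startup branching above reduces to a small decision table: resume wins, then the new flag, then LoRA. A hypothetical pure-function distillation (names are illustrative, not the orchestrator's API; the real code awaits async helpers instead of returning labels):

```python
def startup_weight_action(
    resuming: bool, reload_on_start: bool, lora_enabled: bool
) -> str:
    """Return which weight action the orchestrator takes on startup."""
    if resuming:
        # A checkpoint exists: push its weights to the inference pool.
        return "update_from_checkpoint"
    if not reload_on_start:
        # reload_weights_on_start=False: user opted out of the reset.
        return "skip"
    if lora_enabled:
        # Under LoRA the base weights were never modified, so there is
        # nothing to reset.
        return "skip"
    return "reload_base_weights"
```

Note that `reload_weights_on_start` only matters on a fresh start; when resuming from a checkpoint, the checkpoint weights are pushed regardless of the flag.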