@shantanugupta2004 shantanugupta2004 commented Dec 14, 2025

What does this PR do?

This PR fixes an issue where using accelerate with Weights & Biases (W&B) in offline mode initialized duplicate W&B runs. A single training process ended up with two "offline-run-..." directories, which is unintended and redundant.
The problem stemmed from the WandBTracker.store_init_configuration method. In offline mode, this method explicitly called wandb.init() again to include the run's configuration, even though wandb.init() had already been called by WandBTracker.start(). This redundant initialization created a new W&B run, duplicating the logging process.
This PR removes the second, offline-mode-specific wandb.init() call from WandBTracker.store_init_configuration. The method now consistently uses wandb.config.update(values, allow_val_change=True) to update the run's configuration, which adds the configuration to the existing W&B run without triggering a new initialization. As a result, only a single W&B run is initialized and maintained throughout training in offline mode, leading to cleaner W&B directories and more accurate run management.
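
For illustration, here is a minimal sketch of the pattern described above: initialize the run once in start(), then update the existing run's config instead of initializing again. It is not the actual accelerate code. FakeWandb, FakeConfig, and SketchTracker are hypothetical stand-ins so the example runs without wandb installed, and only the call shape config.update(values, allow_val_change=True) is taken from this PR description.

```python
class FakeConfig(dict):
    """Hypothetical stand-in for wandb.config; mimics only update(..., allow_val_change=...)."""

    def update(self, values, allow_val_change=False):
        for key, value in values.items():
            # Refuse to overwrite an existing key unless the caller opts in.
            if key in self and self[key] != value and not allow_val_change:
                raise ValueError(f"refusing to change existing key {key!r}")
            self[key] = value


class FakeWandb:
    """Hypothetical stand-in for the wandb module; counts how often init() runs."""

    def __init__(self):
        self.init_calls = 0
        self.config = FakeConfig()

    def init(self, project, config=None):
        # Each call represents one new run (one "offline-run-..." directory).
        self.init_calls += 1
        self.config = FakeConfig(config or {})
        return self


class SketchTracker:
    """Simplified tracker (assumption: not the real WandBTracker)."""

    def __init__(self, wandb, project):
        self.wandb = wandb
        self.project = project
        self.run = None

    def start(self):
        # The only place where a run is initialized.
        self.run = self.wandb.init(project=self.project)

    def store_init_configuration(self, values):
        # Update the existing run's config; no second init() call.
        self.wandb.config.update(values, allow_val_change=True)


fake = FakeWandb()
tracker = SketchTracker(fake, "demo")
tracker.start()
tracker.store_init_configuration({"lr": 1e-3})
tracker.store_init_configuration({"lr": 5e-4, "epochs": 3})
```

In this sketch, init_calls stays at 1 however many times store_init_configuration is called, which mirrors the single-run behavior the fix aims for.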

Fixes #3818

@SunMarc SunMarc left a comment

Thanks, let's revert for now

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc SunMarc merged commit 16b6b3f into huggingface:main Dec 16, 2025
25 checks passed

Development

Successfully merging this pull request may close these issues.

Duplicate W&B initialization in offline mode