Commit e8dcece

Clean up model update group on worker exit (#5325)
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
1 parent 8e6e062 commit e8dcece

1 file changed: +9 -0 lines changed

trl/experimental/async_grpo/async_rollout_worker.py

Lines changed: 9 additions & 0 deletions
@@ -266,6 +266,15 @@ def _run(self) -> None:
             raise
         finally:
             loop.close()
+            self._destroy_model_update_group()
+
+    def _destroy_model_update_group(self) -> None:
+        # It's important because otherwise we get errors on exit.
+        if self.model_update_group is None:
+            return  # happens if weight transfer was never initialized
+        self.model_update_group.group.store = None
+        self.model_update_group.group.socket = None
+        self.model_update_group = None
 
     def pause(self) -> None:
         t0 = time.time()
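
The cleanup pattern in this diff can be tried in isolation. The sketch below is a minimal, self-contained illustration and not the TRL implementation: `FakeGroup`, `FakeUpdateGroup`, `Worker`, and `run` are hypothetical stand-ins, and only the attribute names `model_update_group`, `group.store`, and `group.socket` are taken from the diff. It assumes the cleanup call sits inside the `finally` block, so it runs whether the loop finishes normally or raises.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeGroup:
    # Hypothetical placeholders for the store and socket handles in the diff.
    store: Optional[object] = field(default_factory=object)
    socket: Optional[object] = field(default_factory=object)


@dataclass
class FakeUpdateGroup:
    # Hypothetical wrapper exposing a `.group` attribute, as the diff does.
    group: FakeGroup = field(default_factory=FakeGroup)


class Worker:
    def __init__(self, init_transfer: bool) -> None:
        # None models the case where weight transfer was never initialized.
        self.model_update_group: Optional[FakeUpdateGroup] = (
            FakeUpdateGroup() if init_transfer else None
        )

    async def _loop(self) -> None:
        await asyncio.sleep(0)  # stand-in for the real rollout loop

    def run(self) -> None:
        loop = asyncio.new_event_loop()
        try:
            loop.run_until_complete(self._loop())
        finally:
            loop.close()
            # Runs on both normal exit and exceptions.
            self._destroy_model_update_group()

    def _destroy_model_update_group(self) -> None:
        if self.model_update_group is None:
            return  # nothing to release; also makes repeated calls safe
        # Drop the handles first, then the group reference itself.
        self.model_update_group.group.store = None
        self.model_update_group.group.socket = None
        self.model_update_group = None
```

Because of the early return on `None`, calling the cleanup a second time is a no-op, which matters if the worker can exit through more than one path.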
