This repository was archived by the owner on Jul 10, 2025. It is now read-only.
rfcs/20201121-keras-model-fit-ps.md (3 additions, 3 deletions)
@@ -186,7 +186,7 @@ For compatibility with other strategies, we propose that `dataset_fn` (which the
##### `Model` abstracting the concept of `ClusterCoordinator` for `model.fit`
-To take advantage of TF2 support of parameter server training, a `ClusterCoordinator` should be created for handling asynchronous function scheduling and joining. The preferred route should be that such an object is abstracted away from the user by `model.fit` training API as an implementation detail. For the power users who would need a `ClusterCoordinator` instance for their custom `schedule`s and `join`s, the `ClusterCoordinator` instance is available as a singleton through a constructor call. See below "`ClusterCoordinator` as a singleton" section for more information.
+To take advantage of TF2 support of parameter server training, a `ClusterCoordinator` should be created for handling asynchronous function scheduling and joining. The preferred route should be that such an object is abstracted away from the user by `model.fit` training API as an implementation detail. For the power users who would need a `ClusterCoordinator` instance for their custom `schedule`s and `join`s, the `ClusterCoordinator` instance is available through a constructor call. See below "`ClusterCoordinator` as a single instance to `Strategy`" section for more information.
`ClusterCoordinator` instance can be created at any point prior to `Model`'s use of it, but `model.fit` seems a natural place since that indicates the user's intention for using the compile-fit API as opposed to a CTL, where we expect users to create one.
@@ -483,7 +483,7 @@ We propose that `ParameterServerStrategy` has an attribute `should_use_with_coor
self.should_use_with_coordinator = True
```
-#### `ClusterCoordinator` as a singleton
+#### `ClusterCoordinator` as a single instance to `Strategy`
Since a `ClusterCoordinator` instance spins off worker and failure handling threads, there should only be one `ClusterCoordinator` at any given time with a `strategy` instance, and making it a singleton ensures that those threads are only created once. The singleton is accessible through a constructor call:
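Reviewer's sketch (not part of the diff or of the RFC text): the renamed heading describes one coordinator per strategy rather than a process-wide singleton. The snippet below illustrates that pattern in plain Python under stated assumptions. `SketchStrategy`, `SketchCoordinator`, `_coordinator`, and `start_count` are hypothetical names invented for this illustration; they are not TensorFlow APIs, and the RFC's own constructor example lies outside this hunk and is not reproduced here.

```python
import threading


class SketchStrategy:
    """Hypothetical stand-in for a strategy object (not a TensorFlow class)."""

    def __init__(self):
        # Slot that will hold the single coordinator tied to this strategy.
        self._coordinator = None


class SketchCoordinator:
    """Hypothetical coordinator kept as a single instance per strategy."""

    _lock = threading.Lock()

    def __new__(cls, strategy):
        # Reuse the coordinator already attached to this strategy, if any.
        with cls._lock:
            if strategy._coordinator is None:
                instance = super().__new__(cls)
                instance.start_count = 0
                strategy._coordinator = instance
            return strategy._coordinator

    def __init__(self, strategy):
        # __init__ runs on every constructor call, so guard the one-time setup.
        with self._lock:
            if self.start_count:
                return
            # Placeholder for starting worker and failure-handling threads.
            self.start_count += 1
            self.strategy = strategy


strategy_a = SketchStrategy()
strategy_b = SketchStrategy()

first = SketchCoordinator(strategy_a)
second = SketchCoordinator(strategy_a)  # same strategy, same object
other = SketchCoordinator(strategy_b)   # different strategy, new object
```

Under these assumptions, repeated constructor calls with the same strategy return the same object and the one-time setup runs once, while a different strategy gets its own coordinator. That is the distinction the heading change from "singleton" to "single instance to `Strategy`" appears to draw.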
@@ -662,7 +662,7 @@ Tests to verify that training with `model.fit` can withstand worker or PS unavai
* Design doc (ETA: mid/late-Nov)
* Schedule design review (ETA: Early Dec)
* Code check-in with explicit opt-in. (ETA: Early-Mid Dec)
-* User model testing (ETA: Dec)
+* User model testing with opt-in (ETA: Dec)
* Aligned design with approvals on this doc (ETA: End of Dec)
* Demonstrable working prototype with checked in test or model (ETA: End of Dec)