This repository was archived by the owner on Jul 10, 2025. It is now read-only.

Commit 4c005e7

Update discussion topics
1 parent afec5f9 commit 4c005e7

1 file changed: +17 -4 lines changed
rfcs/20200113-tf-data-service.md

Lines changed: 17 additions & 4 deletions
@@ -622,9 +622,22 @@ service. We will also provide a tutorial for using the tf.data service.
 * How should we communicate that distributing a dataset will change the order
   in which elements are processed? If users' datasets rely on elements being
   processed in a certain order, they could face unpleasant surprises.
-* Should we support splitting `skip` and `take` by having them operate at a
-  per-task level (skip or take the first `N` elements within each task)?
-* Is there a more user-friendly way to share iteration data across consumers?
+  - Current plan is to address this through documentation.
+* Should we support splitting `skip`, `take`, and `scan` by having them
+  operate at a per-task level (e.g. skip or take the first `N` elements within
+  each task)?
+  - Leaning towards supporting these operations at a per-task level. This is
+    consistent with how skip/take/scan behave today when using distribution
+    strategies to distribute a dataset.
+* Is there a more user-friendly way to share iteration ids across consumers?
   Distribution strategy is well-equipped with collective ops to share the
-  iteration data, but sharing the iteration data could be a heavy burden for
+  iteration ids, but sharing the iteration id could be a heavy burden for
   some users.
+  - Distributing iteration ids is simple in the common case where a single
+    process builds the graph. If users are advanced enough to do distributed
+    training without distribution strategies, they will likely have a
+    different mechanism available for distributing iteration ids.
+* Can `service.distribute` take a `ClusterResolver` so that the master
+  hostname isn't baked into the dataset definition?
+  - We can achieve this by having the `distribute` transformation take a
+    master_address_or_resolver.
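
The per-task `skip`/`take` semantics discussed in this change can be illustrated with a minimal sketch in plain Python. It does not use the tf.data service API. The helper names (`split_across_tasks`, `per_task_take`, `per_task_skip`) and the round-robin assignment of elements to tasks are assumptions made only for illustration; the RFC does not say how elements are assigned to tasks.

```python
from itertools import islice


def split_across_tasks(elements, num_tasks):
    """Hypothetical stand-in for how a service might divide data among tasks.

    Round-robin assignment is an assumption for this sketch only.
    """
    return [elements[i::num_tasks] for i in range(num_tasks)]


def per_task_take(elements, num_tasks, n):
    """Apply take(n) independently within each task."""
    return [list(islice(shard, n))
            for shard in split_across_tasks(elements, num_tasks)]


def per_task_skip(elements, num_tasks, n):
    """Apply skip(n) independently within each task."""
    return [list(islice(shard, n, None))
            for shard in split_across_tasks(elements, num_tasks)]


elements = list(range(10))
taken = per_task_take(elements, num_tasks=2, n=3)
skipped = per_task_skip(elements, num_tasks=2, n=3)

print(taken)    # [[0, 2, 4], [1, 3, 5]]
print(skipped)  # [[6, 8], [7, 9]]
```

Under these assumptions, `take(3)` with two tasks yields six elements in total rather than the three a non-distributed `take(3)` would produce, and the elements a consumer sees no longer follow the original order. Both effects would need to be documented for users.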
