@@ -622,9 +622,22 @@ service. We will also provide a tutorial for using the tf.data service.
 * How should we communicate that distributing a dataset will change the order
   in which elements are processed? If users' datasets rely on elements being
   processed in a certain order, they could face unpleasant surprises.
-* Should we support splitting `skip` and `take` by having them operate at a
-  per-task level (skip or take the first `N` elements within each task)?
-* Is there a more user-friendly way to share iteration data across consumers?
+  - Current plan is to address this through documentation.
+* Should we support splitting `skip`, `take`, and `scan` by having them
+  operate at a per-task level (e.g. skip or take the first `N` elements within
+  each task)?
+  - Leaning towards supporting these operations at a per-task level. This is
+    consistent with how skip/take/scan behave today when using distribution
+    strategies to distribute a dataset.
+* Is there a more user-friendly way to share iteration ids across consumers?
   Distribution strategy is well-equipped with collective ops to share the
-  iteration data, but sharing the iteration data could be a heavy burden for
+  iteration ids, but sharing the iteration id could be a heavy burden for
   some users.
+  - Distributing iteration ids is simple in the common case where a single
+    process builds the graph. If users are advanced enough to do distributed
+    training without distribution strategies, they will likely have a
+    different mechanism available for distributing iteration ids.
+* Can `service.distribute` take a `ClusterResolver` so that the master
+  hostname isn't baked into the dataset definition?
+  - We can achieve this by having the `distribute` transformation take a
+    master_address_or_resolver.