@@ -122,7 +122,20 @@ compute resources of the TensorFlow cluster for input processing.
122122### User-facing Python API
123123
124124This API is how users will interact with the tf.data service from their Python
125- code.
125+ code. The steps for distributed iteration over a dataset are:
126+
127+ 1. Create a dataset as usual.
128+ 2. Apply the `distribute` transformation to indicate that the dataset should be
129+ processed by the tf.data service.
130+ 3. Begin an *iteration* by calling `create_iteration`. An *iteration* is a
131+ single pass through the dataset. Multiple consumers can read from the same
132+ iteration, resulting in each consumer receiving a partition of the original
133+ dataset. We represent an iteration with an iteration id, which is generated
134+ by the tf.data service when you call `create_iteration`.
135+ 4. Share the iteration id with all consumer processes that are participating
136+ in the iteration.
137+ 5. Create per-consumer iterators using `make_iterator`, and use these iterators
138+ to read data from the tf.data service.
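
The steps above can be sketched with a toy, in-process model. Everything in it is a hypothetical stand-in rather than the tf.data service API: `ToyService`, its method signatures, and the round-robin partitioning are assumptions made for illustration. The proposal itself only states that each consumer receives a partition of the dataset.

```python
import itertools


class ToyService:
    """Hypothetical in-process stand-in for the tf.data service."""

    def __init__(self):
        self._iterations = {}
        self._next_id = itertools.count()

    def create_iteration(self, dataset, num_consumers):
        # Step 3: begin one pass over the dataset and return an id for it.
        iteration_id = next(self._next_id)
        self._iterations[iteration_id] = (list(dataset), num_consumers)
        return iteration_id

    def make_iterator(self, iteration_id, consumer_index=0):
        # Step 5: each consumer reads its own partition of the iteration.
        # Round-robin partitioning is an assumption of this sketch.
        elements, num_consumers = self._iterations[iteration_id]
        return iter(elements[consumer_index::num_consumers])


service = ToyService()
dataset = range(10)  # Step 1: create a dataset as usual.
# Step 2 (`distribute`) has no equivalent here because the toy service is local.
iteration_id = service.create_iteration(dataset, num_consumers=3)
# Step 4: in a real job the id would be broadcast to the other consumers.
partitions = [list(service.make_iterator(iteration_id, i)) for i in range(3)]
print(partitions)
```

In this toy model the three partitions are disjoint and together cover the dataset, which is the property the iteration id is meant to coordinate across consumers.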
126139
127140``` python
128141def tf.data.experimental.service.distribute(address):
@@ -158,7 +171,7 @@ def tf.data.experimental.service.create_iteration(
158171 # The iteration object is a byte array which needs to be shared among all
159172 # consumers. Here we suppose there are broadcast_send and broadcast_recv
160173 # methods available.
161- iteration_id = tf.data.experimental.service.create_iteration(ds, address, 3)
174+ iteration_id = tf.data.experimental.service.create_iteration(ds, 3)
162175 broadcast_send(iteration_id)
163176 else:
164177 iteration_id = broadcast_recv()
@@ -170,10 +183,7 @@ def tf.data.experimental.service.create_iteration(
170183 Args:
171184 dataset: The dataset to begin iteration over.
172185 num_consumers: The number of consumers to divide the dataset between. Set
173- this if you require determinism. If None, a single iterator id is returned,
174- and any number of consumers can read from that iterator id. The data
175- produced by the dataset will be fed to consumers on a first-come
176- first-served basis.
186+ this if you require determinism.
177187 num_tasks: The number of tasks to use for processing. Tasks run for
178188 the duration of an epoch, and each worker should typically process a single
179189 task. Normally it is best to leave this as None so that the master can
@@ -190,7 +200,7 @@ def tf.data.experimental.service.create_iteration(
190200 """
191201
192202def tf.data.experimental.service.make_iterator(
193- dataset, iteration, consumer_index):
203+ dataset, iteration, consumer_index=0):
194204 """Creates an iterator for reading from the specified dataset.
195205
196206 Args:
@@ -343,7 +353,7 @@ list<int> CreateIterators(int dataset_id, int num_consumers,
343353
344354// Returns the list of tasks processing data for `iterator_id`. Consumers query
345355// this to find which worker addresses to read data from.
346- list<TaskInfo> GetWorkersForiterator (int iterator_id);
356+ list<TaskInfo> GetWorkersForIterator(int iterator_id);
347357
348358///---- Methods called by input workers ----
349359
@@ -376,7 +386,7 @@ list<Tensors> GetElement(iterator_id);
376386void ProcessDataset(int dataset_id, int iteration_id, list<int> iterator_ids);
377387```
378388
379- #### Visitation Guarantee
389+ #### Visitation Guarantees
380390
381391When iterating over a deterministic dataset, the tf.data service will process
382392all input data exactly once, even in the presence of master or worker failures.
@@ -406,10 +416,9 @@ service will provide determinism.
406416To get deterministic behavior, the tf.data service will require four things:
407417
4084181. The dataset being distributed has deterministic output.
409- 1. The user sets `deterministic=True` when calling
410- `tf.data.experimental.service.create_iteration`.
411- 1. The user specifies how many input tasks to use when calling
412- `tf.data.experimental.service.create_iteration`.
419+ 1. The user sets `num_consumers`, `num_tasks`, and `deterministic=True` when
420+ calling `tf.data.experimental.service.create_iteration`.
421+ 1. Each consumer uses a unique `consumer_index` when calling `make_iterator`.
4134221. The consumers do not fail.
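
To illustrate why each consumer needs a unique `consumer_index`, here is a minimal sketch. It assumes a round-robin partitioning scheme, which the proposal does not specify; `read_partition` is a hypothetical helper, and the sketch does not model tasks, splits, or failures.

```python
def read_partition(elements, num_consumers, consumer_index):
    """Hypothetical helper: elements one consumer sees under round-robin."""
    return list(elements)[consumer_index::num_consumers]


elements = range(7)
num_consumers = 2

# Unique consumer indices: every element is read exactly once, and repeating
# the run yields the same per-consumer sequences.
first_run = [read_partition(elements, num_consumers, i)
             for i in range(num_consumers)]
second_run = [read_partition(elements, num_consumers, i)
              for i in range(num_consumers)]

# Both consumers reuse index 0: they read the same elements and the rest of
# the dataset is never visited.
clashing = [read_partition(elements, num_consumers, 0)
            for _ in range(num_consumers)]
missed = sorted(set(elements) - set(sum(clashing, [])))
print(first_run == second_run, missed)
```

Under these assumptions, a duplicated index breaks both the exactly-once property and the reproducibility of each consumer's sequence.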
414423
415424In the absence of failures, determinism is achieved by distributing splits