This repository was archived by the owner on Jul 10, 2025. It is now read-only.

Commit 4b55d4e

Fix errors and add more description.
1 parent a6e7227 commit 4b55d4e

1 file changed

+22
-13
lines changed

rfcs/20200113-tf-data-service.md

Lines changed: 22 additions & 13 deletions
@@ -122,7 +122,20 @@ compute resources of the TensorFlow cluster for input processing.
 ### User-facing Python API
 
 This API is how users will interact with the tf.data service from their Python
-code.
+code. The steps for distributed iteration over a dataset are
+
+1. Create a dataset like usual.
+2. Apply the `distribute` transformation to indicate that the dataset should be
+processed by the tf.data service.
+3. Begin an *iteration* by calling `create_iteration`. An *iteration* is a
+single pass through the dataset. Multiple consumers can read from the same
+iteration, resulting in each consumer receiving a partition of the original
+dataset. We represent an iteration with an iteration id, which is generated
+by the tf.data service when you call `create_iteration`.
+4. Share the iteration id with all consumer processes which are participating
+in the iteration.
+5. Create per-consumer iterators using `make_iterator`, and use these iterators
+to read data from the tf.data service.
 
 ```python
 def tf.data.experimental.service.distribute(address):
@@ -158,7 +171,7 @@ def tf.data.experimental.service.create_iteration(
 # The iteration object is a byte array which needs to be shared among all
 # consumers. Here we suppose there are broadcast_send and broadcast_recv
 # methods available.
-iteration_id = tf.data.experimental.service.create_iteration(ds, address, 3)
+iteration_id = tf.data.experimental.service.create_iteration(ds, 3)
 broadcast_send(iteration_id)
 else:
 iteration_id = broadcast_recv()
@@ -170,10 +183,7 @@ def tf.data.experimental.service.create_iteration(
 Args:
 dataset: The dataset to begin iteration over.
 num_consumers: The number of consumers to divide the dataset between. Set
-this if you require determinism. If None, a single iterator id is returned,
-and any number of consumers can read from that iterator id. The data
-produced by the dataset will be fed to consumers on a first-come
-first-served basis.
+this if you require determinism.
 num_tasks: The number of tasks to use for processing. Tasks run for
 the duration of an epoch, and each worker should typically process a single
 task. Normally it is best to leave this as None so that the master can
@@ -190,7 +200,7 @@ def tf.data.experimental.service.create_iteration(
 """
 
 def tf.data.experimental.service.make_iterator(
-dataset, iteration, consumer_index):
+dataset, iteration, consumer_index=0):
 """Creates an iterator for reading from the specified dataset.
 
 Args:
@@ -343,7 +353,7 @@ list<int> CreateIterators(int dataset_id, int num_consumers,
 
 // Returns the list of tasks processing data for `iterator_id`. Consumers query
 // this to find which worker addresses to read data from.
-list<TaskInfo> GetWorkersForiterator(int iterator_id);
+list<TaskInfo> GetWorkersForIterator(int iterator_id);
 
 ///---- Methods called by input workers ----
 
@@ -376,7 +386,7 @@ list<Tensors> GetElement(iterator_id);
 void ProcessDataset(int dataset_id, int iteration_id, list<int> iterator_ids);
 ```
 
-#### Visitation Guarantee
+#### Visitation Guarantees
 
 When iterating over a deterministic dataset, the tf.data service will process
 all input data exactly once, even in the presence of master or worker failures.
@@ -406,10 +416,9 @@ service will provide determinism.
 To get deterministic behavior, the tf.data service will require three things:
 
 1. The dataset being distributed has deterministic output.
-1. The user sets `deterministic=True` when calling
-`tf.data.experimental.service.create_iteration`.
-1. The user specifies how many input tasks to use when calling
-`tf.data.experimental.service.create_iteration`.
+1. The user sets `num_consumers`, `num_tasks`, and `deterministic=True` when
+calling `tf.data.experimental.service.create_iteration`.
+1. Each consumer uses a unique `consumer_index` when calling `make_iterator`.
 1. The consumers do not fail.
 
 In the absence of failures, determinism is achieved by distributing splits
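
The five steps added in the first hunk can be illustrated with a small in-process model. This is only a sketch under stated assumptions, not the tf.data service itself. `toy_create_iteration` and `toy_make_iterator` are hypothetical stand-ins for the proposed `create_iteration` and `make_iterator`, the `distribute` step is omitted because nothing here runs remotely, and the round-robin assignment of elements to consumers is an assumption (the RFC text only says that each consumer receives a partition of the original dataset).

```python
import itertools
import uuid

# Toy in-process stand-in for the service's bookkeeping. Hypothetical: the
# RFC does not specify this structure.
_ITERATIONS = {}


def toy_create_iteration(elements, num_consumers):
    """Registers one pass over `elements` and returns an opaque iteration id."""
    if num_consumers < 1:
        raise ValueError("num_consumers must be at least 1")
    # The RFC describes the iteration object as a byte array shared by consumers.
    iteration_id = uuid.uuid4().bytes
    _ITERATIONS[iteration_id] = (list(elements), num_consumers)
    return iteration_id


def toy_make_iterator(iteration_id, consumer_index=0):
    """Returns an iterator over the partition owned by `consumer_index`."""
    elements, num_consumers = _ITERATIONS[iteration_id]
    if not 0 <= consumer_index < num_consumers:
        raise ValueError("consumer_index out of range")
    # Assumption: round-robin assignment of elements to consumers.
    return itertools.islice(elements, consumer_index, None, num_consumers)


# Steps 1 and 3: build a "dataset" and begin an iteration with three consumers.
iteration_id = toy_create_iteration(range(10), num_consumers=3)

# Steps 4 and 5: every consumer gets the same id and uses a unique index.
partitions = [list(toy_make_iterator(iteration_id, i)) for i in range(3)]
print(partitions)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

In this toy model, fixing `num_consumers` and giving each consumer a unique `consumer_index` makes every consumer's sequence reproducible across runs, which mirrors the determinism requirements listed in the last hunk. Reusing an index would read the same partition twice and leave another unread.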
