# Concurrency Semantics For TensorFlow Resource Variables

| Status        | Implemented                                           |
:-------------- |:----------------------------------------------------- |
| **Author(s)** | Sanjoy Das ([email protected])                         |
| **Sponsor**   | Alexandre Passos ([email protected])                   |
| **Updated**   | 2019-06-10                                             |

This document aims to assign semantics to _concurrent accesses_ to resource variables in TensorFlow. _Concurrent accesses_ refer to resource variable loads and stores that are not totally ordered by control or data dependencies.

The semantics presented in this doc will probably be unsurprising to people familiar with TensorFlow. However, a concrete, written-down semantics is a useful foundation for writing transformations and analyses over TensorFlow graphs.

We do not discuss ref variables, which are deprecated.

## Semantics

We define an _operation instance_ as a `(node, frame)` tuple, where `node` is a node in the TensorFlow graph and `frame` is a frame (see [Dynamic control flow in large-scale machine learning](https://dl.acm.org/citation.cfm?id=3190551)) in which the node was executed.

The operation instances in an execution of a graph are put in a total order such that:

1. The total order is consistent with the partial order imposed by the control and data dependencies within the graph. That is, if `(n1, f1)` depends on `(n0, f0)`, then `(n1, f1)` is ordered after `(n0, f0)` in the total order.
1. The total order respects critical sections. That is, if two nodes, `n0` and `n1`, are in the same critical section, then the following total order is illegal: [(`n0`, `f0`), (`n0`, `f1`), (`n1`, `f0`), (`n1`, `f1`)]. On the other hand, the following total order is fine: [(`n0`, `f0`), (`n1`, `f0`), (`n0`, `f1`), (`n1`, `f1`)].
1. If there are `N` graphs running in the same [TensorFlow session](https://www.tensorflow.org/api_docs/python/tf/Session), with source nodes `sub_source_i` and sink nodes `sub_sink_i` (with `i` in `[0, N)`), then the total order is consistent with a “supergraph” that has a source node with control edges to all `sub_source_i` and a sink node with control edges from all `sub_sink_i`.
1. The total order is consistent with the blocking semantics of TensorFlow ops. For example, if an instance of a `QueueDequeueV2` can complete only after a matching `QueueEnqueueV2` has completed, then the total order cannot put the `QueueDequeueV2` instance before the `QueueEnqueueV2` instance.

Resource variable reads and writes follow sequential semantics as if they were executed in this total order. The writes and reads themselves are atomic -- a read never sees a partial write, even when it is racing with the write.
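
As a concrete illustration of this atomicity, the sketch below (not part of the original doc; the variable shape, values, and iteration counts are arbitrary) races a writer that stores whole tensors against a reader, and checks that every observed snapshot comes entirely from a single store:

```python
import threading
import tensorflow as tf

v = tf.Variable(tf.zeros([1024]))

@tf.function
def store(val):
  v.assign(tf.fill([1024], val))  # one atomic store of the whole tensor

@tf.function
def load():
  return v.read_value()           # one atomic load of the whole tensor

def writer():
  for i in range(1000):
    store(tf.constant(1.0 if i % 2 == 0 else 2.0))

def reader():
  for _ in range(1000):
    snapshot = load().numpy()
    # Atomicity: every element comes from the same store; a snapshot is
    # never a mix of values from two different stores.
    assert len(set(snapshot.tolist())) == 1

t = threading.Thread(target=writer)
t.start()
reader()
t.join()
```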

This total order is not fixed from run to run; it can vary depending on external factors such as operating system thread scheduling, hardware behavior, etc.

Operations that read and write resource variables at the same time, like `ResourceApplyGradientDescent`, are modeled by decomposing them into a read, compute, write sequence that does not execute atomically as a unit. This behavior is weaker than what the executor implements, and this weakening simplifies XLA’s auto-clustering.
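
For example, under this model a `ResourceApplyGradientDescent(var, lr, grad)` node is treated roughly like the following three-step graph. This is a sketch of the model used in this doc, not the kernel's actual implementation; the names below are illustrative:

```python
import tensorflow as tf

var = tf.Variable([1.0, 2.0])
lr = tf.constant(0.1)
grad = tf.constant([0.5, 0.5])

@tf.function
def apply_gradient_descent_modeled():
  old = var.read_value()   # read
  new = old - lr * grad    # compute
  var.assign(new)          # write
  # Under the semantics in this doc, loads and stores of `var` from other,
  # concurrently running parts of the graph may be interleaved between
  # any of these three steps.

apply_gradient_descent_modeled()
```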


## Examples & Consequences

### Simplest Non-Trivial Example

The graph above does two resource writes and two resource reads. Assuming that `X` and `Y` are `0` to start with, control dependencies in the graph guarantee that _if_ `r0` is `2` _then_ `r1` is `1`. That is, the result `r0 == 2 && r1 == 0` is disallowed.

Note that this means stores that are constrained to run in a specific order via control or data dependencies cannot be reordered.
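
Since the original figure is not reproduced in this text, the sketch below shows one graph consistent with the description: two stores ordered by a control edge, racing against two loads ordered by a control edge. The use of `tf.function` and explicit `tf.control_dependencies` is just one way to build such a graph:

```python
import threading
import tensorflow as tf

X = tf.Variable(0)
Y = tf.Variable(0)

@tf.function
def writes():
  store_y = Y.assign(1)                      # *Y = 1
  with tf.control_dependencies([store_y]):
    X.assign(2)                              # *X = 2, ordered after the store to Y

@tf.function
def reads():
  r0 = X.read_value()                        # r0 = *X
  with tf.control_dependencies([r0]):
    r1 = Y.read_value()                      # r1 = *Y, ordered after the load of X
  return r0, r1

t = threading.Thread(target=writes)
t.start()
r0, r1 = reads()
t.join()

# Allowed results: (0, 0), (0, 1), (2, 1).  Disallowed: r0 == 2 and r1 == 0.
print(int(r0), int(r1))
```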

### Resource Update Operations

Resource update operations, like `ResourceApplyGradientDescent`, are modeled as a read, compute, write sequence. For simplicity, we invent a resource update operation, `*X+=1`, that increments the tensor stored in the resource variable `X` by `1`. With this framing, two concurrent `*X+=1` operations can be modeled as the graph below. All total orders that are consistent with the partial order in the graph are valid.
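
A sketch of that decomposed graph (illustrative only) follows; under the semantics in this document any interleaving of the two read/compute/write chains is allowed, so both `1` and `2` are legal final values for `*X`:

```python
import threading
import tensorflow as tf

X = tf.Variable(0)

@tf.function
def increment_decomposed():
  old = X.read_value()   # read
  new = old + 1          # compute
  X.assign(new)          # write

threads = [threading.Thread(target=increment_decomposed) for _ in range(2)]
for t in threads:
  t.start()
for t in threads:
  t.join()

print(X.numpy())  # May print 1 or 2: the two chains may interleave.
```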

This behavior is weaker than the memory model actually implemented by the graph executor. Since `ResourceApplyGradientDescent` et al. [lock out](https://github.com/tensorflow/tensorflow/blob/a53365719e445edc5b48f0877f1d85b8d5837384/tensorflow/core/kernels/training_ops.cc#L556) the resource variable being updated for the entire duration of their execution, the graph executor guarantees that the two increments happen atomically and that `*X` is always `2` after the graph execution.
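
The snippet below (again just a sketch) shows the stronger behavior one actually observes from a fused update: `assign_add` lowers to a single `AssignAddVariableOp`, which performs its update while holding the variable's lock, so the race above disappears:

```python
import threading
import tensorflow as tf

X = tf.Variable(0)

@tf.function
def increment_fused():
  # The fused kernel does its read-modify-write under the variable's lock,
  # so the whole increment is atomic.
  X.assign_add(1)

threads = [threading.Thread(target=increment_fused) for _ in range(2)]
for t in threads:
  t.start()
for t in threads:
  t.join()

print(X.numpy())  # Always 2 with the fused update.
```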

### Load/Store Reordering

Just like two stores ordered via control dependence cannot be reordered, a load and a store ordered via control edges cannot be reordered either. For instance, given the semantics in this document, if `X` is `1` when the execution of the graph above ends, then `r0` must be `5`. This invariant breaks if we allow `r0 = *Y` to happen before `*X = 1`.
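
The original figure is not reproduced here; the sketch below is one graph consistent with the prose (the concrete constants, and the second chain's store of `0` back into `X`, are illustrative guesses). If `X` ends up as `1`, the store `*X = 1` must have followed the other chain's store to `X`, and therefore also followed `*Y = 5`; since `r0 = *Y` is constrained to run after `*X = 1`, it must observe `5`:

```python
import threading
import tensorflow as tf

X = tf.Variable(0)
Y = tf.Variable(0)

@tf.function
def store_then_load():
  store_x = X.assign(1)                      # *X = 1
  with tf.control_dependencies([store_x]):
    r0 = Y.read_value()                      # r0 = *Y, ordered after the store to X
  return r0

@tf.function
def other_chain():
  store_y = Y.assign(5)                      # *Y = 5
  with tf.control_dependencies([store_y]):
    X.assign(0)                              # another store to X, ordered after *Y = 5

t = threading.Thread(target=other_chain)
t.start()
r0 = store_then_load()
t.join()

if X.numpy() == 1:      # *X = 1 was the last store to X ...
  assert int(r0) == 5   # ... so the load of Y must have observed *Y = 5.
```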

This invariant is relevant for XLA; it means that wrapping an arbitrary TensorFlow graph in an `_XlaCompile` / `_XlaRun` sequence or an `XlaLaunch` op is generally incorrect -- these XLA ops hoist all the resource variable reads to the beginning of the computation and sink all the writes to the end of the computation, which may violate the load/store ordering present in the TensorFlow graph.