This repository was archived by the owner on Jul 10, 2025. It is now read-only.
@@ -16,7 +16,7 @@ There are several mission critical applications in medicine, finance and automat
The lack of determinism in certain ops prevents companies from launching products using models developed in TF. For a subset of these industries having deterministic behavior is a regulatory requirement.
-In addition, deterministic ops increases model velocity development by reducing noise, while also simplifying the debugging workflow.
+In addition, deterministic functionality, enabled by deterministic ops, increases model velocity development by reducing noise, while also simplifying the debugging workflow.
## Design Proposal
@@ -27,7 +27,7 @@ We will create a new flag with the default value of "False" which enables determ
The first function takes in a boolean value, and allows the model developer to enable/disable deterministic ops. The second function returns a bool indicating whether deterministic ops is enabled.
-Once enabled, every built-in op will either be made deterministic or raise an error if determinism is not supported. For ops which we have not yet implemented a deterministic version, a `NotImplementedError` will be raised. In the long term, we plan on adding a deterministic version to all such ops. For ops which are inherently nondeterministic such as `tf.random.normal` without a seed, a `FailedPreconditionError` will be raised (the precondition being that determinism must be disabled). Certain ops will only raise an error for certain input shapesor attributes. Depending on the op, in graph mode, the error will either be raised when the op is constructed or when the op is run.
+Once enabled, every built-in op will either be made deterministic or raise an error if determinism is not supported. A `tf.errors.UnimplementedError` will be raised by ops for which we have not yet implemented a deterministic version. In the long term, we plan on adding a deterministic version to all such ops. For ops which are inherently nondeterministic such as `tf.random.normal` without a seed, a `tf.errors.FailedPreconditionError` will be raised (the precondition being that determinism must be disabled). Some ops will only raise an error on a subset of input shapes, attributes, data types, or codepaths through the op. Depending on the op, in graph mode, the error will either be raised when the op is constructed or when the op is run.
By "deterministic", we mean that if an op is run multiple times with the same inputs and attributes, it produces the same outputs. The op must be run with the same hardware configuration on the same device each time. The software environment must be the same every run as well (OS, TF and CUDA version, environmental variables, etc). For stateful ops, the all relevant state must be identical each run (values of `tf.Variable`s, checkpoints, etc).
@@ -38,8 +38,8 @@ This API only makes ops deterministic, not other parts of TensorFlow. For exampl
The API allows users to write deterministic models. To do so, users must:
* Enable deterministic ops with `tf.config.enable_deterministic_ops`.
-* Use same hardware configuration in every run.
-* Use the same software environment every run (OS, checkpoints, version of CUDA and TF, environmental variables, etc).
+* Use the same hardware configuration in every run.
+* Use the same software environment in every run (OS, checkpoints, version of CUDA and TF, environmental variables, etc).
* Not use nondeterministic parts of TensorFlow (besides ops), such as `ParameterServerStrategy`.
* Not use constructs outside TensorFlow that are nondeterministic, such as Python’s `random` module (without a fixed seed) or using multiple threads/processes in ways that influence TensorFlow’s behavior.
* Not use nondeterministic custom ops.
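
The point about Python's `random` module can be illustrated without TensorFlow. The sketch below is only an illustration, not part of the proposal: the helper name `sample_noise` and the seed value are assumptions. It shows that a generator seeded with a fixed value reproduces the same sequence on every call, which is the property a model script needs from any randomness it draws outside TensorFlow.

```python
import random


def sample_noise(seed, n=5):
    """Return n pseudo-random floats from a privately seeded generator.

    Hypothetical helper for illustration. A dedicated random.Random
    instance avoids depending on (or disturbing) the module-level
    generator that other code may also be using.
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first = sample_noise(seed=1234)
second = sample_noise(seed=1234)

# Same seed, same sequence: the two lists compare equal.
print(first == second)
```

Calling `random.random()` without seeding would instead draw from a generator seeded from OS entropy or the clock, so values fed into the model would differ between runs even with deterministic ops enabled.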
@@ -77,7 +77,7 @@ It is also possible Grappler is nondeterministic due to nondeterministic iterati
### Random ops
-Legacy random ops, such as `tf.random.normal`, are not deterministic if no seed is set, and so such ops will raise a `FailedPreconditionError` when determinism is enabled. To fix, the user should set a global seed with `tf.random.set_seed`. Since most models use legacy random ops (for variable initialization and various other uses), in practice users must call `tf.random.set_seed` when enabling deterministic ops. Alternatively, users can pass a seed to every individual random operation, but doing so is more inconvenient.
+Legacy random ops, such as `tf.random.normal`, are not deterministic if no seed is set, and so such ops will raise a `tf.errors.FailedPreconditionError` when determinism is enabled. To fix, the user should set a global seed with `tf.random.set_seed`. Since most models use legacy random ops (for variable initialization and various other uses), in practice users must call `tf.random.set_seed` when enabling deterministic ops. Alternatively, users can pass a seed to every individual random operation, but doing so is more inconvenient.
Certain random ops, such as `tf.image.sample_distorted_bounding_box` and `tf.nn.fractional_max_pool`, ignore the global seed if a seed is not explicitly passed. For such ops, setting the global seed is not enough to avoid the error, so users must pass a seed directly to the op.
@@ -95,7 +95,7 @@ We must ensure that every op will either run deterministically or raise an error
2. We will add a special mode to TensorFlow where every time a non-stateful op is run, TensorFlow will rerun the op several times and assert the outputs are the same each time. We will then run the TensorFlow unit tests with this mode as part of the nightly tests. Doing so ensures that for each op that is run as part of a unit test, it will be tested for determinism.
-3. When adding determinism to an op which previously was nondeterministic, an explicit unit test will be added that checks for determinism. This is slightly redundant with the special mode described above, but the explicit unit test can be part of the presubmit tests instead of the nightly tests, and can test on inputs that are very likely to demonstrate nondeterminism if it exists.
+3. When adding determinism to an op which previously was nondeterministic, an explicit unit test will be added that checks for determinism. Unlike running unit tests with the special mode above, the explicit unit tests can be part of the presubmit tests instead of the nightly tests, and can test on inputs that are very likely to demonstrate nondeterminism if it exists.
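
The rerun-and-compare idea behind items 2 and 3 can be sketched in plain Python. This is a minimal model under stated assumptions, not TensorFlow's test infrastructure: `assert_deterministic`, `sequential_sum`, and `make_order_dependent_sum` are hypothetical names. The simulated "op" alternates its reduction order between calls, which loosely mimics how the accumulation order of a parallel floating-point reduction can vary between runs. The input is chosen so that the order changes the result.

```python
import itertools


def assert_deterministic(fn, *args, runs=5, **kwargs):
    """Call fn `runs` times with identical arguments and compare outputs.

    Returns the first output; raises AssertionError on the first mismatch.
    """
    reference = fn(*args, **kwargs)
    for i in range(1, runs):
        result = fn(*args, **kwargs)
        if result != reference:
            raise AssertionError(
                f"run {i} differed from run 0: {result!r} != {reference!r}"
            )
    return reference


def sequential_sum(values):
    """Left-to-right float accumulation: the same order on every call."""
    total = 0.0
    for v in values:
        total += v
    return total


def make_order_dependent_sum():
    """Build a simulated op that reverses its reduction order every other call."""
    reverse = itertools.cycle([False, True])

    def op(values):
        ordered = reversed(values) if next(reverse) else values
        total = 0.0
        for v in ordered:
            total += v
        return total

    return op


# Float addition is not associative, so this input is order-sensitive:
# forward order gives 0.0, reverse order gives 1.0.
values = [1.0, 1e100, -1e100]

assert_deterministic(sequential_sum, values)  # passes silently

try:
    assert_deterministic(make_order_dependent_sum(), values)
except AssertionError as err:
    print("nondeterminism detected:", err)
```

An input like `values` is the kind item 3 calls "very likely to demonstrate nondeterminism if it exists": values that sum identically in any order would let an order-dependent op pass unnoticed.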