This repository was archived by the owner on Jul 10, 2025. It is now read-only.
File: rfcs/20200712-tfrt-kernel-fallback.md
Lines changed: 13 additions & 17 deletions
@@ -14,8 +14,7 @@ mobile devices by removing the need to execute them via the TensorFlow eager run
 calling kernels directly from the new [TFRT](https://github.com/tensorflow/runtime) TensorFlow runtime.
 
 Note that there is an effort to call existing kernels by delegating to
-TensorFlow eager runtime instead. This approach is called Runtime Fallback and
-corresponding RFC will be published soon. The goals of the two fallback
+TensorFlow eager runtime instead. This approach is called Runtime Fallback. The goals of the two fallback
 mechanisms are as follows:
 
 * Runtime Fallback aims to reuse all current TensorFlow kernels in TFRT.
@@ -39,11 +38,13 @@ calls TensorFlow kernels without going through Eager runtime first. We plan to
 address the second high level goal by trimming down dependencies, switching to
 more compact proto representation, etc.
 
+Note that TensorFlow's current mobile solution is called [TensorFlow Lite](https://www.tensorflow.org/lite). At the same time, there is a work-in-progress effort to enable [TFRT](https://github.com/tensorflow/runtime) to run on mobile. This document focuses on the way TFRT would call kernels when running on mobile devices. Details of the way TFRT itself would be executed on mobile platforms are outside of the scope of this document.
+
+
 ### Op Coverage Goals
 
 First of all, we plan to target all the easier-to-support ops that don’t require
-implementing extensive pieces of infrastructure, but at the same time provide
-the most value to the TF Lite team.
+implementing extensive pieces of infrastructure.
 
 We analysed how many kernels we can support in the future and include our
 findings in the following spreadsheets. As we describe in
@@ -90,8 +91,7 @@ custom
 extra effort required.
 * Gradients would not be supported by the first iteration of Kernel Fallback,
 but we might revisit it later.
-* Exact details of TFRT integration are still being worked out by TFRT and TF
-Lite teams. Since these teams might change the plan, exact details are not a
+* Exact details of TFRT integration are still being worked out by TFRT and TensorFlow mobile teams. Since these teams might change the plan, exact details are not a
 part of this doc. The take away is that we will integrate kernel fallback
 following the approach they decide on.
 
@@ -104,7 +104,7 @@ pool of available ops on mobile devices, ideally supporting everything that full
 TensorFlow supports now.
 
 However, supporting TensorFlow ops on mobile devices presents some challenges.
-Specifically, binary size on mobile platforms should be restricted. TF Lite team
+Specifically, binary size on mobile platforms should be restricted. TensorFlow mobile team
 provided us with the following *ideal* numbers:
 
 * 100-200k overhead to call TF kernels
@@ -247,12 +247,11 @@ and templating approaches. Key findings are summarized below:
 estimated at 2.6% (based on adding `AddN` op).
 
 Right now, we are leaning towards using inheritance. Seems like time increase is
-only significant for running many scalar ops in a sequence - probably a rare use
-case in the real world. (See more details in [Appendix 2](#appendix-2-extension-options))
+only not significant. (See more details in [Appendix 2](#appendix-2-extension-options))
 
 To use inheritance, we will define `OpKernelConstructionInterface` and
 `OpKernelContextInterface` interfaces. Ideally, these interfaces should be pure
-virtual. However, we will have one exception - templated `eigen_device` method
+virtual. However, we will have some exception - for e.g. templated `eigen_device` method
 that calls per-device pure-virtual implementations.
 
 We will then introduce `TFRTOpKernelConstruction` and `TFRTOpKernelContext`
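
Editorial illustration (not part of the diff): the inheritance approach described in the hunk above can be sketched roughly as follows. C++ member templates cannot be virtual, which is presumably why `eigen_device` is the stated exception to a purely virtual interface. Only the names `OpKernelContextInterface`, `TFRTOpKernelContext`, and `eigen_device` come from the RFC text. Everything else here is a hypothetical stand-in: `CpuDevice`, `GpuDevice`, `eigen_cpu_device`, `eigen_gpu_device`, `num_inputs`, and `DeviceNameForKernel` are assumptions made for illustration and do not reflect the actual TensorFlow or TFRT declarations.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the real Eigen device types.
struct CpuDevice { std::string name = "cpu"; };
struct GpuDevice { std::string name = "gpu"; };

// Interface that kernels are written against. All methods are pure virtual
// except the templated eigen_device(), which cannot be virtual in C++.
class OpKernelContextInterface {
 public:
  virtual ~OpKernelContextInterface() = default;

  // Assumed example of an ordinary pure-virtual accessor.
  virtual int num_inputs() const = 0;

  // Non-virtual template: specialized below to forward to the matching
  // per-device pure-virtual method.
  template <typename Device>
  const Device& eigen_device() const;

 protected:
  // Per-device pure-virtual implementations (names are assumptions).
  virtual const CpuDevice& eigen_cpu_device() const = 0;
  virtual const GpuDevice& eigen_gpu_device() const = 0;
};

template <>
inline const CpuDevice& OpKernelContextInterface::eigen_device<CpuDevice>() const {
  return eigen_cpu_device();
}

template <>
inline const GpuDevice& OpKernelContextInterface::eigen_device<GpuDevice>() const {
  return eigen_gpu_device();
}

// TFRT-side subclass implementing the interface.
class TFRTOpKernelContext : public OpKernelContextInterface {
 public:
  explicit TFRTOpKernelContext(int num_inputs) : num_inputs_(num_inputs) {}
  int num_inputs() const override { return num_inputs_; }

 protected:
  const CpuDevice& eigen_cpu_device() const override { return cpu_; }
  const GpuDevice& eigen_gpu_device() const override { return gpu_; }

 private:
  int num_inputs_;
  CpuDevice cpu_;
  GpuDevice gpu_;
};

// Kernel code sees only the interface, so the same kernel body could run
// under either runtime's context implementation.
inline std::string DeviceNameForKernel(const OpKernelContextInterface& ctx) {
  return ctx.eigen_device<CpuDevice>().name;
}
```

In this sketch a kernel that calls `ctx.eigen_device<CpuDevice>()` pays one virtual call per device lookup, which is the kind of overhead the inheritance-versus-templating comparison in the RFC is weighing.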
@@ -261,7 +260,7 @@ subclasses that implement `OpKernelConstructionInterface` and