This repository was archived by the owner on Jul 10, 2025. It is now read-only.

Commit 474d984

Commit message: Update: 2019-11-13 12:10 pm
1 parent b83d545 commit 474d984

File tree: 1 file changed (+14 −9 lines)


rfcs/20191106-tf2-tpu-savedmodel.md

Lines changed: 14 additions & 9 deletions
```diff
@@ -2,10 +2,10 @@

 | Status        | Proposed |
 | :------------ | :------------------------------------------------------ |
-| **RFC #**     | [171](https://github.com/tensorflow/community/pull/171) |
+| **RFC #**     | [NNN](https://github.com/tensorflow/community/pull/NNN) |
 : : (update when you have community PR #) :
-| **Author(s)** | Zhuoran Liu (lzr@google.com), Youlong Cheng (ylc@google.com) |
-| **Sponsor**   | Jonathan Hseu ([email protected]) |
+| **Author(s)** | ylc@google.com, lzr@google.com |
+| **Sponsor**   | [email protected] |
 | **Updated**   | 2019-11-06 |

 ## Objective
```
```diff
@@ -46,16 +46,16 @@ Some major differences between CPU and TPU Graph:
 VarHandleOp, and consumed by ReadVariableOp.

 Also for reducing the number of TPU compilation, serving platforms(For example,
-[TensorFlow Serving](https://www.tensorflow.org/tfx/guide/serving)) prefers batching the inference requests with a few allowed batch
+Servomatic) prefers batching the inference requests with a few allowed batch
 sizes. This requires wrapping TPUPartitionedCall in another function, and called
 by BatchFunction.

 Below is an intuitive example of how a TPU graph is different from a CPU one:

-![Original CPU Graph](20191106-tf2-tpu-savedmodel/cpu_graph.png)
+![Original CPU Graph](https://cs.corp.google.com/codesearch/f/piper///depot/google3/experimental/users/lzr/tf2-tpu-rfcs/tf2-tpu-savedmodel/cpu_graph.png)
 <center>Original CPU Graph.</center>

-![TPU Graph](20191106-tf2-tpu-savedmodel/tpu_graph.png)
+![TPU Graph](https://cs.corp.google.com/codesearch/f/piper///depot/google3/experimental/users/lzr/tf2-tpu-rfcs/tf2-tpu-savedmodel/tpu_graph.png)
 <center>TPU Graph.</center>

 ### User Control of Device Placement
```
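The batching constraint mentioned in this hunk, serving with only a few allowed batch sizes to limit TPU recompilation, can be illustrated with a small plain-Python sketch. This is illustrative only: `ALLOWED_BATCH_SIZES` and the padding helpers below are hypothetical names, not the serving platform's or TensorFlow's actual API.

```python
# Sketch: why a serving platform batches to a few allowed sizes.
# Each distinct batch size reaching the TPU triggers a fresh program
# compilation, so incoming requests are padded up to the nearest
# allowed size and the TPU only ever sees a handful of shapes.

import bisect

# Hypothetical configuration value, sorted ascending.
ALLOWED_BATCH_SIZES = [1, 8, 16, 32]

def padded_batch_size(num_requests: int) -> int:
    """Round num_requests up to the nearest allowed batch size."""
    if num_requests > ALLOWED_BATCH_SIZES[-1]:
        raise ValueError("batch exceeds the largest allowed size")
    idx = bisect.bisect_left(ALLOWED_BATCH_SIZES, num_requests)
    return ALLOWED_BATCH_SIZES[idx]

def pad_requests(requests: list) -> list:
    """Pad a list of requests (with zeros) to an allowed batch size."""
    target = padded_batch_size(len(requests))
    return requests + [0] * (target - len(requests))
```

With four allowed sizes, at most four TPU programs are ever compiled, regardless of how request counts vary at runtime.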
```diff
@@ -66,7 +66,7 @@ for every use case. For example even though dense embedding ops are allowed on
 TPU, serving models might still want to run embedding lookups on CPU because the
 embeddings are too big to fit on TPU.

-![Customized Embeddings](20191106-tf2-tpu-savedmodel/customized_embeddings.png)
+![Customized Embeddings](https://cs.corp.google.com/codesearch/f/piper///depot/google3/experimental/users/lzr/tf2-tpu-rfcs/tf2-tpu-savedmodel/customized_embeddings.png)
 <center>Example of user control. In this graph, both ‘custom_embedding’ and
 ‘dense’ can run on TPU. But users want ‘custom_embedding’ to run on CPU for
 whatever reason, e.g. CPU computations can be parallelized, users don’t have
```
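The placement rule this hunk describes, where a user override beats the default "runs on TPU if supported" policy, can be sketched in plain Python. This is a conceptual illustration only, not TensorFlow's actual placer; `TPU_SUPPORTED` and `place_ops` are hypothetical names.

```python
# Sketch of user-controlled device placement: ops the TPU supports
# default to TPU, but an explicit user override always wins, e.g.
# pinning 'custom_embedding' to CPU even though TPU could run it.

# Hypothetical set of op names the TPU backend supports.
TPU_SUPPORTED = {"custom_embedding", "dense"}

def place_ops(ops: list, user_overrides: dict) -> dict:
    """Return an op -> device mapping, honoring user overrides first."""
    placement = {}
    for op in ops:
        if op in user_overrides:
            placement[op] = user_overrides[op]   # user choice wins
        elif op in TPU_SUPPORTED:
            placement[op] = "TPU"                # default: use TPU
        else:
            placement[op] = "CPU"                # unsupported op
    return placement
```

Here `place_ops(["custom_embedding", "dense"], {"custom_embedding": "CPU"})` keeps the embedding lookup on CPU while `dense` still lands on TPU, matching the figure's scenario.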
```diff
@@ -75,7 +75,8 @@ SavedModel that only ‘dense’ is to run on TPU.</center>

 ## User Benefit

-Enable TPU Inference.
+<!-- TODO(lzr) How will users (or other contributors) benefit from this work? What would be the
+headline in the release notes or blog post? -->

 ## Design Proposal

```
```diff
@@ -127,7 +128,7 @@ Users need to do the following things to export a TPU SavedModel in TF2.x:

 The resulting TPU inference graph looks like this:

-![Resulting TPU Graph](20191106-tf2-tpu-savedmodel/tpu_result.png)
+![Resulting TPU Graph](https://cs.corp.google.com/codesearch/f/piper///depot/google3/experimental/users/lzr/tf2-tpu-rfcs/tf2-tpu-savedmodel/tpu_result.png)
 <center>Resulting TPU Graph.</center>

 <b>For Advanced Users who need customized Ops</b>
```
````diff
@@ -331,3 +332,7 @@ def save_model(model,
 tags,
 options)
 ```
+
+## Questions and Discussion Topics
+
+<!-- TODO(lzr): Seed this with open questions you require feedback on from the RFC process. -->
````
