The `Enable Auto-Mixed Precision for Transfer Learning with TensorFlow*` sample guides you through the process of enabling auto-mixed precision to use low-precision datatypes, like bfloat16, for transfer learning with TensorFlow* (TF).

-The sample demonstrates the end-to-end pipeline tasks typically performed in a deep learning use-case: training (and retraining), inference optimization, and serving the model with TensorFlow Serving.
+The sample demonstrates the tasks typically performed in a deep learning use case: training (and retraining) and inference optimization. It also includes tips and boilerplate code for serving the model with TensorFlow Serving.

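+As a taste of what enabling auto-mixed precision looks like, here is a minimal sketch (illustrative only, not code from the sample) using the Keras mixed-precision API, one common way to request bfloat16 compute in TensorFlow:
+
+```
+import tensorflow as tf
+
+# Run layer computations in bfloat16 while keeping variables in float32.
+tf.keras.mixed_precision.set_global_policy('mixed_bfloat16')
+
+# Layers created after this point pick up the policy automatically.
+layer = tf.keras.layers.Dense(10)
+print(layer.compute_dtype)  # bfloat16
+print(layer.dtype)          # float32 (variable dtype)
+```
+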
| Area | Description
|:--- |:---
@@ -37,10 +37,6 @@ You will need to download and install the following toolkits, tools, and compone

  Install using PIP: `$pip install notebook`. <br> Alternatively, see [*Installing Jupyter*](https://jupyter.org/install) for detailed installation instructions.

-- **TensorFlow Serving**
-
-  See *TensorFlow Serving* [*Installation*](https://www.tensorflow.org/tfx/serving/setup) for detailed installation options.
-
- **Other dependencies**

  Install using PIP and the `requirements.txt` file supplied with the sample: `$pip install -r requirements.txt --no-deps`. <br> The `requirements.txt` file contains the necessary dependencies to run the Notebook.
@@ -112,6 +108,70 @@ You will see diagrams comparing performance and analysis. This includes performa

For performance analysis, you will see histograms showing the different TensorFlow* operations in the analyzed pre-trained model `.pb` file.

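+To reproduce such a histogram outside the notebook, you can count the op types in a frozen graph directly. Below is a minimal sketch (not part of the sample); the `.pb` path is hypothetical:
+
+```
+import collections
+import tensorflow as tf
+
+# Parse the frozen graph (hypothetical path: substitute your own .pb file).
+graph_def = tf.compat.v1.GraphDef()
+with tf.io.gfile.GFile("models/my_optimized_model/frozen_graph.pb", "rb") as f:
+    graph_def.ParseFromString(f.read())
+
+# Count how often each operation type appears in the graph.
+op_counts = collections.Counter(node.op for node in graph_def.node)
+for op, count in op_counts.most_common(10):
+    print(op, count)
+```
+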
+## Serve the model with TensorFlow Serving
+
+### Installation
+
+See *TensorFlow Serving* [*Installation*](https://www.tensorflow.org/tfx/serving/setup) for detailed installation options.
+
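+For example, on Debian or Ubuntu the `tensorflow_model_server` binary used below is available through the `tensorflow-model-server` APT package; you must first add the TensorFlow Serving package source, as described at the link above:
+
+```
+sudo apt-get update && sudo apt-get install tensorflow-model-server
+```
+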
+### Example Code
+
+Create a copy of the optimized model in a well-defined directory hierarchy with version number "1"; TensorFlow Serving expects each model version to live in its own numeric subdirectory.
+
+```
+!mkdir serving
+!cp -r models/my_optimized_model serving/1
+```
+
+```
+import os
+
+# Point TensorFlow Serving at the directory created above.
+os.environ["MODEL_DIR"] = os.getcwd() + "/serving"
+```
+
+Now we start running TensorFlow Serving and load our model. After it loads, we can start making inference requests using REST. There are some important parameters:
+- **rest_api_port**: The port that you'll use for REST requests.
+- **model_name**: You'll use this in the URL of REST requests. It can be anything.
+- **model_base_path**: This is the path to the directory where you've saved your model.
+
+```
+%%bash --bg
+nohup tensorflow_model_server --rest_api_port=8501 --model_name=rn50 --model_base_path=${MODEL_DIR} > server.log 2>&1
+```
+
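+Because `--bg` runs the server in the background, it can be useful to confirm it started before sending requests; one quick check is to look at the log file the command above writes:
+
+```
+!tail server.log
+```
+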
+#### Prepare the testing data for prediction
+
+```
+# Grab a single batch of images and labels from the validation dataset.
+for image_batch, labels_batch in val_ds:
+    print(image_batch.shape)
+    print(labels_batch.shape)
+    break
+test_data, test_labels = image_batch.numpy(), labels_batch.numpy()
+```
+
+#### Make REST requests
+
+Now let's create the JSON object for a batch of three inference requests, then send a predict request as a POST to our server's REST endpoint.
+
+```
+import json
+
+import matplotlib.pyplot as plt
+import numpy as np
+import requests
+
+# Display a test image together with a caption.
+def show(idx, title):
+    plt.figure()
+    plt.imshow(test_data[idx])
+    plt.axis('off')
+    plt.title('\n\n{}'.format(title), fontdict={'size': 16})
+
+# Build the request body: three examples under the default serving signature.
+data = json.dumps({"signature_name": "serving_default", "instances": test_data[0:3].tolist()})
+print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
+
+# POST the examples to the REST endpoint and read back the predictions.
+headers = {"content-type": "application/json"}
+json_response = requests.post('http://localhost:8501/v1/models/rn50:predict', data=data, headers=headers)
+predictions = json.loads(json_response.text)['predictions']
+
+for i in range(0, 3):
+    show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
+        class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))
+```
+
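+When you are finished, stop the background model server; one way to do this (assuming no other TensorFlow Serving instances are running on the machine):
+
+```
+!pkill -f tensorflow_model_server
+```
+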
## License

Code samples are licensed under the MIT license. See