66### Cloud TPU
77
88**TPU Type:** v2.8
9- **Tensorflow Version:** Nightly
9+ **Tensorflow Version:** 1.14
1010
1111### Cloud VM
1212
1717Launching Instance and VM
1818---------------------------
1919- Open Google Cloud Shell
20- - `ctpu up -tf-version nightly`
20+ - `ctpu up -tf-version 1.14`
2121- If cloud bucket is not setup automatically, create a cloud storage bucket
2222with the same name as TPU and the VM
2323- enable HTTP traffic for the VM instance
@@ -26,35 +26,6 @@ with the same name as TPU and the VM
2626 - `pip3 install -r requirements.txt`
2727 - `export CTPU_NAME=<common name of the tpu, vm and bucket>`
2828
29- Changing Tensorflow Source Code For Support to Cloud TPU:
30- --------------------------------------------------------
31- TPU is not Officially Supported for Tensorflow 2.0, so it is not exposed in the Public API.
32- However in the code, the python files containing the required modules are imported explicitly.
33- There's a small bug in `CrossShardOptimizer` which tries to use OptimizerV1 and all Optimizers
34- available in the Public API are in V2. To support V2 Optimizers, a small Code Fragment is needed
35- to be changed in CrossShardOptimizer's `apply_gradients(...)` function.
36- To do that
37- - Browse (`cd`) to the installation directory of tensorflow.
38-
39- **To find the installation directory:**
40- ```python3
41- >>> import os
42- >>> import tensorflow as tf
43- >>> print(os.path.dirname(str(tf).split(" ")[-1][1:]))
44- ```
45-
46- - `cd` to `python/tpu` inside the installation directory
47- - open `tpu_optimizer.py` in an editor
48- - change line no. 173 (For Tensorflow 2.0 Beta)
49- **From**
50- ```python3
51- return self._opt.apply_gradients(summed_grads_and_vars, global_step, name)
52- ```
53- **To**
54- ```python3
55- return self._opt.apply_gradients(summed_grads_and_vars, name=name)
56- ```
57- - Save Changes
5829
5930Running Tensorboard:
6031----------------------
@@ -74,11 +45,30 @@ To view Tensorboard, Browse to the Public IP of the VM Instance
7445
7546Running the Code:
7647----------------------
48+ #### Train The Model
49+
7750``` bash
7851$ python3 image_retraining_tpu.py --tpu $CTPU_NAME --use_tpu \
79- --model_dir gs://$CTPU_NAME/model_dir \
80- --data_dir gs://$CTPU_NAME/data_dir \
81- --batch_size 16 \
82- --iterations 4 \
52+ --modeldir gs://$CTPU_NAME/modeldir \
53+ --datadir gs://$CTPU_NAME/datadir \
54+ --logdir gs://$CTPU_NAME/logdir \
55+ --num_steps 2000 \
8356--dataset horses_or_humans
8457```
58+ Training saves a single checkpoint at the end of the run. This checkpoint can be loaded
59+ later to export a SavedModel.
60+
61+ #### Export Model
62+
63+ ``` bash
64+ $ python3 image_retraining_tpu.py --tpu $CTPU_NAME --use_tpu \
65+ --modeldir gs://$CTPU_NAME/modeldir \
66+ --datadir gs://$CTPU_NAME/datadir \
67+ --logdir gs://$CTPU_NAME/logdir \
68+ --dataset horses_or_humans \
69+ --export_only \
70+ --export_path modeldir/model
71+ ```
72+ Exporting a SavedModel of the Trained Model
73+ ----------------------------
74+ By default, the trained model is saved at `gs://$CTPU_NAME/modeldir/model`, unless a different path is given with `--export_path`.
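
The training and export commands added in this diff share the same script, TPU name, and bucket layout, and differ only in `--num_steps` versus `--export_only` / `--export_path`. The following is a minimal sketch of how the two invocations could be assembled from one place. It is not part of the repository: the helper names `bucket_path` and `build_command` and the TPU name `my-tpu` are hypothetical, while the script name and flags are copied from the commands above.

```python
import shlex

SCRIPT = "image_retraining_tpu.py"  # script name taken from the README commands


def bucket_path(ctpu_name, subdir):
    """Return gs://<ctpu_name>/<subdir>, the bucket layout used in the README."""
    return "gs://{}/{}".format(ctpu_name, subdir)


def build_command(ctpu_name, dataset="horses_or_humans", num_steps=None,
                  export_path=None):
    """Hypothetical helper: build the argument list for a train or export run."""
    cmd = ["python3", SCRIPT, "--tpu", ctpu_name, "--use_tpu"]
    # modeldir, datadir and logdir all live in the bucket named after the TPU.
    for name in ("modeldir", "datadir", "logdir"):
        cmd += ["--" + name, bucket_path(ctpu_name, name)]
    cmd += ["--dataset", dataset]
    if num_steps is not None:
        cmd += ["--num_steps", str(num_steps)]
    if export_path is not None:
        # Export run: the README passes --export_only together with --export_path.
        cmd += ["--export_only", "--export_path", export_path]
    return cmd


if __name__ == "__main__":
    # "my-tpu" is a placeholder for the value of $CTPU_NAME.
    for cmd in (build_command("my-tpu", num_steps=2000),
                build_command("my-tpu", export_path="modeldir/model")):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with the `gs://` paths.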