Commit d7daed9

SinaChavoshi authored and Tensorflow Cloud maintainers committed
Update documentation and examples.
PiperOrigin-RevId: 360592145
1 parent f4a5651 commit d7daed9

5 files changed: +182 -163 lines changed

examples/README.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+# TensorFlow Cloud Examples
+
+If you have not set up a Google Cloud Project yet start by running
+[Google Cloud Project Setup Instructions](https://github.com/tensorflow/cloud/blob/master/examples/google_cloud_project_setup_instructions.ipynb).
+This guide will help you setup a Google Cloud Project, and configure it for
+running tensorflow-cloud projects. Once you have set up your Google Cloud
+Project move on to one of the samples below.
+
+* **[Google Cloud Project Setup Instructions](https://github.com/tensorflow/cloud/blob/master/examples/google_cloud_project_setup_instructions.ipynb)**
+
+This guide is to help first time users set up a Google Cloud Platform
+account specifically with the intention to use
+[tensorflow_cloud](https://github.com/tensorflow/cloud) to easily run
+training at scale on Google Cloud AI Platform.
+[tensorflow_cloud](https://github.com/tensorflow/cloud) provides APIs that
+allow users to easily go from debugging, training, tuning Keras and
+TensorFlow code in a local or kaggle environment to distributed
+training/tuning on Cloud.
+
+* **[Distributed training NasNet with tensorflow_cloud and Google Cloud](https://github.com/tensorflow/cloud/blob/master/examples/distributed_training_nasnet_with_tensorflow_cloud.ipynb)**
+
+This example is based on
+[Image classification via fine-tuning with EfficientNet](https://keras.io/examples/vision/image_classification_efficientnet_fine_tuning/)
+to demonstrate how to train a
+[NasNetMobile](https://keras.io/api/applications/nasnet/#nasnetmobile-function)
+model using [tensorflow_cloud](https://github.com/tensorflow/cloud) and
+Google Cloud Platform at scale using distributed training.
+
+* **[HP Tuning CIFAR10 on Google Cloud with tensorflow_cloud and CloudTuner](https://github.com/tensorflow/cloud/blob/master/examples/hp_tuning_cifar10_using_google_cloud.ipynb)**
+
+This example is based on
+[Keras-Tuner CIFAR10 sample](https://github.com/keras-team/keras-tuner/blob/master/examples/cifar10.py)
+to demonstrate how to run HP tuning jobs using
+[tensorflow_cloud](https://github.com/tensorflow/cloud) and Google Cloud
+Platform at scale.
+
+* **[Tuning a wide and Deep model using Google Cloud](https://github.com/tensorflow/cloud/blob/master/examples/hp_tuning_wide_and_deep_model.ipynb)**
+
+In this example we will use CloudTuner and Google Cloud to Tune a
+[Wide and Deep Model](https://ai.googleblog.com/2016/06/wide-deep-learning-better-together-with.html)
+based on the tunable model introduced in
+[structured data learning with Wide, Deep, and Cross networks](https://keras.io/examples/structured_data/wide_deep_cross_networks/).
+In this example we will use the data set from
+[CAIIS Dogfood Day](https://www.kaggle.com/c/caiis-dogfood-day-2020/overview)

examples/distributed_training_nasnet_with_tensorflow_cloud.ipynb

Lines changed: 24 additions & 27 deletions
@@ -81,7 +81,7 @@
 "# DO NOT CHANGE: Currently only the 'us-central1' region is supported.\n",
 "REGION = 'us-central1'\n",
 "\n",
-"# OPTIONAL: You can change the project name to any string.\n",
+"# OPTIONAL: You can change the job name to any string.\n",
 "JOB_NAME = 'nasnet' #@param {type:\"string\"}\n",
 "\n",
 "# Setting location were training logs and checkpoints will be stored\n",
@@ -147,7 +147,7 @@
 "(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()\n",
 "\n",
 "# Setting input specific parameters\n",
-"# The model expects input of dimetions of (INPUT_IMG_SIZE, INPUT_IMG_SIZE, 3)\n",
+"# The model expects input of dimension (INPUT_IMG_SIZE, INPUT_IMG_SIZE, 3)\n",
 "INPUT_IMG_SIZE = 32\n",
 "NUM_CLASSES = 10"
 ]
@@ -286,7 +286,7 @@
 "source": [
 "## Start the remote training\n",
 "\n",
-"This step will prepare your code from this notebook for remote execution and starts a distributed training remotely on Google Cloud Platfrom to train the model. Once the job is submitted you can go to the next step to monitor the jobs progress via Tensorboard.\n"
+"This step will prepare your code from this notebook for remote execution and starts a distributed training remotely on Google Cloud Platform to train the model. Once the job is submitted you can go to the next step to monitor the jobs progress via Tensorboard.\n"
 ]
 },
 {
@@ -298,31 +298,28 @@
 },
 "outputs": [],
 "source": [
-"if not tfc.remote():\n",
-"    print('Training on TensorFlow Cloud...')\n",
-"\n",
-"    # If you are using a custom image you can install modules via requirements\n",
-"    # txt file.\n",
-"    with open('requirements.txt','w') as f:\n",
-"        f.write('tensorflow-cloud==0.1.12\\n')\n",
+"# If you are using a custom image you can install modules via requirements\n",
+"# txt file.\n",
+"with open('requirements.txt','w') as f:\n",
+"    f.write('tensorflow-cloud==0.1.12\\n')\n",
 "\n",
-"    # Optional: Some recommended base images. If you provide none the system\n",
-"    # will choose one for you.\n",
-"    TF_GPU_IMAGE= \"tensorflow/tensorflow:latest-gpu\"\n",
-"    TF_CPU_IMAGE= \"tensorflow/tensorflow:latest\"\n",
+"# Optional: Some recommended base images. If you provide none the system\n",
+"# will choose one for you.\n",
+"TF_GPU_IMAGE= \"tensorflow/tensorflow:latest-gpu\"\n",
+"TF_CPU_IMAGE= \"tensorflow/tensorflow:latest\"\n",
 "\n",
-"    tfc.run(\n",
-"        distribution_strategy='auto',\n",
-"        requirements_txt='requirements.txt',\n",
-"        docker_config=tfc.DockerConfig(\n",
-"            parent_image=TF_GPU_IMAGE,\n",
-"            image_build_bucket=GCS_BUCKET\n",
-"        ),\n",
-"        chief_config=tfc.COMMON_MACHINE_CONFIGS['K80_1X'],\n",
-"        worker_config=tfc.COMMON_MACHINE_CONFIGS['K80_1X'],\n",
-"        worker_count=3,\n",
-"        job_labels={'job': JOB_NAME}\n",
-"    )"
+"tfc.run(\n",
+"    distribution_strategy='auto',\n",
+"    requirements_txt='requirements.txt',\n",
+"    docker_config=tfc.DockerConfig(\n",
+"        parent_image=TF_GPU_IMAGE,\n",
+"        image_build_bucket=GCS_BUCKET\n",
+"    ),\n",
+"    chief_config=tfc.COMMON_MACHINE_CONFIGS['K80_1X'],\n",
+"    worker_config=tfc.COMMON_MACHINE_CONFIGS['K80_1X'],\n",
+"    worker_count=3,\n",
+"    job_labels={'job': JOB_NAME}\n",
+")"
 ]
 },
 {
@@ -332,7 +329,7 @@
 },
 "source": [
 "# Training Results\n",
-"While the training is in progress you can use Tensorboard to view the results."
+"While the training is in progress you can use Tensorboard to view the results. Note the results will show only after your training has started. This may take a few minutes."
 ]
 },
 {
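
For readers following the reworked training cell above, here is a minimal sketch of its two plain-Python steps: writing the pinned `requirements.txt` and collecting the keyword arguments later handed to `tfc.run`. The helper names `write_requirements` and `build_run_arguments` are hypothetical and not part of tensorflow_cloud. The arguments that need tensorflow_cloud objects (`docker_config`, `chief_config`, `worker_config`) are deliberately left out so the sketch runs without the package installed.

```python
import os
import tempfile


def write_requirements(path, packages):
    """Write one package specifier per line, as the notebook cell does."""
    with open(path, "w") as f:
        for package in packages:
            f.write(package + "\n")
    return path


def build_run_arguments(requirements_path, job_name, worker_count=3):
    """Collect the plain-Python keyword arguments used in the cell's tfc.run call.

    Hypothetical helper: docker_config, chief_config and worker_config are
    omitted because they require objects from tensorflow_cloud itself.
    """
    return {
        "distribution_strategy": "auto",
        "requirements_txt": requirements_path,
        "worker_count": worker_count,
        "job_labels": {"job": job_name},
    }


# Demonstration in a throwaway directory so no real project files are touched.
demo_dir = tempfile.mkdtemp()
demo_path = write_requirements(
    os.path.join(demo_dir, "requirements.txt"),
    ["tensorflow-cloud==0.1.12"],
)
demo_args = build_run_arguments(demo_path, job_name="nasnet")
print(demo_args)
```

With tensorflow_cloud installed, the resulting dictionary could presumably be unpacked into the call shown in the diff, for example `tfc.run(**demo_args, docker_config=..., chief_config=..., worker_config=...)`, with the remaining arguments supplied as in the notebook.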

examples/google_cloud_project_setup_instructions.ipynb

Lines changed: 32 additions & 71 deletions
@@ -38,7 +38,7 @@
 "\n",
 "To start go to https://cloud.google.com/ and click on “Get Started For Free\". This is a two step sign up process where you will need to provide your name, address and a credit card. The starter account is free and it comes with $300 credit that you can use. For this step you will need to provide a Google Account ( i.e. your Gmail account) to sign in.\n",
 "\n",
-"After completing the sign up process you will be redirected to [Google Cloud Platform welcome page](https://console.cloud.google.com/home/dashboard). click on the \"Home\" tab and make a note of your Project ID."
+"After completing the sign up process you will be redirected to [Google Cloud Platform welcome page](https://console.cloud.google.com/home/dashboard). click on the \"Home\" tab and make a note of your Project ID and Project number. (see [Identifying projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects#identifying_projects))"
 ]
 },
 {
@@ -49,7 +49,8 @@
 },
 "outputs": [],
 "source": [
-"GCP_PROJECT_ID = 'YOUR_PROJECT_ID'"
+"GCP_PROJECT_ID = 'YOUR_PROJECT_ID'\n",
+"PROJECT_NUMBER = 'YOUR_PROJECT_NUMBER'"
 ]
 },
 {
@@ -63,7 +64,7 @@
 "\n",
 "* 2.1. Auth for Kaggle notebooks\n",
 "* 2.2. Auth for Colab notebook\n",
-"* 2.3. Auth for Cloud AI Notebooks - No action needed move to step 3."
+"* 2.3. Auth for Cloud AI Notebooks - Not supported."
 ]
 },
 {
@@ -145,7 +146,9 @@
 "id": "DMqNVFx6-E_F"
 },
 "source": [
-"Use your Billing Account_ID from above and run the following to link your billing account with your project."
+"Use your Billing Account_ID from above and run the following to link your billing account with your project. \n",
+"\n",
+"Note if you use an existing project you may not see an Account_ID, this means you do not have the proper permissions to run the following commands, contact your admin or create a new project."
 ]
 },
 {
@@ -158,6 +161,7 @@
 "outputs": [],
 "source": [
 "BILLING_ACCOUNT_ID = 'YOUR_BILLING_ACCOUNT_ID'\n",
+"\n",
 "!gcloud beta billing projects link $GCP_PROJECT_ID --billing-account $BILLING_ACCOUNT_ID"
 ]
 },
@@ -204,6 +208,7 @@
 "outputs": [],
 "source": [
 "BUCKET_NAME = 'YOUR_BUCKET_NAME'\n",
+"\n",
 "GCS_BUCKET = f'gs://{BUCKET_NAME}'\n",
 "!gsutil mb -p $GCP_PROJECT_ID $GCS_BUCKET"
 ]
@@ -216,7 +221,7 @@
 "source": [
 "## Create a service account for HP Tuning jobs\n",
 "This step is required to use HP Tuning on Google Cloud using CloudTuner.\n",
-"To [create a service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts#iam-service-accounts-create-gcloud) run the following command and make a note of your service account name."
+"To [create a service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts#iam-service-accounts-create-gcloud) and give it project editor access run the following command and make a note of your service account name."
 ]
 },
 {
@@ -229,9 +234,12 @@
 "outputs": [],
 "source": [
 "SERVICE_ACCOUNT_NAME ='YOUR_SERVICE_ACCOUNT_NAME'\n",
-"SERVICE_ACCOUNT_EMAIL = f'{SERVICE_ACCOUNT_NAME}@{GCP_PROJECT_ID}.iam.gserviceaccount.com'\n",
 "\n",
-"!gcloud iam --project $GCP_PROJECT_ID service-accounts create $SERVICE_ACCOUNT_NAME"
+"SERVICE_ACCOUNT_EMAIL = f'{SERVICE_ACCOUNT_NAME}@{GCP_PROJECT_ID}.iam.gserviceaccount.com'\n",
+"!gcloud iam --project $GCP_PROJECT_ID service-accounts create $SERVICE_ACCOUNT_NAME\n",
+"!gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \\\n",
+" --member serviceAccount:$SERVICE_ACCOUNT_EMAIL \\\n",
+" --role=roles/editor"
 ]
 },
 {
@@ -240,27 +248,7 @@
 "id": "a-fNtK6rvGmg"
 },
 "source": [
-"The [`default AI Platform service account`](https://cloud.google.com/ai-platform/training/docs/custom-service-account#default) is identified by an email address with the format `service-PROJECT_NUMBER@cloud-ml.google.com.iam.gserviceaccount.com`. Run the following command to get your PROJECT_NUMBER."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {
-"id": "4MZGiPZnysMo"
-},
-"outputs": [],
-"source": [
-"!gcloud projects describe $GCP_PROJECT_ID |grep projectNumber"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {
-"id": "hfS6Erynz9Tx"
-},
-"source": [
-"Use the project number above to construct the service account email."
+"The [`default AI Platform service account`](https://cloud.google.com/ai-platform/training/docs/custom-service-account#default) is identified by an email address with the format `service-PROJECT_NUMBER@cloud-ml.google.com.iam.gserviceaccount.com`. Using your Project number from step one, we construct the service account email and grant the [`default AI Platform service account`](https://cloud.google.com/ai-platform/training/docs/custom-service-account#default) admin role (roles/iam.serviceAccountAdmin) on your new service account."
 ]
 },
 {
@@ -271,68 +259,37 @@
 },
 "outputs": [],
 "source": [
-"PROJECT_NUMBER = 'YOUR_PROJECT_NUMBER'\n",
-"DEFAULT_AI_PLATFORM_SERVICE_ACCOUNT = f'service-{PROJECT_NUMBER}@cloud-ml.google.com.iam.gserviceaccount.com'"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {
-"id": "ySCk0NIF3lux"
-},
-"source": [
-"Grant the [`default AI Platform service account`](https://cloud.google.com/ai-platform/training/docs/custom-service-account#default) admin role (roles/iam.serviceAccountAdmin) on your new service account."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {
-"id": "l9HL0bYxuzWL"
-},
-"outputs": [],
-"source": [
+"DEFAULT_AI_PLATFORM_SERVICE_ACCOUNT = f'service-{PROJECT_NUMBER}@cloud-ml.google.com.iam.gserviceaccount.com'\n",
+"\n",
 "!gcloud iam --project $GCP_PROJECT_ID service-accounts add-iam-policy-binding \\\n",
 "--role=roles/iam.serviceAccountAdmin \\\n",
 "--member=serviceAccount:$DEFAULT_AI_PLATFORM_SERVICE_ACCOUNT \\\n",
 "$SERVICE_ACCOUNT_EMAIL"
 ]
 },
 {
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
 "metadata": {
-"id": "g9fJh8RX-E_H",
-"trusted": true
+"id": "fqbjpPt_-E_H"
 },
-"outputs": [],
 "source": [
-"Finally run the following to allow the service account to impersonate your your users account."
+"## Congratulations !\n",
+"You are now ready to run tensorflow-cloud. Note that these steps only need to be run one time. Once you have your project setup you can reuse the same project and bucket configuration for future runs. For any new notebooks you will need to repeat the step two to add your Google Cloud auth credentials.\n",
+"\n",
+"Make a note of the following values as they are needed to run tensorflow-cloud."
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "UeTwSx75-E_H",
-"trusted": true
+"id": "yE1gpL8oUdV8"
 },
 "outputs": [],
 "source": [
-"!gcloud iam service-accounts --project $GCP_PROJECT_ID add-iam-policy-binding \\\n",
-" $SERVICE_ACCOUNT_EMAIL \\\n",
-" --member=\"user:[email protected]\" \\\n",
-" --role=\"roles/iam.serviceAccountUser\""
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {
-"id": "fqbjpPt_-E_H"
-},
-"source": [
-"## Congratulations !\n",
-"You are now ready to run tensorflow-cloud. Note that these steps only need to be run one time. Once you have your project setup you can reuse the same project and bucket configuration for future runs. For any new notebooks you will need to repeat the step two to add your Google Cloud auth credentials. "
+"print(f\"Your GCP_PROJECT_ID is: {GCP_PROJECT_ID}\")\n",
+"print(f\"Your SERVICE_ACCOUNT_NAME is: {SERVICE_ACCOUNT_NAME}\")\n",
+"print(f\"Your BUCKET_NAME is: {BUCKET_NAME}\")"
 ]
 }
 ],
@@ -345,6 +302,10 @@
 },
 "name": "google-cloud-project-setup-instructions.ipynb",
 "provenance": [
+{
+"file_id": "/piper/depot/google3/third_party/tensorflow_cloud/examples/google_cloud_project_setup_instructions.ipynb?workspaceId=chavoshi:tfc_examples::citc",
+"timestamp": 1614659451764
+},
 {
 "file_id": "/piper/depot/google3/third_party/tensorflow_cloud/examples/google_cloud_project_setup_instructions.ipynb?workspaceId=chavoshi:tensorflow_cloud::citc",
 "timestamp": 1613079100054

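The setup notebook changes above build two service account emails from the project ID and project number, then grant one account a role on the other. As a rough sketch, the string formats quoted in the diff can be expressed as small Python helpers. The function names and the sample values (`my-project`, `123456789012`, `tuner-sa`) are illustrative assumptions, and the command is only assembled as an argument list here, never executed.

```python
def service_account_email(name, project_id):
    """Email of a user-created service account, in the format used by the notebook."""
    return f"{name}@{project_id}.iam.gserviceaccount.com"


def default_ai_platform_service_account(project_number):
    """Email of the default AI Platform service account for a project number."""
    return f"service-{project_number}@cloud-ml.google.com.iam.gserviceaccount.com"


def grant_admin_command(project_id, project_number, name):
    """Assemble (without running) the add-iam-policy-binding command from the notebook."""
    member = default_ai_platform_service_account(project_number)
    return [
        "gcloud", "iam", "--project", project_id,
        "service-accounts", "add-iam-policy-binding",
        "--role=roles/iam.serviceAccountAdmin",
        f"--member=serviceAccount:{member}",
        service_account_email(name, project_id),
    ]


# Placeholder values for illustration only; substitute your own project details.
command = grant_admin_command("my-project", "123456789012", "tuner-sa")
print(" ".join(command))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` outside a notebook, where the `!gcloud` shell syntax is not available.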