|
132 | 132 | "# @markdown 3. For serving, **[click here](https://console.cloud.google.com/iam-admin/quotas?location=us-central1&metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_l4_gpus)** to check if your project already has the required 1 L4 GPU in the us-central1 region. If yes, then run this notebook in the us-central1 region. If you need more L4 GPUs for your project, then you can follow [these instructions](https://cloud.google.com/docs/quotas/view-manage#viewing_your_quota_console) to request more. Alternatively, if you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus).\n", |
133 | 133 | "\n", |
134 | 134 | "# @markdown > | Machine Type | Accelerator Type | Recommended Regions |\n", |
135 | | - "# @markdown | ----------- | ----------- | ----------- | \n", |
| 135 | + "# @markdown | ----------- | ----------- | ----------- |\n", |
136 | 136 | "# @markdown | a2-ultragpu-1g | 1 NVIDIA_A100_80GB | us-central1, us-east4, europe-west4, asia-southeast1, us-east4 |\n", |
137 | | - "# @markdown | a3-highgpu-2g | 2 NVIDIA_H100_80GB | us-west1, asia-southeast1 |\n", |
138 | | - "# @markdown | a3-highgpu-4g | 4 NVIDIA_H100_80GB | us-west1, asia-southeast1 |\n", |
139 | 137 | "# @markdown | a3-highgpu-8g | 8 NVIDIA_H100_80GB | us-central1, us-west1, europe-west4, asia-southeast1 |\n", |
140 | 138 | "\n", |
141 | 139 | "# @markdown 4. **[Optional]** [Create a Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) for storing experiment outputs. Set the BUCKET_URI for the experiment environment. The specified Cloud Storage bucket (`BUCKET_URI`) should be located in the same region as where the notebook was launched. Note that a multi-region bucket (eg. \"us\") is not considered a match for a single region covered by the multi-region range (eg. \"us-central1\"). If not set, a unique GCS bucket will be created instead.\n", |
|
180 | 178 | "# Cloud Storage bucket for storing the experiment artifacts.\n", |
181 | 179 | "# A unique GCS bucket will be created for the purpose of this notebook. If you\n", |
182 | 180 | "# prefer using your own GCS bucket, change the value yourself below.\n", |
183 | | - "now = datetime.now().strftime(\"%Y%m%d%H%M%S\")\n", |
| 181 | + "now = datetime.datetime.now().strftime(\"%Y%m%d%H%M%S\")\n", |
184 | 182 | "BUCKET_NAME = \"/\".join(BUCKET_URI.split(\"/\")[:3])\n", |
185 | 183 | "\n", |
186 | 184 | "if BUCKET_URI is None or BUCKET_URI.strip() == \"\" or BUCKET_URI == \"gs://\":\n", |
|
582 | 580 | "outputs": [], |
583 | 581 | "source": [ |
584 | 582 | "# @title Run TensorBoard\n", |
585 | | - "# @markdown This section shows how to launch TensorBoard in a [Cloud Shell](https://cloud.google.com/shell/docs).\n", |
586 | | - "# @markdown 1. Click the Cloud Shell icon() on the top right to open the Cloud Shell.\n", |
587 | | - "# @markdown 2. Copy the `tensorboard` command shown below by running this cell.\n", |
588 | | - "# @markdown 3. Paste and run the command in the Cloud Shell to launch TensorBoard.\n", |
589 | | - "# @markdown 4. Once the command runs (You may have to click `Authorize` if prompted), click the link starting with `http://localhost`.\n", |
590 | | - "\n", |
| 583 | + "# @markdown This section launches TensorBoard and displays it. You can re-run the cell to display an updated information about the training job.\n", |
| 584 | + "# @markdown See the link to the training job in the above cell to see the status of the Custom Training Job.\n", |
591 | 585 | "# @markdown Note: You may need to wait around 10 minutes after the job starts in order for the TensorBoard logs to be written to the GCS bucket.\n", |
592 | | - "print(f\"Command to copy: tensorboard --logdir {base_output_dir}/logs\")\n" |
| 586 | + "\n", |
| 587 | + "now = datetime.datetime.now(tz=datetime.timezone.utc)\n", |
| 588 | + "\n", |
| 589 | + "if train_job.end_time is not None:\n", |
| 590 | + " min_since_end = int((now - train_job.end_time).total_seconds() // 60)\n", |
| 591 | + " print(f\"Training Job finished {min_since_end} minutes ago.\")\n", |
| 592 | + "\n", |
| 593 | + "if train_job.has_failed:\n", |
| 594 | + " print(\n", |
| 595 | + " \"The job has failed. See the link to the training job in the above cell to see the logs.\"\n", |
| 596 | + " )\n", |
| 597 | + "\n", |
| 598 | + "%tensorboard --logdir {base_output_dir}/logs" |
593 | 599 | ] |
594 | 600 | }, |
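The new cell above relies on the `%tensorboard` line magic, which is only available after TensorBoard's notebook extension has been loaded in the kernel. If an earlier cell has not already done so, a minimal sketch of the required setup:

```python
# Load TensorBoard's notebook extension once per kernel session;
# the %tensorboard magic is unavailable until this has run.
%load_ext tensorboard

# Then the magic from the cell above works as expected.
%tensorboard --logdir {base_output_dir}/logs
```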
595 | 601 | { |
|
819 | 825 | "# endpoint = aiplatform.Endpoint(aip_endpoint_name)\n", |
820 | 826 | "\n", |
821 | 827 | "prompt = \"What is a car?\" # @param {type: \"string\"}\n", |
822 | | - "# @markdown If you encounter the issue like `ServiceUnavailable: 503 Took too long to respond when processing`, you can reduce the maximum number of output tokens, by lowering `max_tokens`.\n", |
| 828 | + "# @markdown If you encounter an issue like `ServiceUnavailable: 503 Took too long to respond when processing`, you can reduce the maximum number of output tokens, by lowering `max_tokens`.\n", |
823 | 829 | "max_tokens = 50 # @param {type:\"integer\"}\n", |
824 | 830 | "temperature = 1.0 # @param {type:\"number\"}\n", |
825 | 831 | "top_p = 1.0 # @param {type:\"number\"}\n", |
826 | 832 | "top_k = 1 # @param {type:\"integer\"}\n", |
| 833 | + "# @markdown Set `raw_response` to `True` to obtain the raw model output. Set `raw_response` to `False` to apply additional formatting in the structure of `\"Prompt:\\n{prompt.strip()}\\nOutput:\\n{output}\"`.\n", |
827 | 834 | "raw_response = False # @param {type:\"boolean\"}\n", |
828 | 835 | "\n", |
829 | 836 | "# Overrides parameters for inferences.\n", |
|