Commit 70bd378

add vlm notebook to default branch (#4755)
1 parent 4f75b60 commit 70bd378

File tree

1 file changed: +281 additions, -0 deletions
@@ -0,0 +1,281 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "caac7a72",
   "metadata": {},
   "source": [
    "# SageMaker JumpStart - deploy Llama 3.2 vision language model\n",
    "\n",
    "This notebook demonstrates how to use the SageMaker Python SDK to deploy a SageMaker JumpStart vision language model and invoke the endpoint."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1027fd6",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.\n",
    "\n",
    "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c9d34879-bf56-4aeb-8a67-fa85c7b7a092",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.jumpstart.model import JumpStartModel"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1f55ea3-025d-4f77-83de-69ea3fe46cd5",
   "metadata": {},
   "source": [
    "Select your desired model ID. You can search for available models in the [Built-in Algorithms with pre-trained Model Table](https://sagemaker.readthedocs.io/en/stable/doc_utils/pretrainedmodels.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d458cf0-02e2-4066-927b-25fa5ef2a07e",
   "metadata": {},
   "source": [
    "***\n",
    "You can continue with the default model or choose a different model; this notebook will run with the following model IDs:\n",
    "- `meta-vlm-llama-3-2-11b-vision`\n",
    "- `meta-vlm-llama-3-2-11b-vision-instruct`\n",
    "- `meta-vlm-llama-3-2-90b-vision`\n",
    "- `meta-vlm-llama-3-2-90b-vision-instruct`\n",
    "- `meta-vlm-llama-guard-3-11b-vision`\n",
    "***"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3763cc4c",
   "metadata": {
    "jumpStartAlterations": [
     "modelIdOnly"
    ],
    "tags": []
   },
   "outputs": [],
   "source": [
    "model_id = \"meta-vlm-llama-3-2-11b-vision\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1af4672-93b5-4746-963f-c40cdd0ccb4d",
   "metadata": {},
   "source": [
    "If your selected model is gated, you will need to set `accept_eula` to `True` to accept the model end-user license agreement (EULA)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "439bc3a3-15bc-4551-8c5d-8b592d298678",
   "metadata": {},
   "outputs": [],
   "source": [
    "accept_eula = False"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2bc40011-51f3-4787-ae1b-6b50a594cdf4",
   "metadata": {},
   "source": [
    "## Deploy model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "76d26926-152e-456c-8bc0-5d671fd61dac",
   "metadata": {},
   "source": [
    "Using the model ID, define your model as a JumpStart model. You can deploy the model on other instance types by passing `instance_type` to `JumpStartModel`. See [Deploy publicly available foundation models with the JumpStartModel class](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-use-python-sdk.html#jumpstart-foundation-models-use-python-sdk-model-class) for more configuration options."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85a2a8e5-789f-4041-9927-221257126653",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "model = JumpStartModel(model_id=model_id)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d314544f-e62e-4dfb-981c-659ee991791c",
   "metadata": {},
   "source": [
    "You can now deploy your JumpStart model. The deployment might take a few minutes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56c7462a",
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor = model.deploy(accept_eula=accept_eula)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a97bd778-5c62-4757-80ce-38c29275fa2a",
   "metadata": {},
   "source": [
    "## Invoke endpoint"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ba7c729",
   "metadata": {},
   "source": [
    "Programmatically retrieve example payloads from the `JumpStartModel` object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7077afc0",
   "metadata": {},
   "outputs": [],
   "source": [
    "example_payloads = model.retrieve_all_examples()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ee929ec-7707-4c5c-8530-a3ad20f2b2c2",
   "metadata": {},
   "source": [
    "Now you can invoke the endpoint for each retrieved example payload."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bf5899c8",
   "metadata": {},
   "outputs": [],
   "source": [
    "for payload in example_payloads:\n",
    "    response = predictor.predict(payload)\n",
    "    response = response[0] if isinstance(response, list) else response\n",
    "    print(\"Input:\\n\", payload.body, end=\"\\n\\n\")\n",
    "    print(\"Output:\\n\", response[\"choices\"][0][\"message\"][\"content\"], end=\"\\n\\n\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c63a4d6",
   "metadata": {},
   "source": [
    "This model supports the following payload parameters. You may specify any subset of these parameters when invoking the endpoint.\n",
    "\n",
    "* **max_tokens:** Maximum number of generated tokens. If specified, it must be a positive integer.\n",
    "* **temperature:** Controls the randomness of the output. A higher temperature allows more low-probability words in the output sequence, while a lower temperature favors high-probability words. As `temperature` approaches 0, generation approaches greedy decoding. If specified, it must be a positive float.\n",
    "* **top_p:** In each step of text generation, sample from the smallest possible set of words whose cumulative probability is at least `top_p`. If specified, it must be a float between 0 and 1.\n",
    "* **logprobs:** Log probabilities represent how likely each token is relative to the others, based on the model's prediction. They are useful if you want to understand the model's confidence in its token choices or analyze the generation process at a finer level. If `logprobs` is `True`, the model returns the log probability of each token it generates.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adb5db3d",
   "metadata": {},
   "source": [
    "## Clean up"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b2d027be",
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor.delete_predictor()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b1d70fc",
   "metadata": {},
   "source": [
    "## Notebook CI Test Results\n",
    "\n",
    "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2, which is shown at the top of the notebook.\n",
    "\n",
    "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n",
    "\n",
    "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/generative_ai|sm-jumpstart-llama_3_vision_language_model_deployment.ipynb)\n"
   ]
  }
 ],
 "metadata": {
  "instance_type": "ml.t3.medium",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
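The notebook's invocation loop reads `response["choices"][0]["message"]["content"]`, which implies an OpenAI-style chat payload, and its parameter cell documents `max_tokens`, `temperature`, `top_p`, and `logprobs`. The sketch below shows, offline, how such a payload could be assembled and how the list-or-dict response shape is unwrapped. It is a minimal illustration under those assumptions: `build_payload` and `extract_text` are hypothetical helper names (not SDK functions), the exact message schema accepted by a given Llama 3.2 vision endpoint may differ, and the example URL is a placeholder.

```python
# Hedged sketch: build a chat-style payload using the parameters documented in
# the notebook, and unwrap the response shape its invocation loop handles.
# build_payload/extract_text are illustrative helpers, not part of the SDK.


def build_payload(prompt, image_url=None, max_tokens=256,
                  temperature=0.6, top_p=0.9, logprobs=False):
    """Assemble an OpenAI-style chat payload (assumed schema)."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        # Vision models also accept an image reference alongside the text.
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,    # positive integer
        "temperature": temperature,  # positive float; near 0 ~ greedy decoding
        "top_p": top_p,              # float between 0 and 1
        "logprobs": logprobs,        # True to return per-token log probabilities
    }


def extract_text(response):
    """Unwrap the list-or-dict response shape seen in the notebook's loop."""
    response = response[0] if isinstance(response, list) else response
    return response["choices"][0]["message"]["content"]


# Offline demonstration with a stubbed response (no endpoint call).
payload = build_payload("Describe this image.",
                        image_url="https://example.com/cat.png")
stub_response = [{"choices": [{"message": {"role": "assistant",
                                           "content": "A cat."}}]}]
print(extract_text(stub_response))  # -> A cat.
```

In the notebook itself, `payload` would be passed to `predictor.predict(...)` in place of the retrieved example payloads; the helper mirrors the same unwrapping the loop performs inline.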
