Add JumpStart time series forecasting notebook example (#4750)

shchur · web-flow · commit 43d1e0ad8cc8 · 2024-09-13T09:01:40.000-07:00
* Add JumpStart time series forecasting notebook example

* Fix line length

* Add pointer to the notebook

* Update index.rst
diff --git a/    generative_ai/README.md b/    generative_ai/README.md
@@ -26,3 +26,4 @@ These examples showcases Amazon SageMaker's capabilities in the exciting field o
 - [Retrieval-Augmented Generation: Question Answering using LangChain and Cohere's Generate and Embedding Models from SageMaker JumpStart](sm-jumpstart_rag_question_answering_with_cohere_and_langchain.ipynb)
 - [Introduction to JumpStart - Text to Image](sm-jumpstart_stable_diffusion_text_to_image.ipynb)
 - [Introduction to JumpStart - Text Embedding](sm-jumpstart_text_embedding.ipynb)
+- [Introduction to JumpStart - Time Series Forecasting](sm-jumpstart_time_series_forecasting.ipynb)
diff --git a/    generative_ai/sm-jumpstart_time_series_forecasting.ipynb b/    generative_ai/sm-jumpstart_time_series_forecasting.ipynb
@@ -0,0 +1,372 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "e960490f",
+   "metadata": {},
+   "source": [
+    "# Introduction to SageMaker JumpStart - Time Series Forecasting with Chronos"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c1027fd6",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n",
+    "\n",
+    "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6ecc10fd",
+   "metadata": {},
+   "source": [
+    "In this demo notebook, we demonstrate how to use the SageMaker Python SDK to deploy a SageMaker JumpStart time series forecasting model and invoke the endpoint."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2b4e4586",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "First, upgrade to the latest sagemaker SDK to ensure all available models are deployable."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "9f1c969a",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Note: you may need to restart the kernel to use updated packages.\n"
+     ]
+    }
+   ],
+   "source": [
+    "%pip install sagemaker --upgrade --quiet"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b958b25b",
+   "metadata": {},
+   "source": [
+    "Select the desired model to deploy. The provided dropdown filters all time series forecasting models available in SageMaker JumpStart."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "050856e9",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml\n",
+      "sagemaker.config INFO - Not applying SDK defaults from location: /home/shchuro/.config/sagemaker/config.yaml\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "728a6ed928854101b1fc0616bc6ea4b3",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Dropdown(description='Select a JumpStart time series forecasting model:', index=2, layout=Layout(width='max-co…"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "from ipywidgets import Dropdown\n",
+    "from sagemaker.jumpstart.notebook_utils import list_jumpstart_models\n",
+    "\n",
+    "dropdown = Dropdown(\n",
+    "    options=list_jumpstart_models(filter=\"task == forecasting\"),\n",
+    "    value=\"autogluon-forecasting-chronos-t5-small\",\n",
+    "    description=\"Select a JumpStart time series forecasting model:\",\n",
+    "    style={\"description_width\": \"initial\"},\n",
+    "    layout={\"width\": \"max-content\"},\n",
+    ")\n",
+    "display(dropdown)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "1b257502",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model_id = dropdown.value\n",
+    "model_version = \"*\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6f2fc5d2",
+   "metadata": {},
+   "source": [
+    "## Deploy model\n",
+    "\n",
+    "Create a `JumpStartModel` object, which initializes default model configurations conditioned on the selected instance type. JumpStart already sets a default instance type, but you can deploy the model on other instance types by passing `instance_type` to the `JumpStartModel` class."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "a4c5bb0e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sagemaker.jumpstart.model import JumpStartModel\n",
+    "\n",
+    "model = JumpStartModel(model_id=model_id, model_version=model_version)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "67eeeab7",
+   "metadata": {},
+   "source": [
+    "You can now deploy the model using SageMaker JumpStart. The deployment might take a few minutes. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "3c7726a9",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "----------!"
+     ]
+    }
+   ],
+   "source": [
+    "predictor = model.deploy()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a4849068",
+   "metadata": {},
+   "source": [
+    "## Invoke the endpoint\n",
+    "\n",
+    "This section demonstrates how to invoke the endpoint using example payloads that are retrieved programmatically from the `JumpStartModel` object. You can replace these example payloads with your own payloads."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "27637734",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pprint import pformat\n",
+    "\n",
+    "\n",
+    "def nested_round(data, decimals=3):\n",
+    "    \"\"\"Round numbers, including nested dicts and list.\"\"\"\n",
+    "    if isinstance(data, float):\n",
+    "        return round(data, decimals)\n",
+    "    elif isinstance(data, list):\n",
+    "        return [nested_round(item, decimals) for item in data]\n",
+    "    elif isinstance(data, dict):\n",
+    "        return {key: nested_round(value, decimals) for key, value in data.items()}\n",
+    "    else:\n",
+    "        return data\n",
+    "\n",
+    "\n",
+    "def pretty_format(data):\n",
+    "    return pformat(nested_round(data), width=150, sort_dicts=False)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "775c2325",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Input:\n",
+      " {'inputs': [{'target': [0.0, 4.0, 5.0, 1.5, -3.0, -5.0, -3.0, 1.5, 5.0, 4.0, 0.0, -4.0, -5.0, -1.5, 3.0, 5.0, 3.0, -1.5, -5.0, -4.0]}],\n",
+      " 'parameters': {'prediction_length': 10}}\n",
+      "\n",
+      "Output:\n",
+      " {'predictions': [{'mean': [-0.488, 3.101, 3.086, 0.436, -2.867, -3.924, -2.258, 0.686, 2.456, 2.089],\n",
+      "                  '0.1': [-4.331, 1.351, -0.04, -3.467, -5.713, -5.051, -5.056, -3.247, -1.907, -1.898],\n",
+      "                  '0.5': [0.0, 3.507, 4.012, 0.0, -3.003, -4.903, -3.015, 1.501, 3.003, 3.003],\n",
+      "                  '0.9': [1.652, 4.997, 4.997, 4.11, 0.0, -2.215, 1.602, 4.997, 4.997, 4.997]}]}\n",
+      "\n",
+      "===============\n",
+      "\n",
+      "Input:\n",
+      " {'inputs': [{'target': [1.0, 2.0, 3.0, 2.0, 0.5, 2.0, 3.0, 2.0, 1.0], 'item_id': 'product_A', 'start': '2024-01-01T01:00:00'},\n",
+      "            {'target': [5.4, 3.0, 3.0, 2.0, 1.5, 2.0, -1.0], 'item_id': 'product_B', 'start': '2024-02-02T03:00:00'}],\n",
+      " 'parameters': {'prediction_length': 5, 'freq': '1h', 'quantile_levels': [0.05, 0.5, 0.95], 'num_samples': 30, 'batch_size': 2}}\n",
+      "\n",
+      "Output:\n",
+      " {'predictions': [{'mean': [1.731, 1.498, 1.764, 1.632, 1.465],\n",
+      "                  '0.05': [0.224, 0.497, 0.224, 0.0, 0.0],\n",
+      "                  '0.5': [0.995, 0.995, 1.25, 1.499, 0.995],\n",
+      "                  '0.95': [4.005, 2.997, 3.278, 3.999, 3.544],\n",
+      "                  'item_id': 'product_A',\n",
+      "                  'start': '2024-01-01T10:00:00'},\n",
+      "                 {'mean': [0.084, 0.916, 0.384, 1.205, 1.481],\n",
+      "                  '0.05': [-1.273, -0.726, -1.537, -0.358, -0.863],\n",
+      "                  '0.5': [0.0, 0.872, 0.0, 1.012, 1.059],\n",
+      "                  '0.95': [2.006, 3.109, 2.552, 3.0, 4.887],\n",
+      "                  'item_id': 'product_B',\n",
+      "                  'start': '2024-02-02T10:00:00'}]}\n",
+      "\n",
+      "===============\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "for payload in model.retrieve_all_examples():\n",
+    "    response = predictor.predict(payload.body)\n",
+    "    print(\"Input:\\n\", pretty_format(payload.body), end=\"\\n\\n\")\n",
+    "    print(\"Output:\\n\", pretty_format(response))\n",
+    "    print(\"\\n===============\\n\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "335471d8",
+   "metadata": {},
+   "source": [
+    "The payload for Chronos models must be structured as follows.\n",
+    "* **inputs** (required): List with at most 64 time series that need to be forecasted. Each time series is represented by a dictionary with the following keys:\n",
+    "    * **target** (required): List of observed numeric time series values. \n",
+    "        - It is recommended that each time series contains at least 30 observations.\n",
+    "        - If any time series contains fewer than 5 observations, an error will be raised.\n",
+    "    * **item_id**: String that uniquely identifies each time series. \n",
+    "        - If provided, the ID must be unique for each time series.\n",
+    "        - If provided, then the endpoint response will also include the **item_id** field for each forecast.\n",
+    "    * **start**: Timestamp of the first time series observation in ISO format (`YYYY-MM-DD` or `YYYY-MM-DDThh:mm:ss`). \n",
+    "        - If **start** field is provided, then **freq** must also be provided as part of **parameters**.\n",
+    "        - If provided, then the endpoint response will also include the **start** field indicating the first timestamp of each forecast.\n",
+    "* **parameters**: Optional parameters to configure the model.\n",
+    "    * **prediction_length**: Integer corresponding to the number of future time series values that need to be predicted. \n",
+    "        - Recommended to keep prediction_length <= 64 since larger values will result in inaccurate quantile forecasts. Values above 1000 will raise an error.\n",
+    "    * **quantile_levels**: List of floats in range (0, 1) specifying which quantiles should should be included in the probabilistic forecast. Defaults to `[0.1, 0.5, 0.9]`. \n",
+    "    * **freq**: Frequency of the time series observations in [pandas-compatible format](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases). For example, `1h` for hourly data or `2W` for bi-weekly data. \n",
+    "        - If **freq** is provided, then **start** must also be provided for each time series in **inputs**.\n",
+    "    * **num_samples**: Number of sample trajectories generated by the Chronos model during inference. Larger values may improve accuracy but increase memory consumption and slow down inference. Defaults to `20`.\n",
+    "    * **batch_size**: Number of time series processed in parallel by the model. Larger values speed up inference but may lead to out of memory errors.\n",
+    "\n",
+    "All keys not marked with (required) are optional.\n",
+    "\n",
+    "The endpoint response contains the probabilistic (quantile) forecast for each time series included in the request."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aa5c2c0a",
+   "metadata": {},
+   "source": [
+    "## Clean up the endpoint\n",
+    "Don't forget to clean up resources when finished to avoid unnecessary charges."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "b1a4059e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "predictor.delete_predictor()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1b1d70fc",
+   "metadata": {},
+   "source": [
+    "## Notebook CI Test Results\n",
+    "\n",
+    "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n",
+    "\n",
+    "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n",
+    "\n",
+    "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/generative_ai|sm-jumpstart_time_series_forecasting.ipynb)\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "instance_type": "ml.t3.medium",
+  "kernelspec": {
+   "display_name": "ag",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/index.rst b/index.rst
@@ -185,6 +185,7 @@ We recommend the following notebooks as a broad introduction to the capabilities
    generative_ai/sm-jumpstart_rag_question_answering_with_cohere_and_langchain
    generative_ai/sm-jumpstart_stable_diffusion_text_to_image
    generative_ai/sm-jumpstart_text_embedding
+   generative_ai/sm-jumpstart_time_series_forecasting
 
 .. toctree::
    :maxdepth: 1