articles/machine-learning/how-to-troubleshoot-online-endpoints.md (1 addition, 1 deletion)
@@ -474,7 +474,7 @@ To run the *score.py* file you provide as part of the deployment, Azure creates
 - There's an error in the container environment setup, such as a missing dependency.

-If you get the `TypeError: register() takes 3 positional arguments but 4 were given` error, check the dependency between flask v2 and `azureml-inference-server-http`. For more information, see [Troubleshoot HTTP server issues](how-to-inference-server-http.md#typeerror-during-server-startup).
+If you get the `TypeError: register() takes 3 positional arguments but 4 were given` error, check the dependency between flask v2 and `azureml-inference-server-http`. For more information, see [Troubleshoot HTTP server issues](how-to-inference-server-http.md#typeerror-during-inference-server-startup).
articles/machine-learning/includes/machine-learning-inference-server-troubleshooting.md (35 additions, 32 deletions)
@@ -2,41 +2,44 @@
 author: shohei1029
 ms.service: azure-machine-learning
 ms.topic: include
-ms.date: 09/17/2024
+ms.date: 01/06/2025
 ms.author: shnagata
 ---

 <a name="frequently-asked-questions"></a>
 ### Check installed packages

-Follow these steps to address issues with installed packages.
+Follow these steps to address issues with installed packages:
 1. Gather information about installed packages and versions for your Python environment.

-1. Confirm that the `azureml-inference-server-http` Python package version specified in the environment file matches the Azure Machine Learning inference HTTP server version displayed in the [startup log](../how-to-inference-server-http.md#view-startup-logs).
+1. In your environment file, check the version of the `azureml-inference-server-http` Python package that's specified. In the Azure Machine Learning inference HTTP server [startup logs](../how-to-inference-server-http.md#view-startup-logs), check the version of the inference server that's displayed. Confirm that the two versions match.

    In some cases, the pip dependency resolver installs unexpected package versions. You might need to run `pip` to correct installed packages and versions.
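As a quick illustration of those steps, commands along these lines should surface the installed version so you can compare it against your environment file; a minimal sketch, in which the `==1.4.0` pin is only a placeholder, not a recommendation:

```bash
# List all installed packages and versions in the active environment
pip list

# Inspect the inference server package specifically; the output includes
# its version and the packages that require it
pip show azureml-inference-server-http

# If the installed version doesn't match your environment file, reinstall
# the pinned version (the version shown here is illustrative)
pip install azureml-inference-server-http==1.4.0
```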
-1. If you specify the Flask or its dependencies in your environment, remove these items.
+1. If you specify Flask or its dependencies in your environment, remove these items.

    - Dependent packages include `flask`, `jinja2`, `itsdangerous`, `werkzeug`, `markupsafe`, and `click`.
-   - `flask` is listed as a dependency in the server package. The best approach is to allow the inference server to install the `flask` package.
-   - When the inference server is configured to support new versions of Flask, the server automatically receives the package updates as they become available.
+   - The `flask` package is listed as a dependency in the inference server package. The best approach is to allow the inference server to install the `flask` package.
+   - When the inference server is configured to support new versions of Flask, the inference server automatically receives the package updates as they become available.
-### Check server version
+### Check the inference server version

-The `azureml-inference-server-http` server package is published to PyPI. The [PyPI page](https://pypi.org/project/azureml-inference-server-http/) lists the changelog and all previous versions.
+The `azureml-inference-server-http` server package is published to PyPI. The [PyPI page](https://pypi.org/project/azureml-inference-server-http/) lists the changelog and all versions of the package.

-If you're using an earlier package version, update your configuration to the latest version. The following table summarizes stable versions, common issues, and recommended adjustments:
+If you use an early package version, update your configuration to the latest version. The following table summarizes stable versions, common issues, and recommended adjustments:

 | Package version | Description | Issue | Resolution |
-| --- | --- | --- |
-| **0.4.x** | Bundled in training images dated `20220601` or earlier and `azureml-defaults` package versions `.1.34` through `1.43`. Latest stable version is **0.4.13**. | For server versions earlier than **0.4.11**, you might encounter Flask dependency issues, such as `"can't import name Markup from jinja2"`. | Upgrade to version **0.4.13** or **0.8.x**, the latest version, if possible. |
-| **0.6.x** | Preinstalled in inferencing images dated `20220516` and earlier. Latest stable version is **0.6.1**. | N/A | N/A |
-| **0.7.x** | Supports Flask 2. Latest stable version is **0.7.7**. | N/A | N/A |
-| **0.8.x** | Log format changed. Python 3.6 support ended. | N/A | N/A |
-
-<!-- Reviewer: Confirm if other versions or common issues + resolutions should be listed. The last major update to this topic was about 2 years ago. -->
+| --- | --- | --- | --- |
+| 0.4.x | Bundled in training images dated `20220601` or earlier and `azureml-defaults` package versions 0.1.34 through 1.43. Latest stable version is 0.4.13. | For server versions earlier than 0.4.11, you might encounter Flask dependency issues, such as `can't import name Markup from jinja2`. | Upgrade to version 0.4.13 or 1.4.x, the latest version, if possible. |
+| 0.6.x | Preinstalled in inferencing images dated `20220516` and earlier. Latest stable version is 0.6.1. | N/A | N/A |
+| 0.7.x | Supports Flask 2. Latest stable version is 0.7.7. | N/A | N/A |
+| 0.8.x | Uses an updated log format. Ends support for Python 3.6. | N/A | N/A |
+| 1.0.x | Ends support for Python 3.7. | N/A | N/A |
+| 1.2.x | Adds support for Python 3.11. Updates `gunicorn` to version 22.0.0. Updates `werkzeug` to version 3.0.3 and later versions. | N/A | N/A |
+| 1.3.x | Adds support for Python 3.12. Upgrades `certifi` to version 2024.7.4. Upgrades `flask-cors` to version 5.0.0. Upgrades the `gunicorn` and `pydantic` packages. | N/A | N/A |
+| 1.4.x | Upgrades `waitress` to version 3.0.1. Ends support for Python 3.8. Removes the compatibility layer that prevents the Flask 2.0 upgrade from breaking request object code. | If you depend on the compatibility layer, your request object code might not work. | Migrate your score script to Flask 2. |
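If the table points you at a newer release, the upgrade is usually a single command; a minimal sketch, assuming `pip` manages your scoring environment:

```bash
# Upgrade the inference server package to the latest release on PyPI
pip install --upgrade azureml-inference-server-http

# Confirm the installed version against the changelog on the PyPI page
pip show azureml-inference-server-http | grep -i '^version'
```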

 ### Check package dependencies
@@ -46,14 +49,14 @@ The most relevant dependent packages for the `azureml-inference-server-http` ser
 - `opencensus-ext-azure`
 - `inference-schema`

-If you specified the `azureml-defaults` package in your Python environment, the `azureml-inference-server-http` package is a dependent package. The dependency is installed automatically.
+If you specify the `azureml-defaults` package in your Python environment, the `azureml-inference-server-http` package is a dependent package. The dependency is installed automatically.

 > [!TIP]
-> If you use Python SDK v1 and don't explicitly specify the `azureml-defaults` package in your Python environment, the SDK might automatically add the package. However, the packager version is locked relative to the SDK version. For example, if the SDK version is `1.38.0`, then the `azureml-defaults==1.38.0` entry is added to the environment's pip requirements.
+> If you use the Azure Machine Learning SDK for Python v1 and don't explicitly specify the `azureml-defaults` package in your Python environment, the SDK might automatically add the package. However, the package version is locked relative to the SDK version. For example, if the SDK version is 1.38.0, the `azureml-defaults==1.38.0` entry is added to the environment's pip requirements.
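To see where these dependencies come from in your own environment, one option is the third-party `pipdeptree` tool; this is an assumption for illustration, not part of the Azure Machine Learning SDK:

```bash
# Install the dependency-tree viewer (a third-party package)
pip install pipdeptree

# Show the dependency tree rooted at the inference server package, which
# lists flask, opencensus-ext-azure, inference-schema, and the rest
pipdeptree --packages azureml-inference-server-http
```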

-### TypeError during server startup
+### TypeError during inference server startup

-You might encounter the following `TypeError` during server startup:
+You might encounter the following `TypeError` during inference server startup:

 ```bash
 TypeError: register() takes 3 positional arguments but 4 were given
@@ -65,12 +68,12 @@ TypeError: register() takes 3 positional arguments but 4 were given
 TypeError: register() takes 3 positional arguments but 4 were given
 ```

-This error occurs when you have Flask 2 installed in your Python environment, but your `azureml-inference-server-http` package version doesn't support Flask 2. Support for Flask 2 is available in `azureml-inference-server-http` package version **0.7.0** and later, and `azureml-defaults` package version **1.44** and later.
+This error occurs when you have Flask 2 installed in your Python environment, but your `azureml-inference-server-http` package version doesn't support Flask 2. Support for Flask 2 is available in the `azureml-inference-server-http` 0.7.0 package and later versions, and the `azureml-defaults` 1.44 package and later versions.

 - If you don't use the Flask 2 package in an Azure Machine Learning Docker image, use the latest version of the `azureml-inference-server-http` or `azureml-defaults` package.
-- If you use the Flask 2 package in an Azure Machine Learning Docker image, confirm that the image build version is **July 2022** or later.
+- If you use the Flask 2 package in an Azure Machine Learning Docker image, confirm that the image build version is `July 2022` or later.

-You can find the image version in the container logs. For example:
+You can find the image version in the container logs. For example, see the following log statements:

 ```console
 2022-08-22T17:05:02,147738763+00:00 | gunicorn/run | AzureML Container Runtime Information
@@ -82,24 +85,24 @@ This error occurs when you have Flask 2 installed in your Python environment, bu
-The build date of the image appears after the `Materialization Build` notation. In the preceding example, the image version is `20220708` or July 8, 2022. The image in this example is compatible with Flask 2.
+The build date of the image appears after the `Materialization Build` notation. In the preceding example, the image version is `20220708`, or July 8, 2022. The image in this example is compatible with Flask 2.
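For a managed online endpoint, one way to pull those log lines without opening the studio is the v2 CLI; a sketch in which the endpoint and deployment names are illustrative placeholders, assuming a default resource group and workspace are configured:

```bash
# Fetch recent container logs for a deployment and look for the image build date
az ml online-deployment get-logs \
  --endpoint-name my-endpoint \
  --name blue \
  --lines 200 | grep "Materialization Build"
```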

-If you don't see a similar message in your container log, your image is out-of-date and should be updated. If you use a Compute Unified Device Architecture (CUDA) image, and you can't find a newer image, check if your image is deprecated in [AzureML-Containers](https://github.com/Azure/AzureML-Containers). You can find designated replacements for deprecated images.
+If you don't see a similar message in your container log, your image is out-of-date and should be updated. If you use a Compute Unified Device Architecture (CUDA) image and you can't find a newer image, check the [AzureML-Containers](https://github.com/Azure/AzureML-Containers) repo to see whether your image is deprecated. You can find designated replacements for deprecated images.

-If you use the server with an online endpoint, you can also find the logs in the **Logs** on the **Endpoints** page in Azure Machine Learning studio.
+If you use the inference server with an online endpoint, you can also find the logs in Azure Machine Learning studio. On the page for your endpoint, select the **Logs** tab.

-If you deploy with SDK v1, and don't explicitly specify an image in your deployment configuration, the server applies the `openmpi4.1.0-ubuntu20.04` package with a version that matches your local SDK toolset. However, the version installed might not be the latest available version of the image.
+If you deploy with the SDK v1 and don't explicitly specify an image in your deployment configuration, the inference server applies the `openmpi4.1.0-ubuntu20.04` package with a version that matches your local SDK toolset. However, the installed version might not be the latest available version of the image.

-For SDK version 1.43, the server installs the `openmpi4.1.0-ubuntu20.04:20220616` package version by default, but this package version isn't compatible with SDK 1.43. Make sure you use the latest SDK for your deployment.
+For SDK version 1.43, the inference server installs the `openmpi4.1.0-ubuntu20.04:20220616` package version by default, but this package version isn't compatible with SDK 1.43. Make sure you use the latest SDK for your deployment.

-If you can't update the image, you can temporarily avoid the issue by pinning the `azureml-defaults==1.43` or `azureml-inference-server-http~=0.4.13` entries in your environment file. These entries direct the server to install the older version with `flask 1.0.x`.
+If you can't update the image, you can temporarily avoid the issue by pinning the `azureml-defaults==1.43` or `azureml-inference-server-http~=0.4.13` entries in your environment file. These entries direct the inference server to install the older version with `flask 1.0.x`.
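As a sketch of that temporary workaround, the same pins can also be applied directly with `pip` (pick one; these are the versions named above, not new recommendations):

```bash
# Pin the last azureml-defaults release that pulls in a Flask 1.0.x-compatible server
pip install "azureml-defaults==1.43"

# Or pin the inference server package itself to the 0.4.13 line
pip install "azureml-inference-server-http~=0.4.13"
```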

-### ImportError or ModuleNotFoundError during server startup
+### ImportError or ModuleNotFoundError during inference server startup

-You might encounter an `ImportError` or `ModuleNotFoundError` on specific modules, such as `opencensus`, `jinja2`, `markupsafe`, or `click`, during server startup. The following example shows the error message:
+You might encounter an `ImportError` or `ModuleNotFoundError` on specific modules, such as `opencensus`, `jinja2`, `markupsafe`, or `click`, during inference server startup. The following example shows the error message:

 ```bash
 ImportError: cannot import name 'Markup' from 'jinja2'
 ```

-The import and module errors occur when you use version **0.4.10** or earlier versions of the server that don't pin the Flask dependency to a compatible version. To prevent the issue, install a later version of the server.
+The import and module errors occur when you use version 0.4.10 or earlier versions of the inference server that don't pin the Flask dependency to a compatible version. To prevent the issue, install a later version of the inference server.
articles/machine-learning/reference-yaml-deployment-managed-online.md (1 addition, 1 deletion)
@@ -53,7 +53,7 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | Key | Type | Description | Default value |
 | --- | ---- | ----------- | ------------- |
 | `request_timeout_ms` | integer | The scoring timeout in milliseconds. Note that the maximum value allowed is `180000` milliseconds. See [limits for online endpoints](how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints) for more. | `5000` |
-| `max_concurrent_requests_per_instance` | integer | The maximum number of concurrent requests per instance allowed for the deployment. <br><br> **Note:** If you're using [Azure Machine Learning Inference Server](how-to-inference-server-http.md) or [Azure Machine Learning Inference Images](concept-prebuilt-docker-images-inference.md), your model must be configured to handle concurrent requests. To do so, pass `WORKER_COUNT: <int>` as an environment variable. For more information about `WORKER_COUNT`, see [Azure Machine Learning Inference Server Parameters](how-to-inference-server-http.md#review-server-parameters) <br><br> **Note:** Set to the number of requests that your model can process concurrently on a single node. Setting this value higher than your model's actual concurrency can lead to higher latencies. Setting this value too low might lead to under utilized nodes. Setting too low might also result in requests being rejected with a 429 HTTP status code, as the system will opt to fail fast. For more information, see [Troubleshooting online endpoints: HTTP status codes](how-to-troubleshoot-online-endpoints.md#http-status-codes). | `1` |
+| `max_concurrent_requests_per_instance` | integer | The maximum number of concurrent requests per instance allowed for the deployment. <br><br> **Note:** If you're using [Azure Machine Learning Inference Server](how-to-inference-server-http.md) or [Azure Machine Learning Inference Images](concept-prebuilt-docker-images-inference.md), your model must be configured to handle concurrent requests. To do so, pass `WORKER_COUNT: <int>` as an environment variable. For more information about `WORKER_COUNT`, see [Azure Machine Learning Inference Server Parameters](how-to-inference-server-http.md#review-inference-server-parameters) <br><br> **Note:** Set to the number of requests that your model can process concurrently on a single node. Setting this value higher than your model's actual concurrency can lead to higher latencies. Setting this value too low might lead to under utilized nodes. Setting too low might also result in requests being rejected with a 429 HTTP status code, as the system will opt to fail fast. For more information, see [Troubleshooting online endpoints: HTTP status codes](how-to-troubleshoot-online-endpoints.md#http-status-codes). | `1` |
 | `max_queue_wait_ms` | integer | (Deprecated) The maximum amount of time in milliseconds a request will stay in the queue. (Now increase `request_timeout_ms` to account for any networking/queue delays) | `500` |
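To illustrate how these keys fit together, here's a hedged sketch of a managed online deployment YAML; the names and values are placeholders, and it assumes the keys sit under the deployment's `request_settings` block, as in the schema this table documents:

```yaml
# Illustrative managed online deployment (names and values are placeholders)
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: my-endpoint
model: azureml:my-model:1
instance_type: Standard_DS3_v2
instance_count: 1
environment_variables:
  WORKER_COUNT: "4"   # match the concurrency the model can actually handle
request_settings:
  request_timeout_ms: 90000               # must not exceed 180000
  max_concurrent_requests_per_instance: 4 # aligned with WORKER_COUNT above
```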