articles/machine-learning/how-to-troubleshoot-online-endpoints.md (4 additions, 4 deletions)
@@ -26,7 +26,7 @@ The document structure reflects the way you should approach troubleshooting:
1. Use [container logs](#get-container-logs) to help debug issues.
1. Understand [common deployment errors](#common-deployment-errors) that might arise and how to fix them.
- The [HTTP status codes](#http-status-codes) sections explains how invocation and prediction errors map to HTTP status codes when you score endpoints with REST requests.
+ The [HTTP status codes](#http-status-codes) sections explain how invocation and prediction errors map to HTTP status codes when you score endpoints with REST requests.

## Prerequisites
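The hunk above lists container logs as the first debugging step. As a rough sketch of that step with Azure CLI v2 and the `ml` extension (the endpoint and deployment names below are placeholders, not values from this PR):

```azurecli
# Tail the scoring container logs for a deployment (placeholder names).
az ml online-deployment get-logs --endpoint-name my-endpoint --name blue --lines 100

# If the failure happens before the scoring container starts, the
# storage initializer container's logs can be inspected instead.
az ml online-deployment get-logs --endpoint-name my-endpoint --name blue --container storage-initializer --lines 100
```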
@@ -52,7 +52,7 @@ There are two supported tracing headers:
## Deploy locally

- Local deployment means deploying a model to a local Docker environment. Local deployment supports creation, update, and deletion of a local endpoint, and allows you to invoke and get logs from the endpoint. Local deployment is useful for testing and debugging before deployment to the cloud.
+ Local deployment means to deploy a model to a local Docker environment. Local deployment supports creation, update, and deletion of a local endpoint, and allows you to invoke and get logs from the endpoint. Local deployment is useful for testing and debugging before deployment to the cloud.

> [!TIP]
> You can also use the [Azure Machine Learning inference HTTP server Python package](how-to-inference-server-http.md) to debug your scoring script locally. Debugging with the inference server helps you to debug the scoring script before deploying to local endpoints so that you can debug without being affected by the deployment container configurations.
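To illustrate the local-deployment workflow this hunk describes, a minimal Azure CLI v2 sketch follows; the YAML file names, endpoint name, and deployment name are placeholder assumptions, not values from this PR.

```azurecli
# Create a local endpoint and deployment in the local Docker environment.
az ml online-endpoint create --local -f endpoint.yml
az ml online-deployment create --local -f blue-deployment.yml --endpoint-name my-endpoint

# Invoke the local endpoint, then pull its logs for debugging.
az ml online-endpoint invoke --local --name my-endpoint --request-file sample-request.json
az ml online-deployment get-logs --local --name blue --endpoint-name my-endpoint
```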
@@ -715,7 +715,7 @@ The following table contains common error codes when REST requests consume Kuber
| 409 | Conflict error | When an operation is already in progress, any new operation on that same online endpoint responds with a 409 conflict error. For example, if a create or update online endpoint operation is in progress, triggering a new delete operation throws an error. |
| 502 | Exception or crash in the `run()` method of the *score.py* file | When there's an error in *score.py*, for example an imported package that doesn't exist in the conda environment, a syntax error, or a failure in the `init()` method, see [ERROR: ResourceNotReady](#error-resourcenotready) to debug the file.|
- | 503 | Large spikes in requests per second | The autoscaler is designed to handle gradual changes in load. If you receive large spikes in requests per second, clients might receive HTTP status code 503. Even though the autoscaler reacts quickly, it takes AKS a significant amount of time to create more containers. See [How to prevent 503 status codes](#how-to-prevent-503-status-codes).|
+ | 503 | Large spikes in requests per second | The autoscaler is designed to handle gradual changes in load. If you receive large spikes in requests per second, clients might receive HTTP status code 503. Even though the autoscaler reacts quickly, it takes AKS a significant amount of time to create more containers. See [How to prevent 503 status code errors](#how-to-prevent-503-status-code-errors).|
| 504 | Request times out | A 504 status code indicates that the request timed out. The default timeout setting is 5 seconds. You can increase the timeout or try to speed up the endpoint by modifying *score.py* to remove unnecessary calls. If these actions don't correct the problem, the code might be in a nonresponsive state or an infinite loop. Follow [ERROR: ResourceNotReady](#error-resourcenotready) to debug the *score.py* file. |
| 500 | Internal server error | Azure Machine Learning-provisioned infrastructure is failing.|
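To see which of these status codes a scoring request actually returns, one possible check is to call the endpoint over REST and print the HTTP status code. The endpoint name below is a placeholder, and the `primaryKey` query path assumes key-based authentication:

```azurecli
# Look up the scoring URI and a key for the endpoint (placeholder name).
SCORING_URI=$(az ml online-endpoint show --name my-endpoint --query scoring_uri -o tsv)
PRIMARY_KEY=$(az ml online-endpoint get-credentials --name my-endpoint --query primaryKey -o tsv)

# Send a scoring request and print the HTTP status code that comes back.
curl --silent --output /dev/null --write-out "HTTP status: %{http_code}\n" \
  --request POST "$SCORING_URI" \
  --header "Authorization: Bearer $PRIMARY_KEY" \
  --header "Content-Type: application/json" \
  --data @sample-request.json
```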
@@ -781,7 +781,7 @@ To debug conda installation problems, try the following steps:
1. Install the mlflow conda file locally with the command `conda env create -n userenv -f <CONDA_ENV_FILENAME>`.
1. If there are errors locally, try resolving the conda environment and creating a functional one before redeploying.
1. If the container crashes even if it resolves locally, the SKU size used for deployment might be too small.
-   - Conda package installation occurs at runtime, so if the SKU size is too small to accommodate all the packages detailed in the *conda.yml* environment file, the container might crash.
+   - Conda package installation occurs at runtime, so if the SKU size is too small to accommodate all the packages in the *conda.yml* environment file, the container might crash.
  - A Standard_F4s_v2 VM is a good starting SKU size, but you might need larger VMs depending on the dependencies the conda file specifies.
  - For Kubernetes online endpoints, the Kubernetes cluster must have a minimum of four vCPU cores and 8 GB of memory.
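A short shell sketch of the local check in step 1 of the hunk above; the environment name and file name are placeholders, and the `mlflow` import is only one example of verifying a key package:

```bash
# Try to resolve and build the environment locally; failures here usually
# reproduce the conda errors seen in the deployment container logs.
conda env create -n userenv -f conda.yml

# Optionally confirm that key packages import before redeploying.
conda run -n userenv python -c "import mlflow; print(mlflow.__version__)"

# Clean up the throwaway test environment.
conda env remove -n userenv
```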
articles/machine-learning/includes/machine-learning-online-endpoint-troubleshooting.md (2 additions, 2 deletions)
@@ -9,7 +9,7 @@ ms.author: larryfr
### Online endpoint creation fails with a V1LegacyMode == true message

- You can configure the Azure Machine Learning workspace for `v1_legacy_mode`, which disables v2 APIs. Managed online endpoints are a feature of the v2 API platform, and won't work if `v1_legacy_mode` is enabled for the workspace.
+ You can configure the Azure Machine Learning workspace for `v1_legacy_mode`, which disables v2 APIs. Managed online endpoints are a feature of the v2 API platform, and don't work if `v1_legacy_mode` is enabled for the workspace.

To disable `v1_legacy_mode`, see [Network isolation with v2](../how-to-configure-network-isolation-with-v2.md).
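One way to check whether the workspace currently has legacy mode turned on is to read its ARM resource; the `properties.v1LegacyMode` path used here is an assumed property name, not something confirmed by this PR:

```azurecli
# Report whether v1 legacy mode is enabled on the workspace (placeholder names).
# The properties.v1LegacyMode query path is an assumption about the ARM schema.
az resource show \
  --resource-group my-resource-group \
  --name my-workspace \
  --resource-type Microsoft.MachineLearningServices/workspaces \
  --query properties.v1LegacyMode
```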
@@ -75,7 +75,7 @@ If the value of `bypass` isn't `AzureServices`, use the guidance in the [Configu
#### Managed online endpoints

- 1. Use the following command to check whether an A record exists in the private DNS zone for the virtual network.
+ 1. Use the following command to check whether an A record exists in the private Domain Name Server (DNS) zone for the virtual network.

   ```azurecli
   az network private-dns record-set list -z privatelink.api.azureml.ms -o tsv --query [].name
   ```
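If the workspace also uses the notebooks private DNS zone, a similar listing can be run against it; treating that zone as relevant here is an assumption about the network setup rather than part of the quoted change:

```azurecli
# List A record names in the notebooks private DNS zone as well.
# Depending on CLI defaults, the zone's resource group may need to be passed with -g.
az network private-dns record-set list -z privatelink.notebooks.azure.net -o tsv --query [].name
```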