
Commit d02a5b4

Larry Franks committed
trying to get conversational tone a higher score
1 parent 36ec1c2 commit d02a5b4

File tree

1 file changed (+12 -12 lines)


articles/machine-learning/how-to-troubleshoot-deployment.md

Lines changed: 12 additions & 12 deletions
@@ -19,7 +19,7 @@ Learn how to troubleshoot and solve, or work around, common Docker deployment er
 
 ## Prerequisites
 
-* An **Azure subscription**. If you do not have one, try the [free or paid version of Azure Machine Learning](https://aka.ms/AMLFree).
+* An **Azure subscription**. Try the [free or paid version of Azure Machine Learning](https://aka.ms/AMLFree).
 * The [Azure Machine Learning SDK](https://docs.microsoft.com/python/api/overview/azure/ml/install?view=azure-ml-py&preserve-view=true).
 * The [Azure CLI](https://docs.microsoft.com/cli/azure/install-azure-cli?view=azure-cli-latest).
 * The [CLI extension for Azure Machine Learning](reference-azure-machine-learning-cli.md).
@@ -29,14 +29,12 @@ Learn how to troubleshoot and solve, or work around, common Docker deployment er
 
 ## Steps for Docker deployment of machine learning models
 
-When deploying a model in Azure Machine Learning, the system performs a number of tasks.
-
-The recommended approach for model deployment is via the [Model.deploy()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.model%28class%29?view=azure-ml-py#&preserve-view=truedeploy-workspace--name--models--inference-config-none--deployment-config-none--deployment-target-none--overwrite-false-) API using an [Environment](how-to-use-environments.md) object as an input parameter. In this case, the service creates a base docker image during deployment stage and mounts the required models all in one call. The basic deployment tasks are:
+When deploying a model in Azure Machine Learning, you use the [Model.deploy()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.model%28class%29?view=azure-ml-py#&preserve-view=truedeploy-workspace--name--models--inference-config-none--deployment-config-none--deployment-target-none--overwrite-false-) API and an [Environment](how-to-use-environments.md) object. The service creates a base Docker image during the deployment stage and mounts the required models, all in one call. The basic deployment tasks are:
 
 1. Register the model in the workspace model registry.
 
 2. Define Inference Configuration:
-    1. Create an [Environment](how-to-use-environments.md) object based on the dependencies you specify in the environment yaml file or use one of our procured environments.
+    1. Create an [Environment](how-to-use-environments.md) object. This object can use the dependencies in an environment YAML file or one of our curated environments.
     2. Create an inference configuration (InferenceConfig object) based on the environment and the scoring script.
 
 3. Deploy the model to Azure Container Instance (ACI) service or to Azure Kubernetes Service (AKS).
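
Taken together, these steps map onto a few SDK calls. Here is a minimal sketch, assuming an existing workspace object `ws`, an already registered model `model`, a scoring script `score.py`, and a conda file `environment.yml` (all hypothetical names used for illustration):

```python
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

# 2a. Build an Environment from a conda specification file.
env = Environment.from_conda_specification(name="my-env", file_path="environment.yml")

# 2b. Combine the environment with the scoring script.
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# 3. Deploy to Azure Container Instances (ACI).
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
aci_service = Model.deploy(ws, "my-service", [model], inference_config, deployment_config)
aci_service.wait_for_deployment(show_output=True)
```
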
@@ -47,7 +45,7 @@ Learn more about this process in the [Model Management](concept-model-management
 
 If you run into any issue, the first thing to do is to break down the deployment task (previously described) into individual steps to isolate the problem.
 
-Assuming you are using the new/recommended deployment method via [Model.deploy()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.model%28class%29?view=azure-ml-py#&preserve-view=truedeploy-workspace--name--models--inference-config-none--deployment-config-none--deployment-target-none--overwrite-false-) API with an [Environment](how-to-use-environments.md) object as an input parameter, your code can be broken down into three major steps:
+When using [Model.deploy()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.model%28class%29?view=azure-ml-py#&preserve-view=truedeploy-workspace--name--models--inference-config-none--deployment-config-none--deployment-target-none--overwrite-false-) with an [Environment](how-to-use-environments.md) object as an input parameter, your code can be broken down into three major steps:
 
 1. Register the model. Here is some sample code:
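
A minimal sketch of this registration step, assuming the model is serialized locally as `model.pkl` and a workspace `config.json` is present (both hypothetical):

```python
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()  # loads the workspace from config.json

# Upload the local model file and register it in the workspace model registry.
model = Model.register(workspace=ws,
                       model_path="model.pkl",  # local file to upload
                       model_name="my-model")   # name in the registry
```
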
@@ -90,11 +88,11 @@ Assuming you are using the new/recommended deployment method via [Model.deploy()
 aci_service.wait_for_deployment(show_output=True)
 ```
 
-Once you have broken down the deployment process into individual tasks, we can look at some of the most common errors.
+Breaking the deployment process into individual tasks makes it easier to identify some of the more common errors.
 
 ## Debug locally
 
-If you encounter problems deploying a model to ACI or AKS, try deploying it as a local web service. Using a local web service makes it easier to troubleshoot problems. The Docker image containing the model is downloaded and started on your local system.
+If you have problems when deploying a model to ACI or AKS, deploy it as a local web service. Using a local web service makes it easier to troubleshoot problems.
 
 You can find a sample [local deployment notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local.ipynb) in the [MachineLearningNotebooks](https://github.com/Azure/MachineLearningNotebooks) repo to explore a runnable example.
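
A minimal local-deployment sketch, reusing the `ws`, `model`, and `inference_config` objects from the steps above (the port number is an arbitrary choice):

```python
from azureml.core.model import Model
from azureml.core.webservice import LocalWebservice

# Run the service in a Docker container on your local machine instead of ACI/AKS.
deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, "local-test", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.port)
```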

@@ -123,9 +121,9 @@ service.wait_for_deployment(True)
 print(service.port)
 ```
 
-If you are defining your own conda specification YAML, you must list azureml-defaults with version >= 1.0.45 as a pip dependency. This package contains the functionality needed to host the model as a web service.
+If you are defining your own conda specification YAML, list azureml-defaults version >= 1.0.45 as a pip dependency. This package is needed to host the model as a web service.
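
If you would rather build that specification in code than hand-edit YAML, here is a sketch using the SDK's `CondaDependencies` helper (the environment name is illustrative):

```python
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.environment import Environment

# azureml-defaults provides the functionality that hosts the model as a web service.
conda_dep = CondaDependencies()
conda_dep.add_pip_package("azureml-defaults>=1.0.45")

env = Environment(name="my-env")
env.python.conda_dependencies = conda_dep
```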
 
-At this point, you can work with the service as normal. For example, the following code demonstrates sending data to the service:
+At this point, you can work with the service as normal. The following code demonstrates sending data to the service:
 
 ```python
 import json
@@ -184,7 +182,7 @@ You can address the error by increasing the value of `memory_gb` in `deployment_
 
 ## Container cannot be scheduled
 
-When deploying a service to an Azure Kubernetes Service compute target, Azure Machine Learning will attempt to schedule the service with the requested amount of resources. If after 5 minutes, there are no nodes available in the cluster with the appropriate amount of resources available, the deployment will fail with the message `Couldn't Schedule because the kubernetes cluster didn't have available resources after trying for 00:05:00`. You can address this error by either adding more nodes, changing the SKU of your nodes or changing the resource requirements of your service.
+When deploying a service to an Azure Kubernetes Service compute target, Azure Machine Learning will attempt to schedule the service with the requested amount of resources. If there are no nodes available in the cluster with the appropriate amount of resources after 5 minutes, the deployment will fail. The failure message is `Couldn't Schedule because the kubernetes cluster didn't have available resources after trying for 00:05:00`. You can address this error by either adding more nodes, changing the SKU of your nodes, or changing the resource requirements of your service.
 
 The error message will typically indicate which resource you need more of - for instance, if you see an error message indicating `0/3 nodes are available: 3 Insufficient nvidia.com/gpu`, that means that the service requires GPUs and there are three nodes in the cluster that do not have available GPUs. This could be addressed by adding more nodes if you are using a GPU SKU, switching to a GPU-enabled SKU if you are not, or changing your environment to not require GPUs.
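
For the last option, changing the resource requirements of the service, here is a sketch of requesting a smaller footprint at deployment time (the values are illustrative; `ws`, `model`, `inference_config`, and `aks_target` are assumed to exist from earlier steps):

```python
from azureml.core.model import Model
from azureml.core.webservice import AksWebservice

# Request fewer resources so the service can be scheduled on the available nodes.
deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=2)
service = Model.deploy(ws, "aks-service", [model], inference_config,
                       deployment_config, deployment_target=aks_target)
service.wait_for_deployment(show_output=True)
```
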
@@ -284,7 +282,9 @@ You can increase the timeout or try to speed up the service by modifying the sco
 
 ## Advanced debugging
 
-In some cases, you may need to interactively debug the Python code contained in your model deployment. For example, if the entry script is failing and the reason cannot be determined by additional logging. By using Visual Studio Code and the debugpy, you can attach to the code running inside the Docker container. For more information, visit the [interactive debugging in VS Code guide](how-to-debug-visual-studio-code.md#debug-and-troubleshoot-deployments).
+You may need to interactively debug the Python code contained in your model deployment; for example, if the entry script is failing and the reason cannot be determined by additional logging. By using Visual Studio Code and debugpy, you can attach to the code running inside the Docker container.
+
+For more information, visit the [interactive debugging in VS Code guide](how-to-debug-visual-studio-code.md#debug-and-troubleshoot-deployments).
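
As a rough illustration of the attach pattern that guide covers (not its exact code), the entry script can listen for a debugger before scoring starts; the port is an arbitrary choice and must be exposed by the container:

```python
# At the top of the entry (scoring) script:
import debugpy

debugpy.listen(("0.0.0.0", 5678))  # listen for an attach request from VS Code
debugpy.wait_for_client()          # block until the debugger attaches
```
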
 
 ## Next steps
 