
Commit 0856e1f

Merge pull request #105681 from liakaz/liakaz/profiling_documentation
Profiling documentation
2 parents: 4b93b7a + 2432520

File tree

4 files changed (+137, -60 lines)


articles/machine-learning/concept-model-management-and-deployment.md

Lines changed: 5 additions & 4 deletions
@@ -66,6 +66,11 @@ Registered models are identified by name and version. Each time you register a m
You can't delete a registered model that is being used in an active deployment.
For more information, see the register model section of [Deploy models](how-to-deploy-and-where.md#registermodel).

+### Profile models
+
+Azure Machine Learning can help you understand the CPU and memory requirements of the service that will be created when you deploy your model. Profiling tests the service that runs your model and returns information such as the CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage.
+For more information, see the profiling section of [Deploy models](how-to-deploy-and-where.md#profilemodel).
+
### Package and debug models

Before deploying a model into production, it is packaged into a Docker image. In most cases, image creation happens automatically in the background during deployment. You can manually specify the image.
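
For context, the profiling workflow that the new "Profile models" section summarizes comes down to a single SDK call. A minimal sketch, assuming a workspace `ws`, a registered `model`, an `inference_config`, and a sample-request `input_dataset` already exist (the how-to article changed later in this commit shows how to build them):

```python
from azureml.core.model import Model

# Profile the service that would host the model; ws, model, inference_config,
# and input_dataset are assumed to exist already.
profile = Model.profile(ws,
                        'profile-name',   # any unique name for the profiling run
                        [model],
                        inference_config,
                        input_dataset=input_dataset)
profile.wait_for_completion(True)

# The details include observed CPU/memory usage and a sizing recommendation.
print(profile.get_details())
```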
@@ -74,10 +79,6 @@ If you run into problems with the deployment, you can deploy on your local devel

For more information, see [Deploy models](how-to-deploy-and-where.md#registermodel) and [Troubleshooting deployments](how-to-troubleshoot-deployment.md).

-### Validate and profile models
-
-Azure Machine Learning can use profiling to determine the ideal CPU and memory settings to use when deploying your model. Model validation happens as part of this process, using data that you supply for the profiling process.
-
### Convert and optimize models

Converting your model to [Open Neural Network Exchange](https://onnx.ai) (ONNX) may improve performance. On average, converting to ONNX can yield a 2x performance increase.

articles/machine-learning/how-to-deploy-and-where.md

Lines changed: 130 additions & 54 deletions
@@ -155,22 +155,16 @@ For more information on working with models trained outside Azure Machine Learni

<a name="target"></a>

-## Choose a compute target
-
-You can use the following compute targets, or compute resources, to host your web service deployment:
-
-[!INCLUDE [aml-compute-target-deploy](../../includes/aml-compute-target-deploy.md)]
-
## Single versus multi-model endpoints
Azure ML supports deploying single or multiple models behind a single endpoint.

Multi-model endpoints use a shared container to host multiple models. This helps to reduce overhead costs, improves utilization, and enables you to chain models together into ensembles. Models you specify in your deployment script are mounted and made available on the disk of the serving container - you can load them into memory on demand and score based on the specific model being requested at scoring time.

For an E2E example which shows how to use multiple models behind a single containerized endpoint, see [this example](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/deployment/deploy-multi-model).

-## Prepare deployment artifacts
+## Prepare to deploy

-To deploy the model, you need the following:
+To deploy the model as a service, you need the following components:

* **Entry script & source code dependencies**. This script accepts requests, scores the requests by using the model, and returns the results.

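For context on the multi-model endpoint behavior described earlier in this hunk (models mounted on the serving container's disk and loaded on demand), an entry script for such a deployment might look roughly like this. This is a sketch only; the model names, the joblib format, and the request layout are assumptions, not part of the commit:

```python
import json

import joblib
from azureml.core.model import Model


def init():
    global models
    # Each model named in the deployment is mounted under its own folder in the
    # serving container; Model.get_model_path resolves that folder by name.
    models = {
        name: joblib.load(Model.get_model_path(name))
        for name in ('model_a', 'model_b')  # hypothetical registered-model names
    }


def run(raw_data):
    request = json.loads(raw_data)
    # Score with whichever mounted model the caller asked for.
    model = models[request['model_name']]
    return model.predict(request['data']).tolist()
```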

@@ -183,11 +177,9 @@ To deploy the model, you need the following:
>
> An alternative that might work for your scenario is [batch prediction](how-to-use-parallel-run-step.md), which does provide access to data stores during scoring.

-* **Inference environment**. The base image with installed package dependencies required to run the model.
-
-* **Deployment configuration** for the compute target that hosts the deployed model. This configuration describes things like memory and CPU requirements needed to run the model.
+* **Inference configuration**. The inference configuration specifies the environment configuration, entry script, and other components needed to run the model as a service.

-These items are encapsulated into an *inference configuration* and a *deployment configuration*. The inference configuration references the entry script and other dependencies. You define these configurations programmatically when you use the SDK to perform the deployment. You define them in JSON files when you use the CLI.
+Once you have the necessary components, you can profile the service that will be created as a result of deploying your model to understand its CPU and memory requirements.

### <a id="script"></a> 1. Define your entry script and dependencies

@@ -263,33 +255,7 @@ These types are currently supported:
* `pyspark`
* Standard Python object

-To use schema generation, include the `inference-schema` package in your Conda environment file. For more information on this package, see [https://github.com/Azure/InferenceSchema](https://github.com/Azure/InferenceSchema).
-
-##### Example dependencies file
-
-The following YAML is an example of a Conda dependencies file for inference. Note that you must list azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service.
-
-```YAML
-name: project_environment
-dependencies:
-  - python=3.6.2
-  - scikit-learn=0.20.0
-  - pip:
-    # You must list azureml-defaults as a pip dependency
-    - azureml-defaults>=1.0.45
-    - inference-schema[numpy-support]
-```
-
-> [!IMPORTANT]
-> If your dependency is available through both Conda and pip (from PyPi), Microsoft recommends using the Conda version, as Conda packages typically come with pre-built binaries that make installation more reliable.
->
-> For more information, see [Understanding Conda and Pip](https://www.anaconda.com/understanding-conda-and-pip/).
->
-> To check if your dependency is available through Conda, use the `conda search <package-name>` command, or use the package indexes at [https://anaconda.org/anaconda/repo](https://anaconda.org/anaconda/repo) and [https://anaconda.org/conda-forge/repo](https://anaconda.org/conda-forge/repo).
-
-If you want to use automatic schema generation, your entry script must import the `inference-schema` packages.
-
-Define the input and output sample formats in the `input_sample` and `output_sample` variables, which represent the request and response formats for the web service. Use these samples in the input and output function decorators on the `run()` function. The following scikit-learn example uses schema generation.
+To use schema generation, include the `inference-schema` package in your dependencies file. For more information on this package, see [https://github.com/Azure/InferenceSchema](https://github.com/Azure/InferenceSchema). Define the input and output sample formats in the `input_sample` and `output_sample` variables, which represent the request and response formats for the web service. Use these samples in the input and output function decorators on the `run()` function. The following scikit-learn example uses schema generation.

##### Example entry script
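The entry script itself is unchanged by this commit and is therefore elided from the diff. For context, a scikit-learn entry script that uses schema generation looks roughly like the sketch below; the registered-model name and the sample shapes are assumptions:

```python
import joblib
import numpy as np
from azureml.core.model import Model
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType


def init():
    global model
    # 'sklearn_model' is an assumed registered-model name for this sketch.
    model = joblib.load(Model.get_model_path('sklearn_model'))


# Sample request and response; inference-schema uses them to generate the
# OpenAPI (Swagger) schema for the web service.
input_sample = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
output_sample = np.array([0])


@input_schema('data', NumpyParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    result = model.predict(data)
    return result.tolist()
```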

@@ -481,32 +447,60 @@ def run(request):
> pip install azureml-contrib-services
> ```

-### 2. Define your inference environment
-
-The inference configuration describes how to configure the model to make predictions. This configuration isn't part of your entry script. It references your entry script and is used to locate all the resources required by the deployment. It's used later, when you deploy the model.
-
-Inference configuration uses Azure Machine Learning environments to define the software dependencies needed for your deployment. Environments allow you to create, manage, and reuse the software dependencies required for training and deployment. The following example demonstrates loading an environment from your workspace and then using it with the inference configuration:
+### 2. Define your inference configuration
+
+The inference configuration describes how to set up the web service containing your model. It is not a part of your entry script. It references your entry script and is used to locate all the resources required by the deployment. It's used later, when you deploy the model.
+
+The inference configuration uses Azure Machine Learning environments to define the software dependencies needed for your deployment. Environments allow you to create, manage, and reuse the software dependencies required for training and deployment. You can create an environment from custom dependency files or use one of the curated Azure Machine Learning environments. The following YAML is an example of a Conda dependencies file for inference. Note that you must list azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service. If you want to use automatic schema generation, your entry script must also import the `inference-schema` packages.
+
+```YAML
+name: project_environment
+dependencies:
+  - python=3.6.2
+  - scikit-learn=0.20.0
+  - pip:
+    # You must list azureml-defaults as a pip dependency
+    - azureml-defaults>=1.0.45
+    - inference-schema[numpy-support]
+```
+
+> [!IMPORTANT]
+> If your dependency is available through both Conda and pip (from PyPi), Microsoft recommends using the Conda version, as Conda packages typically come with pre-built binaries that make installation more reliable.
+>
+> For more information, see [Understanding Conda and Pip](https://www.anaconda.com/understanding-conda-and-pip/).
+>
+> To check if your dependency is available through Conda, use the `conda search <package-name>` command, or use the package indexes at [https://anaconda.org/anaconda/repo](https://anaconda.org/anaconda/repo) and [https://anaconda.org/conda-forge/repo](https://anaconda.org/conda-forge/repo).
+
+You can use the dependencies file to create an environment object and save it to your workspace for future use:
+
+```python
+from azureml.core.environment import Environment
+
+myenv = Environment.from_conda_specification(name='myenv',
+                                             file_path='path-to-conda-specification-file')
+myenv.register(workspace=ws)
+```
+
+The following example demonstrates loading an environment from your workspace and then using it with the inference configuration:

```python
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig

-myenv = Environment.get(workspace=ws, name="myenv", version="1")
-inference_config = InferenceConfig(entry_script="x/y/score.py",
+myenv = Environment.get(workspace=ws, name='myenv', version='1')
+inference_config = InferenceConfig(entry_script='path-to-score.py',
                                    environment=myenv)
```

For more information on environments, see [Create and manage environments for training and deployment](how-to-use-environments.md).

-You can also directly specify the dependencies without using an environment. The following example demonstrates how to create an inference configuration that loads software dependencies from a Conda file:
-
-For more information on environments, see [Create and manage environments for training and deployment](how-to-use-environments.md).
-
For more information on inference configuration, see the [InferenceConfig](https://docs.microsoft.com/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py) class documentation.

For information on using a custom Docker image with an inference configuration, see [How to deploy a model using a custom Docker image](how-to-deploy-custom-docker-image.md).

-### CLI example of InferenceConfig
+#### CLI example of InferenceConfig

[!INCLUDE [inference config](../../includes/machine-learning-service-inference-config.md)]
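The new paragraph in this hunk also mentions curated Azure Machine Learning environments as an alternative to a custom Conda file. A minimal sketch of picking one up; the curated environment name shown is an assumption, so list the available environments first:

```python
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig

# Curated environments ship with the service; list what the workspace can see.
for name in Environment.list(workspace=ws):
    print(name)

# "AzureML-Minimal" is an assumed curated environment name for this sketch.
curated_env = Environment.get(workspace=ws, name='AzureML-Minimal')
inference_config = InferenceConfig(entry_script='path-to-score.py',
                                   environment=curated_env)
```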

@@ -524,7 +518,93 @@ In this example, the configuration specifies the following settings:

For information on using a custom Docker image with an inference configuration, see [How to deploy a model using a custom Docker image](how-to-deploy-custom-docker-image.md).

-### 3. Define your deployment configuration
+### <a id="profilemodel"></a> 3. Profile your model to determine resource utilization
+
+Once you have registered your model and prepared the other components necessary for its deployment, you can determine the CPU and memory that the deployed service will need. Profiling tests the service that runs your model and returns information such as the CPU usage, memory usage, and response latency. It also provides a recommendation for the CPU and memory based on resource usage.
+
+In order to profile your model, you will need:
+* A registered model.
+* An inference configuration based on your entry script and inference environment definition.
+* A single-column tabular dataset, where each row contains a string representing sample request data.
+
+> [!IMPORTANT]
+> At this point we only support profiling of services that expect their request data to be a string, for example: string serialized JSON, text, string serialized image, and so on. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.
+
+Below is an example of how you can construct an input dataset to profile a service that expects its incoming request data to contain serialized JSON. In this case, we created a dataset based on one hundred instances of the same request data content. In real-world scenarios, we suggest that you use larger datasets containing various inputs, especially if your model resource usage/behavior is input dependent.
+
+```python
+import json
+from azureml.core import Datastore
+from azureml.core.dataset import Dataset
+from azureml.data import dataset_type_definitions
+
+input_json = {'data': [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
+                       [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]]}
+# create a string that can be utf-8 encoded and
+# put in the body of the request
+serialized_input_json = json.dumps(input_json)
+dataset_content = []
+for i in range(100):
+    dataset_content.append(serialized_input_json)
+dataset_content = '\n'.join(dataset_content)
+file_name = 'sample_request_data.txt'
+f = open(file_name, 'w')
+f.write(dataset_content)
+f.close()
+
+# upload the txt file created above to the datastore and create a dataset from it
+data_store = Datastore.get_default(ws)
+data_store.upload_files(['./' + file_name], target_path='sample_request_data')
+datastore_path = [(data_store, 'sample_request_data' + '/' + file_name)]
+sample_request_data = Dataset.Tabular.from_delimited_files(
+    datastore_path, separator='\n',
+    infer_column_types=True,
+    header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)
+sample_request_data = sample_request_data.register(workspace=ws,
+                                                   name='sample_request_data',
+                                                   create_new_version=True)
+```
+
+Once you have the dataset containing sample request data ready, create an inference configuration. The inference configuration is based on the score.py file and the environment definition. The following example demonstrates how to create the inference configuration and run profiling:
+
+```python
+from azureml.core.model import InferenceConfig, Model
+from azureml.core.dataset import Dataset
+
+model = Model(ws, id=model_id)
+inference_config = InferenceConfig(entry_script='path-to-score.py',
+                                   environment=myenv)
+input_dataset = Dataset.get_by_name(workspace=ws, name='sample_request_data')
+profile = Model.profile(ws,
+                        'unique_name',
+                        [model],
+                        inference_config,
+                        input_dataset=input_dataset)
+
+profile.wait_for_completion(True)
+
+# see the result
+details = profile.get_details()
+```
+
+The following command demonstrates how to profile a model by using the CLI:
+
+```azurecli-interactive
+az ml model profile -g <resource-group-name> -w <workspace-name> --inference-config-file <path-to-inf-config.json> -m <model-id> --idi <input-dataset-id> -n <unique-name>
+```
+
+## Deploy to target
+
+Deployment uses the inference configuration and the deployment configuration to deploy the models. The deployment process is similar regardless of the compute target. Deploying to AKS is slightly different because you must provide a reference to the AKS cluster.
+
+### Choose a compute target
+
+You can use the following compute targets, or compute resources, to host your web service deployment:
+
+[!INCLUDE [aml-compute-target-deploy](../../includes/aml-compute-target-deploy.md)]
+
+### Define your deployment configuration

Before deploying your model, you must define the deployment configuration. *The deployment configuration is specific to the compute target that will host the web service.* For example, when you deploy a model locally, you must specify the port where the service accepts requests. The deployment configuration isn't part of your entry script. It's used to define the characteristics of the compute target that will host the model and entry script.
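To round out the reorganized "Deploy to target" section added above: the step that consumes the inference configuration and deployment configuration is `Model.deploy`. A minimal Azure Container Instances sketch; the service name is a placeholder and the CPU/memory values should come from the profiling recommendation:

```python
from azureml.core.model import Model
from azureml.core.webservice import AciWebservice

# Compute-target-specific deployment configuration; the resource sizes here are
# placeholders - use the values recommended by profiling.
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(workspace=ws,
                       name='my-service',  # placeholder service name
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=deployment_config)
service.wait_for_deployment(show_output=True)
print(service.state)
```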

@@ -544,10 +624,6 @@ The classes for local, Azure Container Instances, and AKS web services can be im
from azureml.core.webservice import AciWebservice, AksWebservice, LocalWebservice
```

-## Deploy to target
-
-Deployment uses the inference configuration and the deployment configuration to deploy the models. The deployment process is similar regardless of the compute target. Deploying to AKS is slightly different because you must provide a reference to the AKS cluster.
-
### Securing deployments with SSL

For more information on how to secure a web service deployment, see [Use SSL to secure a web service](how-to-secure-web-service.md#enable).

articles/machine-learning/how-to-deploy-custom-docker-image.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ This document is broken into two sections:
* The [Azure CLI](https://docs.microsoft.com/cli/azure/install-azure-cli?view=azure-cli-latest).
* The [CLI extension for Azure Machine Learning](reference-azure-machine-learning-cli.md).
* An [Azure Container Registry](/azure/container-registry) or other Docker registry that is accessible on the internet.
-* The steps in this document assume that you are familiar with creating and using an __inference configuration__ object as part of model deployment. For more information, see the "prepare to deploy" section of [Where to deploy and how](how-to-deploy-and-where.md#prepare-deployment-artifacts).
+* The steps in this document assume that you are familiar with creating and using an __inference configuration__ object as part of model deployment. For more information, see the "prepare to deploy" section of [Where to deploy and how](how-to-deploy-and-where.md#prepare-to-deploy).

## Create a custom base image

articles/machine-learning/tutorial-train-deploy-model-cli.md

Lines changed: 1 addition & 1 deletion
@@ -376,7 +376,7 @@ This command deploys a new service named `myservice`, using version 1 of the mod

The `inferenceConfig.yml` file provides information on how to use the model for inference. For example, it references the entry script (`score.py`) and software dependencies.

-For more information on the structure of this file, see the [Inference configuration schema](reference-azure-machine-learning-cli.md#inference-configuration-schema). For more information on entry scripts, see [Deploy models with the Azure Machine Learning](how-to-deploy-and-where.md#prepare-deployment-artifacts).
+For more information on the structure of this file, see the [Inference configuration schema](reference-azure-machine-learning-cli.md#inference-configuration-schema). For more information on entry scripts, see [Deploy models with Azure Machine Learning](how-to-deploy-and-where.md#prepare-to-deploy).

The `aciDeploymentConfig.yml` describes the deployment environment used to host the service. The deployment configuration is specific to the compute type that you use for the deployment. In this case, an Azure Container Instance is used. For more information, see the [Deployment configuration schema](reference-azure-machine-learning-cli.md#deployment-configuration-schema).
