This article shows how to profile a machine learning model to determine how much CPU and memory you need to allocate for the model when deploying it as a web service.
> [!IMPORTANT]
> This article applies to CLI v1 and SDK v1. This profiling technique isn't available for v2 of either CLI or SDK.
This article assumes you've trained and registered a model with Azure Machine Learning. See the [sample tutorial here](how-to-train-scikit-learn.md) for an example of training and registering a scikit-learn model with Azure Machine Learning.
## Limitations
* Profiling doesn't work when the Azure Container Registry (ACR) for your workspace is behind a virtual network.
## Run the profiler
Once you've registered your model and prepared the other components necessary for its deployment, you can determine the CPU and memory the deployed service needs. Profiling tests the service that runs your model and returns information such as CPU usage, memory usage, and response latency. It also provides a recommendation for the CPU and memory based on resource usage.
To profile your model, you need:
* A registered model.
* An inference configuration based on your entry script and inference environment definition.
* A single column tabular dataset, where each row contains a string representing sample request data.
> [!IMPORTANT]
> Azure Machine Learning only supports profiling of services that expect their request data to be a string, for example: string-serialized JSON, text, string-serialized image, etc. The content of each row of the dataset (a string) is put into the body of the HTTP request and sent to the service encapsulating the model for scoring.
> [!IMPORTANT]
> We only support profiling up to 2 CPUs in the ChinaEast2 and USGovArizona regions.
The following is an example of how you can construct an input dataset to profile a service that expects its incoming request data to contain serialized JSON. In this case, we created a dataset based on 100 instances of the same request data content. In real-world scenarios, we suggest that you use larger datasets containing various inputs, especially if your model's resource usage or behavior is input dependent.
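A minimal sketch of this step, assuming SDK v1 (azureml-core). The sample payload shape and the names `sample_request_data` and `my-profile` are hypothetical placeholders; the workspace-dependent registration and profiling calls are shown as comments because they require an Azure Machine Learning workspace.

```python
import json

# Hypothetical sample request payload; replace with data your model's
# entry script actually expects. The payload is string-serialized JSON.
input_json = {"data": [[1.0, 2.0, 3.0, 4.0]]}
serialized_input = json.dumps(input_json)

# Build 100 identical rows, one serialized request per line. In practice,
# use varied, representative inputs so the profiler observes realistic
# resource usage.
dataset_content = "\n".join([serialized_input] * 100)

with open("sample_request_data.txt", "w") as f:
    f.write(dataset_content)

# With SDK v1, upload the file to the workspace's default datastore,
# register it as a single-column tabular dataset, and pass it to
# Model.profile() -- sketched here as comments:
#
#   from azureml.core import Dataset, Workspace
#   from azureml.core.model import Model
#
#   ws = Workspace.from_config()
#   datastore = ws.get_default_datastore()
#   datastore.upload_files(["./sample_request_data.txt"],
#                          target_path="sample_request_data")
#   dataset = Dataset.Tabular.from_delimited_files(
#       (datastore, "sample_request_data/sample_request_data.txt"),
#       separator="\n", header=False, infer_column_types=True)
#   dataset = dataset.register(ws, name="sample_request_data",
#                              create_new_version=True)
#
#   profile = Model.profile(ws, "my-profile", [model], inference_config,
#                           input_dataset=dataset)
#   profile.wait_for_completion(show_output=True)
#   print(profile.get_details())
```

`Model.profile` runs the service against the dataset and returns the observed usage and recommended CPU/memory in the details it reports.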