Online endpoints are used for online (real-time) inferencing.
### Define an endpoint
The following table lists the key attributes and some optional ones to specify when you define an endpoint.
| Attribute | Description |
| --- | --- |
| Name | The name of the endpoint. It must be unique in the Azure region. For more information on the naming rules, see [managed online endpoint limits](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints). |
| Authentication mode | The authentication method for the endpoint. Choose between key-based authentication `key` and Azure Machine Learning token-based authentication `aml_token`. A key doesn't expire, but a token does. For more information on authentication, see [Authenticate to an online endpoint](how-to-authenticate-online-endpoint.md). |
| Description (optional) | Description of the endpoint. |
| Tags (optional) | Dictionary of tags for the endpoint. |
| Traffic (optional) | Rules on how to route traffic across deployments. Represent the traffic as a dictionary of key-value pairs, where the key is the deployment name and the value is the percentage of traffic to route to that deployment. You can set the traffic only after the deployments under an endpoint have been created, and you can also update it afterward. For more information, see [Allocate a small percentage of live traffic to the new deployment](#allocate-a-small-percentage-of-live-traffic-to-the-new-deployment). |
| Mirror traffic (optional) | Percentage of live traffic to mirror to a deployment. For more information on how to use mirrored traffic, see [Test the deployment with mirrored traffic (preview)](#test-the-deployment-with-mirrored-traffic-preview). |
To see a full list of attributes that you can specify when you create an endpoint, see [CLI (v2) online endpoint YAML schema](/azure/machine-learning/reference-yaml-endpoint-online) and [ManagedOnlineEndpoint Class (SDK v2)](/python/api/azure-ai-ml/azure.ai.ml.entities.managedonlineendpoint?view=azure-python).
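Pulling the attributes above together, a minimal CLI (v2) YAML endpoint definition might look like the following sketch. The endpoint name, tags, and the `blue`/`green` deployment names in the traffic split are hypothetical, not from this article:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-sample-endpoint       # must be unique in the Azure region
auth_mode: key                 # or aml_token for token-based authentication
description: Sample endpoint for a safe rollout
tags:
  owner: ml-team
# Traffic can be set only after the deployments under the endpoint exist:
# traffic:
#   blue: 90
#   green: 10
```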
### Define a deployment
A *deployment* is a set of resources required for hosting the model that does the actual inferencing. The following table describes the key attributes of a deployment.
| Attribute | Description |
| --- | --- |
| Endpoint name | The name of the endpoint to create the deployment under. |
| Model | The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. In the example, a scikit-learn model does regression. |
| Code path | The path to the directory on the local development environment that contains all the Python source code for scoring the model. You can use nested directories and packages. |
| Scoring script | Python code that executes the model on a given input request. The scoring script receives data submitted to a deployed web service and passes it to the model. The script then executes the model and returns its response to the client. The scoring script is specific to your model and must understand the data that the model expects as input and returns as output. In this example, it's a *score.py* file. This Python code must have an `init()` function and a `run()` function. The `init()` function is called after the model is created or updated (you can use it to cache the model in memory, for example). The `run()` function is called at every invocation of the endpoint to do the actual scoring and prediction. You provide the relative path to the scoring file in the source code directory. |
| Environment | The environment to host the model and code. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. The environment can be a Docker image with Conda dependencies, a Dockerfile, or a registered environment. |
| Instance type | The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](reference-managed-online-endpoints-vm-sku-list.md). |
| Instance count | The number of instances to use for the deployment. Base the value on the workload you expect. For high availability, we recommend that you set the value to at least `3`. We reserve an extra 20% for performing upgrades. For more information, see [managed online endpoint quotas](how-to-manage-quotas.md#azure-machine-learning-managed-online-endpoints). |
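The `init()`/`run()` contract for the scoring script can be sketched as follows. This is a hypothetical minimal *score.py*: the stand-in model and the commented-out `joblib` load (with an assumed `model.pkl` file name) are illustrations, not this article's actual model.

```python
import json
import os

model = None

def init():
    """Called once when the deployment starts; cache the model in memory."""
    global model
    # In Azure ML, the AZUREML_MODEL_DIR environment variable points to the
    # deployed model files, for example:
    # model = joblib.load(os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl"))
    model = lambda rows: [sum(row) for row in rows]  # stand-in for a real regressor

def run(raw_data):
    """Called on every endpoint invocation; parse the request, score, return the result."""
    data = json.loads(raw_data)["data"]
    return model(data)
```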
To see a full list of attributes that you can specify when you create a deployment, see [CLI (v2) managed online deployment YAML schema](/azure/machine-learning/reference-yaml-deployment-managed-online) and [ManagedOnlineDeployment Class (SDK v2)](/python/api/azure-ai-ml/azure.ai.ml.entities.managedonlinedeployment?view=azure-python).
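The deployment attributes above might be sketched in CLI (v2) YAML as follows. The deployment and endpoint names, local paths, conda file, and base image are assumptions for illustration:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue                          # deployment name
endpoint_name: my-sample-endpoint   # endpoint to create the deployment under
model:
  path: ./model                     # inline model specification from local files
code_configuration:
  code: ./src                       # code path containing the scoring source
  scoring_script: score.py          # relative path inside the code directory
environment:                        # inline environment: image plus Conda dependencies
  conda_file: ./environment/conda.yaml
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
instance_type: Standard_DS3_v2
instance_count: 3                   # at least 3 recommended for high availability
```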