Commit cbbdaab

Merge pull request #6155 from vizhur/main
Remove model package references
2 parents 116d12a + a2ff5cd commit cbbdaab

13 files changed: +26 −1152 lines

`.openpublishing.redirection.json` (20 additions, 0 deletions)

```diff
@@ -489,6 +489,26 @@
       "source_path": "articles/ai-foundry/model-inference/index.yml",
       "redirect_url": "../foundry-models/index",
       "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/concept-package-models.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/how-to-package-models-moe.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/how-to-package-models-app-service.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/how-to-package-models.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
     }
   ]
 }
```
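All four added redirect entries share the same shape: a repo-relative `source_path` and a site-absolute `redirect_url`. A quick sanity check of that shape can be sketched in Python (the `validate_redirect` helper below is hypothetical, not part of the repository's tooling):

```python
# Hypothetical sanity check for OpenPublishing redirect entries like the ones
# added in this commit; not part of the repository's actual build pipeline.
REQUIRED_KEYS = {"source_path", "redirect_url", "redirect_document_id"}

def validate_redirect(entry):
    """Return True if the entry has exactly the keys and value shapes used above."""
    if set(entry) != REQUIRED_KEYS:
        return False
    # Source paths in this file are repo-relative article paths.
    if not entry["source_path"].startswith("articles/"):
        return False
    return isinstance(entry["redirect_document_id"], bool)

new_entries = [
    {
        "source_path": f"articles/machine-learning/{name}.md",
        "redirect_url": "/azure/machine-learning/concept-endpoints",
        "redirect_document_id": False,
    }
    for name in (
        "concept-package-models",
        "how-to-package-models-moe",
        "how-to-package-models-app-service",
        "how-to-package-models",
    )
]

print(all(validate_redirect(e) for e in new_entries))  # prints True
```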

`articles/machine-learning/concept-endpoints.md` (3 additions, 5 deletions)

```diff
@@ -124,7 +124,6 @@ The following table shows a summary of the different features available to stand
 | Deployment types | Models | Models | Models and Pipeline components |
 | MLflow model deployment | No, only specific models in the catalog | Yes | Yes |
 | Custom model deployment | No, only specific models in the catalog | Yes, with scoring script | Yes, with scoring script |
-| Model package deployment <sup>2</sup> | Built-in | Yes (preview) | No |
 | Inference server <sup>3</sup> | Azure AI Model Inference API | - Azure Machine Learning Inferencing Server<br /> - Triton<br /> - Custom (using BYOC) | Batch Inference |
 | Compute resource consumed | None (serverless) | Instances or granular resources | Cluster instances |
 | Compute type | None (serverless) | Managed compute and Kubernetes | Managed compute and Kubernetes |
@@ -135,13 +134,12 @@ The following table shows a summary of the different features available to stand
 | Cost basis<sup>5</sup> | Per token | Per deployment: compute instances running | Per job: compute instanced consumed in the job (capped to the maximum number of instances of the cluster) |
 | Local testing of deployments | No | Yes | No |
 
-<sup>2</sup> Deploying MLflow models to endpoints without outbound internet connectivity or private networks requires [packaging the model](concept-package-models.md) first.
 
-<sup>3</sup> *Inference server* refers to the serving technology that takes requests, processes them, and creates responses. The inference server also dictates the format of the input and the expected outputs.
+<sup>2</sup> *Inference server* refers to the serving technology that takes requests, processes them, and creates responses. The inference server also dictates the format of the input and the expected outputs.
 
-<sup>4</sup> *Autoscaling* is the ability to dynamically scale up or scale down the deployment's allocated resources based on its load. Online and batch deployments use different strategies for autoscaling. While online deployments scale up and down based on the resource utilization (like CPU, memory, requests, etc.), batch endpoints scale up or down based on the number of jobs created.
+<sup>3</sup> *Autoscaling* is the ability to dynamically scale up or scale down the deployment's allocated resources based on its load. Online and batch deployments use different strategies for autoscaling. While online deployments scale up and down based on the resource utilization (like CPU, memory, requests, etc.), batch endpoints scale up or down based on the number of jobs created.
 
-<sup>5</sup> Both online and batch deployments charge by the resources consumed. In online deployments, resources are provisioned at deployment time. In batch deployment, resources aren't consumed at deployment time but at the time that the job runs. Hence, there's no cost associated with the batch deployment itself. Likewise, queued jobs don't consume resources either.
+<sup>4</sup> Both online and batch deployments charge by the resources consumed. In online deployments, resources are provisioned at deployment time. In batch deployment, resources aren't consumed at deployment time but at the time that the job runs. Hence, there's no cost associated with the batch deployment itself. Likewise, queued jobs don't consume resources either.
 
 ## Developer interfaces
 
```
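Most of this file's churn is mechanical renumbering of `<sup>` footnote markers after footnote 2 was deleted. A renumbering pass like that can be sketched as follows (this helper is illustrative only; the commit performed the renumbering by hand):

```python
import re

def renumber_footnotes(text, removed):
    """Shift every <sup>n</sup> marker with n above `removed` down by one.

    Illustrative sketch, not part of the docs pipeline.
    """
    def shift(match):
        n = int(match.group(1))
        return f"<sup>{n - 1}</sup>" if n > removed else match.group(0)
    return re.sub(r"<sup>(\d+)</sup>", shift, text)

print(renumber_footnotes("Cost basis<sup>5</sup>", removed=2))  # Cost basis<sup>4</sup>
```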
`articles/machine-learning/concept-package-models.md` (0 additions, 107 deletions)

This file was deleted.

`articles/machine-learning/how-to-deploy-mlflow-models-online-endpoints.md` (1 addition, 18 deletions)

````diff
@@ -36,8 +36,6 @@ For no-code-deployment, Azure Machine Learning:
 * The [`mlflow-skinny`](https://github.com/mlflow/mlflow/blob/master/skinny/README_SKINNY.md) package
 * A scoring script for inferencing
 
-[!INCLUDE [mlflow-model-package-for-workspace-without-egress](includes/mlflow-model-package-for-workspace-without-egress.md)]
-
 ## Prerequisites
 
 - An Azure subscription. If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/) before you begin.
@@ -419,19 +417,6 @@ version = registered_model.version
 )
 ```
 
-Alternatively, if your endpoint doesn't have egress connectivity, use [model packaging (preview)](how-to-package-models.md) by including the argument `with_package=True`:
-
-```python
-blue_deployment = ManagedOnlineDeployment(
-    name="blue",
-    endpoint_name=endpoint_name,
-    model=model,
-    instance_type="Standard_F4s_v2",
-    instance_count=1,
-    with_package=True,
-)
-```
-
 # [Python (MLflow SDK)](#tab/mlflow)
 
 ```python
@@ -473,10 +458,8 @@ version = registered_model.version
 
 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-managed-online-endpoint-ncd.sh" ID="create_sklearn_deployment":::
 
-If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the flag `--package-model`:
-
 ```azurecli
-az ml online-deployment create --package-model --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic
+az ml online-deployment create --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic
 ```
 
 # [Python (Azure Machine Learning SDK)](#tab/sdk)
````
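The CLI change in this file is just the removal of the `--package-model` flag from an otherwise unchanged command. The transformation can be sketched generically (the `drop_flag` helper is hypothetical, shown only to illustrate the one-flag change):

```python
import shlex

def drop_flag(command, flag):
    """Remove a bare boolean flag from a shell command string.

    Hypothetical helper illustrating the removal of --package-model above.
    """
    return " ".join(token for token in shlex.split(command) if token != flag)

old = ("az ml online-deployment create --package-model --name sklearn-deployment "
       "--endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic")
print(drop_flag(old, "--package-model"))
```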

`articles/machine-learning/how-to-deploy-mlflow-models-online-progressive.md` (0 additions, 13 deletions)

````diff
@@ -319,19 +319,6 @@ So far, the endpoint is empty. There are no deployments on it. Let's create the
     instance_count=1,
 )
 ```
-
-If your endpoint doesn't have egress connectivity, use [model packaging (preview)](how-to-package-models.md) by including the argument `with_package=True`:
-
-```python
-blue_deployment = ManagedOnlineDeployment(
-    name=blue_deployment_name,
-    endpoint_name=endpoint_name,
-    model=model,
-    instance_type="Standard_DS2_v2",
-    instance_count=1,
-    with_package=True,
-)
-```
 
 # [Python (MLflow SDK)](#tab/mlflow)
 
````
`articles/machine-learning/how-to-deploy-mlflow-models.md` (2 additions, 5 deletions)

```diff
@@ -32,8 +32,6 @@ For no-code deployment, Azure Machine Learning:
 - Packages required for Azure Machine Learning to perform inference, including [`mlflow-skinny`](https://github.com/mlflow/mlflow/blob/master/skinny/README_SKINNY.md).
 - A scoring script to perform inference.
 
-[!INCLUDE [mlflow-model-package-for-workspace-without-egress](includes/mlflow-model-package-for-workspace-without-egress.md)]
-
 ### Packages and dependencies
 
 Azure Machine Learning automatically generates environments to run inference on MLflow models. To build the environments, Azure Machine Learning reads the conda dependencies that are specified in the MLflow model and adds any packages that are required to run the inferencing server. These extra packages vary depending on deployment type.
@@ -238,11 +236,10 @@ Each tool has different capabilities, particularly for which type of compute it
 | Deploy to web services like Azure Container Instances or Azure Kubernetes Service (AKS) | Legacy support<sup>2</sup> | Not supported<sup>2</sup> |
 | Deploy to web services like Container Instances or AKS with a scoring script | Not supported<sup>3</sup> | Legacy support<sup>2</sup> |
 
-<sup>1</sup> Deployment to online endpoints that are in workspaces with private link enabled requires you to [package models before deployment (preview)](how-to-package-models.md).
 
-<sup>2</sup> Switch to [managed online endpoints](concept-endpoints.md) if possible.
+<sup>1</sup> Switch to [managed online endpoints](concept-endpoints.md) if possible.
 
-<sup>3</sup> Open-source MLflow doesn't have the concept of a scoring script and doesn't support batch execution.
+<sup>2</sup> Open-source MLflow doesn't have the concept of a scoring script and doesn't support batch execution.
 
 ### Choose a deployment tool
 
```
0 commit comments