Commit cbbdaab

Merge pull request #6155 from vizhur/main
Remove model package references
2 parents 116d12a + a2ff5cd commit cbbdaab

13 files changed: +26 −1152 lines

`.openpublishing.redirection.json` (20 additions, 0 deletions)

```diff
@@ -489,6 +489,26 @@
       "source_path": "articles/ai-foundry/model-inference/index.yml",
       "redirect_url": "../foundry-models/index",
       "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/concept-package-models.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/how-to-package-models-moe.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/how-to-package-models-app-service.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/how-to-package-models.md",
+      "redirect_url": "/azure/machine-learning/concept-endpoints",
+      "redirect_document_id": false
     }
   ]
 }
```
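All four added redirect entries share the same shape: a repo-relative `source_path` and a site-absolute `redirect_url`. A quick sanity check of that shape can be sketched in Python (the `validate_redirect` helper below is hypothetical, not part of the repository's tooling):

```python
# Hypothetical sanity check for OpenPublishing redirect entries like the ones
# added in this commit; not part of the repository's actual build pipeline.
REQUIRED_KEYS = {"source_path", "redirect_url", "redirect_document_id"}

def validate_redirect(entry):
    """Return True if the entry has exactly the keys and value shapes used above."""
    if set(entry) != REQUIRED_KEYS:
        return False
    # Source paths in this file are repo-relative article paths.
    if not entry["source_path"].startswith("articles/"):
        return False
    return isinstance(entry["redirect_document_id"], bool)

new_entries = [
    {
        "source_path": f"articles/machine-learning/{name}.md",
        "redirect_url": "/azure/machine-learning/concept-endpoints",
        "redirect_document_id": False,
    }
    for name in (
        "concept-package-models",
        "how-to-package-models-moe",
        "how-to-package-models-app-service",
        "how-to-package-models",
    )
]

print(all(validate_redirect(e) for e in new_entries))  # prints True
```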

`articles/machine-learning/concept-endpoints.md` (3 additions, 5 deletions)

```diff
@@ -124,7 +124,6 @@ The following table shows a summary of the different features available to stand
 | Deployment types | Models | Models | Models and Pipeline components |
 | MLflow model deployment | No, only specific models in the catalog | Yes | Yes |
 | Custom model deployment | No, only specific models in the catalog | Yes, with scoring script | Yes, with scoring script |
-| Model package deployment <sup>2</sup> | Built-in | Yes (preview) | No |
 | Inference server <sup>3</sup> | Azure AI Model Inference API | - Azure Machine Learning Inferencing Server<br /> - Triton<br /> - Custom (using BYOC) | Batch Inference |
 | Compute resource consumed | None (serverless) | Instances or granular resources | Cluster instances |
 | Compute type | None (serverless) | Managed compute and Kubernetes | Managed compute and Kubernetes |
@@ -135,13 +134,12 @@ The following table shows a summary of the different features available to stand
 | Cost basis<sup>5</sup> | Per token | Per deployment: compute instances running | Per job: compute instanced consumed in the job (capped to the maximum number of instances of the cluster) |
 | Local testing of deployments | No | Yes | No |
 
-<sup>2</sup> Deploying MLflow models to endpoints without outbound internet connectivity or private networks requires [packaging the model](concept-package-models.md) first.
 
-<sup>3</sup> *Inference server* refers to the serving technology that takes requests, processes them, and creates responses. The inference server also dictates the format of the input and the expected outputs.
+<sup>2</sup> *Inference server* refers to the serving technology that takes requests, processes them, and creates responses. The inference server also dictates the format of the input and the expected outputs.
 
-<sup>4</sup> *Autoscaling* is the ability to dynamically scale up or scale down the deployment's allocated resources based on its load. Online and batch deployments use different strategies for autoscaling. While online deployments scale up and down based on the resource utilization (like CPU, memory, requests, etc.), batch endpoints scale up or down based on the number of jobs created.
+<sup>3</sup> *Autoscaling* is the ability to dynamically scale up or scale down the deployment's allocated resources based on its load. Online and batch deployments use different strategies for autoscaling. While online deployments scale up and down based on the resource utilization (like CPU, memory, requests, etc.), batch endpoints scale up or down based on the number of jobs created.
 
-<sup>5</sup> Both online and batch deployments charge by the resources consumed. In online deployments, resources are provisioned at deployment time. In batch deployment, resources aren't consumed at deployment time but at the time that the job runs. Hence, there's no cost associated with the batch deployment itself. Likewise, queued jobs don't consume resources either.
+<sup>4</sup> Both online and batch deployments charge by the resources consumed. In online deployments, resources are provisioned at deployment time. In batch deployment, resources aren't consumed at deployment time but at the time that the job runs. Hence, there's no cost associated with the batch deployment itself. Likewise, queued jobs don't consume resources either.
 
 ## Developer interfaces
 
```
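Most of this file's churn is mechanical renumbering of `<sup>` footnote markers after footnote 2 was deleted. A renumbering pass like that can be sketched as follows (this helper is illustrative only; the commit performed the renumbering by hand):

```python
import re

def renumber_footnotes(text, removed):
    """Shift every <sup>n</sup> marker with n above `removed` down by one.

    Illustrative sketch, not part of the docs pipeline.
    """
    def shift(match):
        n = int(match.group(1))
        return f"<sup>{n - 1}</sup>" if n > removed else match.group(0)
    return re.sub(r"<sup>(\d+)</sup>", shift, text)

print(renumber_footnotes("Cost basis<sup>5</sup>", removed=2))  # Cost basis<sup>4</sup>
```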
`articles/machine-learning/concept-package-models.md` (0 additions, 107 deletions)

This file was deleted.

`articles/machine-learning/how-to-deploy-mlflow-models-online-endpoints.md` (1 addition, 18 deletions)

````diff
@@ -36,8 +36,6 @@ For no-code-deployment, Azure Machine Learning:
 * The [`mlflow-skinny`](https://github.com/mlflow/mlflow/blob/master/skinny/README_SKINNY.md) package
 * A scoring script for inferencing
 
-[!INCLUDE [mlflow-model-package-for-workspace-without-egress](includes/mlflow-model-package-for-workspace-without-egress.md)]
-
 ## Prerequisites
 
 - An Azure subscription. If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/) before you begin.
@@ -419,19 +417,6 @@ version = registered_model.version
 )
 ```
 
-Alternatively, if your endpoint doesn't have egress connectivity, use [model packaging (preview)](how-to-package-models.md) by including the argument `with_package=True`:
-
-```python
-blue_deployment = ManagedOnlineDeployment(
-    name="blue",
-    endpoint_name=endpoint_name,
-    model=model,
-    instance_type="Standard_F4s_v2",
-    instance_count=1,
-    with_package=True,
-)
-```
-
 # [Python (MLflow SDK)](#tab/mlflow)
 
 ```python
@@ -473,10 +458,8 @@ version = registered_model.version
 
 :::code language="azurecli" source="~/azureml-examples-main/cli/deploy-managed-online-endpoint-ncd.sh" ID="create_sklearn_deployment":::
 
-If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the flag `--package-model`:
-
 ```azurecli
-az ml online-deployment create --package-model --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic
+az ml online-deployment create --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic
 ```
 
 # [Python (Azure Machine Learning SDK)](#tab/sdk)
````
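The CLI change in this file is just the removal of the `--package-model` flag from an otherwise unchanged command. The transformation can be sketched generically (the `drop_flag` helper is hypothetical, shown only to illustrate the one-flag change):

```python
import shlex

def drop_flag(command, flag):
    """Remove a bare boolean flag from a shell command string.

    Hypothetical helper illustrating the removal of --package-model above.
    """
    return " ".join(token for token in shlex.split(command) if token != flag)

old = ("az ml online-deployment create --package-model --name sklearn-deployment "
       "--endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic")
print(drop_flag(old, "--package-model"))
```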

`articles/machine-learning/how-to-deploy-mlflow-models-online-progressive.md` (0 additions, 13 deletions)

````diff
@@ -319,19 +319,6 @@ So far, the endpoint is empty. There are no deployments on it. Let's create the
     instance_count=1,
 )
 ```
-
-If your endpoint doesn't have egress connectivity, use [model packaging (preview)](how-to-package-models.md) by including the argument `with_package=True`:
-
-```python
-blue_deployment = ManagedOnlineDeployment(
-    name=blue_deployment_name,
-    endpoint_name=endpoint_name,
-    model=model,
-    instance_type="Standard_DS2_v2",
-    instance_count=1,
-    with_package=True,
-)
-```
 
 # [Python (MLflow SDK)](#tab/mlflow)
 
````
`articles/machine-learning/how-to-deploy-mlflow-models.md` (2 additions, 5 deletions)

```diff
@@ -32,8 +32,6 @@ For no-code deployment, Azure Machine Learning:
 - Packages required for Azure Machine Learning to perform inference, including [`mlflow-skinny`](https://github.com/mlflow/mlflow/blob/master/skinny/README_SKINNY.md).
 - A scoring script to perform inference.
 
-[!INCLUDE [mlflow-model-package-for-workspace-without-egress](includes/mlflow-model-package-for-workspace-without-egress.md)]
-
 ### Packages and dependencies
 
 Azure Machine Learning automatically generates environments to run inference on MLflow models. To build the environments, Azure Machine Learning reads the conda dependencies that are specified in the MLflow model and adds any packages that are required to run the inferencing server. These extra packages vary depending on deployment type.
@@ -238,11 +236,10 @@ Each tool has different capabilities, particularly for which type of compute it
 | Deploy to web services like Azure Container Instances or Azure Kubernetes Service (AKS) | Legacy support<sup>2</sup> | Not supported<sup>2</sup> |
 | Deploy to web services like Container Instances or AKS with a scoring script | Not supported<sup>3</sup> | Legacy support<sup>2</sup> |
 
-<sup>1</sup> Deployment to online endpoints that are in workspaces with private link enabled requires you to [package models before deployment (preview)](how-to-package-models.md).
 
-<sup>2</sup> Switch to [managed online endpoints](concept-endpoints.md) if possible.
+<sup>1</sup> Switch to [managed online endpoints](concept-endpoints.md) if possible.
 
-<sup>3</sup> Open-source MLflow doesn't have the concept of a scoring script and doesn't support batch execution.
+<sup>2</sup> Open-source MLflow doesn't have the concept of a scoring script and doesn't support batch execution.
 
 ### Choose a deployment tool
 
```
0 commit comments