Merge pull request #1639 from JKirsch1/update-mlflow-configuration-article

Court72 · web-flow · commit 5b16e7e73b44 · 2024-12-19T10:26:54.000-07:00
Freshness - Machine Learning HowTo 180 days
diff --git a/articles/machine-learning/how-to-use-mlflow-configure-tracking.md b/articles/machine-learning/how-to-use-mlflow-configure-tracking.md
@@ -1,76 +1,91 @@
 ---
 title: Configure MLflow for Azure Machine Learning
 titleSuffix: Azure Machine Learning
-description:  Connect MLflow to Azure Machine Learning workspaces to log metrics, artifacts, and deploy models.
+description: Find out how to connect MLflow to an Azure Machine Learning workspace to log metrics, track artifacts, and deploy models.
 services: machine-learning
 author: msakande
 ms.author: mopeakande
 ms.reviewer: fasantia
 ms.service: azure-machine-learning
 ms.subservice: mlops
-ms.date: 01/19/2024
+ms.date: 11/20/2024
 ms.topic: how-to
 ms.custom: mlflow, cliv2, devplatv2
 ms.devlang: azurecli
+# customer intent: As a developer, I want to see how to configure MLflow so that I can run MLflow training routines in Azure Machine Learning.
 ---
 
 # Configure MLflow for Azure Machine Learning
 
-This article explains how you can configure MLflow to connect to an Azure Machine Learning workspace for tracking, registries, and deployment.
+This article explains how to configure MLflow to connect to an Azure Machine Learning workspace for tracking, registry management, and deployment.
 
-Azure Machine Learning workspaces are MLflow-compatible, which means they can act as an MLflow server without any extra configuration. Each workspace has an MLflow tracking URI that MLflow can use to connect to the workspace. Azure Machine Learning workspaces **are already configured to work with MLflow** so no extra configuration is required.
+Azure Machine Learning workspaces are MLflow-compatible, which means they can act as MLflow servers. Each workspace has an MLflow tracking URI that MLflow can use to connect to the workspace. Workspaces **are already configured to work with MLflow**, so no extra configuration is required.
 
-However, if you work outside of Azure Machine Learning (like your local machine, Azure Synapse Analytics, or Azure Databricks), you need to configure MLflow to point to the workspace.
+However, if you work outside Azure Machine Learning, you need to configure MLflow to point to the workspace. Affected environments include your local machine, Azure Synapse Analytics, and Azure Databricks.
 
 > [!IMPORTANT]
-> When running on Azure Compute (Azure Machine Learning Notebooks, Jupyter notebooks hosted on Azure Machine Learning compute instances, or jobs running on Azure Machine Learning compute clusters), you don't have to configure the tracking URI. **It's automatically configured for you**.
+> When you use Azure compute infrastructure, you don't have to configure the tracking URI. **It's automatically configured for you**. Environments with automatic configuration include Azure Machine Learning notebooks, Jupyter notebooks that are hosted on Azure Machine Learning compute instances, and jobs that run on Azure Machine Learning compute clusters.
 
 ## Prerequisites
 
-You need the following prerequisites to follow this tutorial:
+- The MLflow SDK `mlflow` package and the Azure Machine Learning `azureml-mlflow` plugin for MLflow. You can use the following command to install this software:
 
-[!INCLUDE [mlflow-prereqs](includes/machine-learning-mlflow-prereqs.md)]
+  ```bash
+  pip install mlflow azureml-mlflow
+  ```
 
+  > [!TIP]
+  > Instead of `mlflow`, consider using [`mlflow-skinny`](https://github.com/mlflow/mlflow/blob/master/README_SKINNY.rst). This package is a lightweight MLflow package without SQL storage, server, UI, or data science dependencies. It's recommended for users who primarily need MLflow tracking and logging capabilities but don't want to import the full suite of features, including deployments.
 
-## Configure MLflow tracking URI
+- An Azure Machine Learning workspace. To create a workspace, see [Create resources you need to get started](quickstart-create-resources.md).
 
-To connect MLflow to an Azure Machine Learning workspace, you need the tracking URI for the workspace. Each workspace has its own tracking URI and it has the protocol `azureml://`.
+- Access permissions for performing MLflow operations in your workspace. For a list of operations and required permissions, see [MLflow operations](how-to-assign-roles.md#mlflow-operations).
+
+## Configure the MLflow tracking URI
+
+To do remote tracking, that is, to track experiments that run outside Azure Machine Learning, configure MLflow to point to the tracking URI of your Azure Machine Learning workspace.
+
+To connect MLflow to an Azure Machine Learning workspace, you need the tracking URI of the workspace. Each workspace has its own tracking URI, which starts with the protocol `azureml://`.
 
 [!INCLUDE [mlflow-configure-tracking](includes/machine-learning-mlflow-configure-tracking.md)]
 
 ## Configure authentication
 
-Once the tracking is set, you also need to configure the authentication method for the associated workspace. By default, the Azure Machine Learning plugin for MLflow performs interactive authentication by opening the default browser to prompt for credentials.
+After you set up tracking, you also need to configure the authentication method for the associated workspace.
+
+By default, the Azure Machine Learning plugin for MLflow performs interactive authentication by opening the default browser to prompt for credentials. The plugin also supports several other authentication mechanisms through the `azure-identity` package, which is installed as a dependency of the `azureml-mlflow` plugin.
 
-The Azure Machine Learning plugin for MLflow supports several authentication mechanisms through the package `azure-identity`, which is installed as a dependency for the plugin `azureml-mlflow`. The following authentication methods are tried one by one until one of them succeeds:
+The authentication process tries the following methods, one after another, until one succeeds:
 
-1. __Environment__: Reads account information specified via environment variables and uses it to authenticate.
-1. __Managed Identity__: If the application is deployed to an Azure host with Managed Identity enabled, it authenticates with it.
-1. __Azure CLI__: If a user signs in via the Azure CLI `az login` command, it authenticates as that user.
-1. __Azure PowerShell__: If a user signs in via Azure PowerShell's `Connect-AzAccount` command, it authenticates as that user.
-1. __Interactive browser__: Interactively authenticates a user via the default browser.
+1. **Environment**: Account information that's specified via environment variables is read and used for authentication.
+1. **Managed identity**: If the application is deployed to an Azure host with a managed identity enabled, the managed identity is used for authentication.
+1. **Azure CLI**: If you use the Azure CLI `az login` command to sign in, your credentials are used for authentication.
+1. **Azure PowerShell**: If you use the Azure PowerShell `Connect-AzAccount` command to sign in, your credentials are used for authentication.
+1. **Interactive browser**: You're interactively authenticated via the default browser.
 
 [!INCLUDE [mlflow-configure-auth](includes/machine-learning-mlflow-configure-auth.md)]
 
-If you'd rather use a certificate instead of a secret, you can configure the environment variables `AZURE_CLIENT_CERTIFICATE_PATH` to the path to a `PEM` or `PKCS12` certificate file (including private key) and 
-`AZURE_CLIENT_CERTIFICATE_PASSWORD` with the password of the certificate file, if any.
+If you'd rather use a certificate than a secret, you can configure the following environment variables:
+
+- Set `AZURE_CLIENT_CERTIFICATE_PATH` to the path of a file that contains the certificate and private key pair in Privacy-Enhanced Mail (PEM) or Public-Key Cryptography Standards #12 (PKCS #12) format.
+- Set `AZURE_CLIENT_CERTIFICATE_PASSWORD` to the password of the certificate file, if it uses a password.
 
 ### Configure authorization and permission levels
 
-Some [default roles](how-to-assign-roles.md#default-roles) like *AzureML Data Scientist* or *Contributor* are already configured to perform MLflow operations in an Azure Machine Learning workspace. If using a custom role, you need the following permissions:
+Some [default roles](how-to-assign-roles.md#default-roles) like AzureML Data Scientist and Contributor are already configured to perform MLflow operations in an Azure Machine Learning workspace. If you use a custom role, you need the following permissions:
 
-* **To use MLflow tracking:** 
-    * `Microsoft.MachineLearningServices/workspaces/experiments/*`
-    * `Microsoft.MachineLearningServices/workspaces/jobs/*`
+- **To use MLflow tracking:**
+  - `Microsoft.MachineLearningServices/workspaces/experiments/*`
+  - `Microsoft.MachineLearningServices/workspaces/jobs/*`
 
-* **To use MLflow model registry:**
-    * `Microsoft.MachineLearningServices/workspaces/models/*/*`
+- **To use the MLflow model registry:**
+  - `Microsoft.MachineLearningServices/workspaces/models/*/*`
 
-To learn how to grant access for the service principal you created or user account to your workspace, see [Grant access](/azure/role-based-access-control/quickstart-assign-role-user-portal#grant-access).
+To see how to grant workspace access to a service principal that you create or to your user account, see [Grant access](/azure/role-based-access-control/quickstart-assign-role-user-portal#grant-access).
 
-### Troubleshooting authentication
+### Troubleshoot authentication issues
 
-MLflow tries to authenticate to Azure Machine Learning on the first operation that interacts with the service, like `mlflow.set_experiment()` or `mlflow.start_run()`. If you find issues or unexpected authentication prompts during the process, you can increase the logging level to get more details about the error:
+MLflow tries to authenticate to Azure Machine Learning on the first operation that interacts with the service, like `mlflow.set_experiment()` or `mlflow.start_run()`. If you experience issues or unexpected authentication prompts during the process, you can increase the logging level to get more details about the error:
 
 ```python
 import logging
@@ -80,44 +95,45 @@ logging.getLogger("azure").setLevel(logging.DEBUG)
 
 ## Set experiment name (optional)
 
-All MLflow runs are logged to the active experiment. By default, runs are logged to an experiment named `Default` that is automatically created for you. You can configure the experiment where tracking is happening.
+All MLflow runs are logged to the active experiment. By default, runs are logged to an experiment named `Default` that's automatically created for you. You can configure the experiment that's used for tracking.
 
 > [!TIP]
-> When submitting jobs using Azure Machine Learning CLI v2, you can set the experiment name using the property `experiment_name` in the YAML definition of the job. You don't have to configure it on your training script. See [YAML: display name, experiment name, description, and tags](reference-yaml-job-command.md#yaml-display-name-experiment-name-description-and-tags) for details.
+> When you use the Azure Machine Learning CLI v2 to submit jobs, you can set the experiment name by using the `experiment_name` property in the YAML definition of the job. You don't have to configure it in your training script. For more information, see [YAML: display name, experiment name, description, and tags](reference-yaml-job-command.md#yaml-display-name-experiment-name-description-and-tags).
 
 
 # [MLflow SDK](#tab/mlflow)
 
-Configure your experiment by using MLflow command [`mlflow.set_experiment()`](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_experiment).
+Use the MLflow [`mlflow.set_experiment()`](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_experiment) command to configure your experiment.
     
-```Python
-experiment_name = 'experiment_with_mlflow'
+```python
+import mlflow
+
+experiment_name = "experiment_with_mlflow"
 mlflow.set_experiment(experiment_name)
 ```
 
-# [Using environment variables](#tab/environ)
+# [Environment variables](#tab/environ)
 
-You can also set one of the MLflow environment variables [MLFLOW_EXPERIMENT_NAME or MLFLOW_EXPERIMENT_ID](https://mlflow.org/docs/latest/cli.html#cmdoption-mlflow-run-arg-uri) with the experiment name. 
+Use the MLflow `MLFLOW_EXPERIMENT_NAME` or `MLFLOW_EXPERIMENT_ID` environment variable to configure your experiment. For more information, see [Command-Line Interface](https://mlflow.org/docs/latest/cli.html) or [mlflow.start_run](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run).
 
 ```bash
 export MLFLOW_EXPERIMENT_NAME="experiment_with_mlflow"
 ```
 
 ---
 
-## Nonpublic Azure Clouds support
+## Configure support for a nonpublic Azure cloud
 
-The Azure Machine Learning plugin for MLflow is configured by default to work with the global Azure cloud. However, you can configure the Azure cloud you're using by setting the environment variable `AZUREML_CURRENT_CLOUD`.
+The Azure Machine Learning plugin for MLflow is configured by default to work with the global Azure cloud. However, you can configure the Azure cloud you're using by setting the `AZUREML_CURRENT_CLOUD` environment variable:
 
 # [MLflow SDK](#tab/mlflow)
 
-```Python
+```python
 import os
 
 os.environ["AZUREML_CURRENT_CLOUD"] = "AzureChinaCloud"
 ```
 
-# [Using environment variables](#tab/environ)
+# [Environment variables](#tab/environ)
 
 ```bash
 export AZUREML_CURRENT_CLOUD="AzureChinaCloud"
@@ -133,11 +149,11 @@ az cloud list
 
 The current cloud has the value `IsActive` set to `True`.
 
-## Next steps
+## Related content
 
 Now that your environment is connected to your workspace in Azure Machine Learning, you can start to work with it.
 
-- [Track ML experiments and models with MLflow](how-to-use-mlflow-cli-runs.md)
-- [Manage models registries in Azure Machine Learning with MLflow](how-to-manage-models-mlflow.md)
-- [Train with MLflow Projects (Preview)](how-to-train-mlflow-projects.md)
+- [Track experiments and models with MLflow](how-to-use-mlflow-cli-runs.md)
+- [Manage model registries in Azure Machine Learning with MLflow](how-to-manage-models-mlflow.md)
+- [Train with MLflow Projects in Azure Machine Learning (preview)](how-to-train-mlflow-projects.md)
 - [Guidelines for deploying MLflow models](how-to-deploy-mlflow-models.md)
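
To illustrate the certificate option described in the article text above, here is a minimal sketch that isn't part of this change. The helper name `configure_certificate_auth` and the placeholder values are hypothetical; only the environment variable names come from the article and its authentication include. The sketch sets variables and doesn't contact Azure.

```python
import os
from typing import Optional


def configure_certificate_auth(
    tenant_id: str,
    client_id: str,
    certificate_path: str,
    certificate_password: Optional[str] = None,
) -> None:
    """Set environment variables for certificate-based service principal auth.

    Hypothetical helper for illustration only.
    """
    os.environ["AZURE_TENANT_ID"] = tenant_id
    os.environ["AZURE_CLIENT_ID"] = client_id
    os.environ["AZURE_CLIENT_CERTIFICATE_PATH"] = certificate_path
    if certificate_password is not None:
        os.environ["AZURE_CLIENT_CERTIFICATE_PASSWORD"] = certificate_password
    else:
        # Remove a stale password so a file without one isn't opened with a leftover value.
        os.environ.pop("AZURE_CLIENT_CERTIFICATE_PASSWORD", None)


# Placeholder values; replace them with the details of your service principal.
configure_certificate_auth(
    tenant_id="<Azure-tenant-ID>",
    client_id="<Azure-client-ID>",
    certificate_path="/path/to/certificate.pem",
)
```

Run a sketch like this before the first MLflow call that reaches the service, because the article states that authentication happens on the first such operation.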
diff --git a/articles/machine-learning/includes/machine-learning-mlflow-configure-auth.md b/articles/machine-learning/includes/machine-learning-mlflow-configure-auth.md
@@ -2,38 +2,40 @@
 author: santiagxf
 ms.service: azure-machine-learning
 ms.topic: include
-ms.date: 08/16/2024
+ms.date: 11/20/2024
 ms.author: fasantia
 ---
 
-For interactive jobs where there's a user connected to the session, you can rely on Interactive Authentication and hence no further action is required.
+For interactive jobs where there's a user connected to the session, you can rely on interactive authentication. No further action is required.
 
 > [!WARNING]
-> *Interactive browser* authentication blocks code execution when it prompts for credentials. This approach isn't suitable for authentication in unattended environments like training jobs. We recommend that you configure a different authentication mode.
+> *Interactive browser* authentication blocks code execution when it prompts for credentials. This approach isn't suitable for authentication in unattended environments like training jobs. We recommend that you configure a different authentication mode in those environments.
 
-For those scenarios where unattended execution is required, you have to configure a service principal to communicate with Azure Machine Learning.
+For scenarios that require unattended execution, you need to configure a service principal to communicate with Azure Machine Learning. For information about creating a service principal, see [Configure a service principal](../how-to-setup-authentication.md#configure-a-service-principal).
+
+Use the tenant ID, client ID, and client secret of your service principal in the following code:
 
 # [MLflow SDK](#tab/mlflow)
 
 ```python
 import os
 
-os.environ["AZURE_TENANT_ID"] = "<AZURE_TENANT_ID>"
-os.environ["AZURE_CLIENT_ID"] = "<AZURE_CLIENT_ID>"
-os.environ["AZURE_CLIENT_SECRET"] = "<AZURE_CLIENT_SECRET>"
+os.environ["AZURE_TENANT_ID"] = "<Azure-tenant-ID>"
+os.environ["AZURE_CLIENT_ID"] = "<Azure-client-ID>"
+os.environ["AZURE_CLIENT_SECRET"] = "<Azure-client-secret>"
 ```
 
-# [Using environment variables](#tab/environ)
+# [Environment variables](#tab/environ)
 
 ```bash
-export AZURE_TENANT_ID="<AZURE_TENANT_ID>"
-export AZURE_CLIENT_ID="<AZURE_CLIENT_ID>"
-export AZURE_CLIENT_SECRET="<AZURE_CLIENT_SECRET>"
+export AZURE_TENANT_ID="<Azure-tenant-ID>"
+export AZURE_CLIENT_ID="<Azure-client-ID>"
+export AZURE_CLIENT_SECRET="<Azure-client-secret>"
 ```
 
 ---
 
 > [!TIP]
-> When working on shared environments, we recommend that you configure these environment variables at the compute. As a best practice, manage them as secrets in an instance of Azure Key Vault.
+> When you work in shared environments, we recommend that you configure these environment variables at the compute level. As a best practice, manage them as secrets in an instance of Azure Key Vault.
 >
-> For instance, in Azure Databricks you can use secrets in environment variables as follows in the cluster configuration: `AZURE_CLIENT_SECRET={{secrets/<scope-name>/<secret-name>}}`. For more information about implementing this approach in Azure Databricks, see [Reference a secret in an environment variable](/azure/databricks/security/secrets/secrets#reference-a-secret-in-an-environment-variable) or refer to documentation for your platform.
+> For instance, in an Azure Databricks cluster configuration, you can use secrets in environment variables in the following way: `AZURE_CLIENT_SECRET={{secrets/<scope-name>/<secret-name>}}`. For more information about implementing this approach in Azure Databricks, see [Reference a secret in an environment variable](/azure/databricks/security/secrets/secrets#reference-a-secret-in-an-environment-variable), or refer to the documentation for your platform.
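
For unattended jobs, it can help to fail fast when a service principal variable is missing, rather than fall through to the interactive browser prompt that the warning above describes. The following is a minimal sketch that isn't part of this change. The function name `missing_service_principal_variables` is hypothetical, and the check uses only the three variable names from the include.

```python
import os
from typing import List, Mapping

# Variable names taken from the include; the rest of this sketch is illustrative.
REQUIRED_VARIABLES = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_service_principal_variables(
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variables that are unset or empty in `environ`."""
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


# Example with an explicit mapping so the result doesn't depend on your shell.
example = {
    "AZURE_TENANT_ID": "<Azure-tenant-ID>",
    "AZURE_CLIENT_ID": "<Azure-client-ID>",
}
print(missing_service_principal_variables(example))  # ['AZURE_CLIENT_SECRET']
```

In a training script, you might call the function with no arguments and stop the job if the returned list isn't empty.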
diff --git a/articles/machine-learning/includes/machine-learning-mlflow-configure-tracking.md b/articles/machine-learning/includes/machine-learning-mlflow-configure-tracking.md