Merge pull request #204413 from santiagxf/santiagxf/aml-mlflow-projects

v-stsavell · web-flow · commit 788a565b796a · 2022-07-12T15:43:10.000-05:00
AML MLflow projects upgrade to V2
diff --git a/articles/machine-learning/how-to-train-mlflow-projects.md b/articles/machine-learning/how-to-train-mlflow-projects.md
@@ -14,9 +14,6 @@ ms.custom: how-to, devx-track-python, sdkv1, event-tier1-build-2022
 
 # Train ML models with MLflow Projects and Azure Machine Learning (preview)
 
-[!INCLUDE [sdk v1](../../includes/machine-learning-sdk-v1.md)]
-
-
 [!INCLUDE [preview disclaimer](../../includes/machine-learning-preview-generic-disclaimer.md)]
 
 In this article, learn how to enable MLflow's tracking URI and logging API, collectively known as [MLflow Tracking](https://mlflow.org/docs/latest/quickstart.html#using-the-tracking-api), to submit training jobs with [MLflow Projects](https://www.mlflow.org/docs/latest/projects.html) and Azure Machine Learning backend support. You can submit jobs locally with Azure Machine Learning tracking or migrate your runs to the cloud via an [Azure Machine Learning Compute](./how-to-create-attach-compute-cluster.md).
@@ -33,37 +30,104 @@ In this article, learn how to enable MLflow's tracking URI and logging API, coll
 ## Prerequisites
 
 * Install the `azureml-mlflow` package.
-    * This package automatically brings in `azureml-core` of the [The Azure Machine Learning Python SDK](/python/api/overview/azure/ml/install), which provides the connectivity for MLflow to access your workspace.
 * [Create an Azure Machine Learning Workspace](quickstart-create-resources.md).
     * See which [access permissions you need to perform your MLflow operations with your workspace](how-to-assign-roles.md#mlflow-operations).
+ * Configure MLflow for tracking in Azure Machine Learning, as explained in the next section.
 
-## Train MLflow Projects on local compute
+### Set up tracking environment
+
+To configure MLflow for working with Azure Machine Learning, you need to point your MLflow environment to the Azure Machine Learning MLflow Tracking URI. 
+
+> [!NOTE]
+> When running on Azure Compute (Azure Notebooks, Jupyter Notebooks hosted on Azure Compute Instances or Compute Clusters) you don't have to configure the tracking URI. It's automatically configured for you.
+ 
+# [Using the Azure ML SDK v2](#tab/azuremlsdk)
+
+[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]
+
+You can get the Azure ML MLflow tracking URI using the [Azure Machine Learning SDK v2 for Python](concept-v2.md). Ensure you have the library `azure-ai-ml` installed in the cluster you are using. The following sample gets the unique MLflow tracking URI associated with your workspace. Then the method [`set_tracking_uri()`](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_tracking_uri) points MLflow to that URI.
+
+1. Using the workspace configuration file:
+
+    ```Python
+    from azure.ai.ml import MLClient
+    from azure.identity import DefaultAzureCredential
+    import mlflow
 
-This example shows how to submit MLflow projects locally with Azure Machine Learning tracking.
+    ml_client = MLClient.from_config(credential=DefaultAzureCredential())
+    azureml_mlflow_uri = ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
+    mlflow.set_tracking_uri(azureml_mlflow_uri)
+    ```
 
-Install the `azureml-mlflow` package to use MLflow Tracking with Azure Machine Learning on your experiments locally. Your experiments can run via a Jupyter Notebook or code editor.
+    > [!TIP]
+    > You can download the workspace configuration file as follows:
+    > 1. Navigate to [Azure ML studio](https://ml.azure.com).
+    > 2. Click on the upper-right corner of the page -> Download config file.
+    > 3. Save the file `config.json` in the directory where you are working.
 
-```shell
-pip install azureml-mlflow
+1. Using the subscription ID, resource group name and workspace name:
+
+    ```Python
+    from azure.ai.ml import MLClient
+    from azure.identity import DefaultAzureCredential
+    import mlflow
+
+    # Enter details of your AzureML workspace
+    subscription_id = '<SUBSCRIPTION_ID>'
+    resource_group = '<RESOURCE_GROUP>'
+    workspace_name = '<AZUREML_WORKSPACE_NAME>'
+
+    ml_client = MLClient(credential=DefaultAzureCredential(),
+                         subscription_id=subscription_id, 
+                         resource_group_name=resource_group)
+
+    azureml_mlflow_uri = ml_client.workspaces.get(workspace_name).mlflow_tracking_uri
+    mlflow.set_tracking_uri(azureml_mlflow_uri)
+    ```
+
+    > [!IMPORTANT]
+    > `DefaultAzureCredential` will try to pull the credentials from the available context. If you want to specify credentials in a different way, for instance using the web browser in an interactive way, you can use `InteractiveBrowserCredential` or any other method available in the `azure.identity` package.
+
+# [Using an environment variable](#tab/environ)
+
+[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
+
+Another option is to set the MLflow environment variable [MLFLOW_TRACKING_URI](https://mlflow.org/docs/latest/tracking.html#logging-to-a-tracking-server) directly in your terminal.
+
+```azurecli
+export MLFLOW_TRACKING_URI=$(az ml workspace show --query mlflow_tracking_uri | sed 's/"//g') 
 ```
 
-Import the `mlflow` and [`Workspace`](/python/api/azureml-core/azureml.core.workspace%28class%29) classes to access MLflow's tracking URI and configure your workspace.
+>[!IMPORTANT]
+> Make sure you are logged in to your Azure account on your local machine; otherwise, the tracking URI returns an empty string. If you are using any Azure ML compute, the tracking environment and experiment name are already configured.
+
+# [Building the MLflow tracking URI](#tab/build)
 
-```Python
+The Azure Machine Learning tracking URI can be constructed using the subscription ID, the region where the resource is deployed, the resource group name, and the workspace name. The following code sample shows how:
+
+```python
 import mlflow
-from azureml.core import Workspace
 
-ws = Workspace.from_config()
+region = ""
+subscription_id = ""
+resource_group = ""
+workspace_name = ""
 
-mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
+azureml_mlflow_uri = f"azureml://{region}.api.azureml.ms/mlflow/v1.0/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.MachineLearningServices/workspaces/{workspace_name}"
+mlflow.set_tracking_uri(azureml_mlflow_uri)
 ```
 
-Set the MLflow experiment name with `set_experiment()` and start your training run with `start_run()`. Then, use `log_metric()` to activate the MLflow logging API and begin logging your training run metrics.
+> [!NOTE]
+> You can also get this URI as follows:
+> 1. Navigate to [Azure ML studio](https://ml.azure.com).
+> 2. Click on the upper-right corner of the page -> View all properties in Azure Portal -> MLflow tracking URI.
+> 3. Copy the URI and use it with the method `mlflow.set_tracking_uri`.
 
-```Python
-experiment_name = 'experiment-with-mlflow-projects'
-mlflow.set_experiment(experiment_name)
-```
+---
+
+## Train MLflow Projects on local compute
+
+This example shows how to submit MLflow projects locally with Azure Machine Learning.
 
 Create the backend configuration object to store necessary information for the integration, such as the compute target and which type of managed environment to use.
 
@@ -106,30 +170,6 @@ local_env_run = mlflow.projects.run(uri=".",
 
 This example shows how to submit MLflow projects on a remote compute with Azure Machine Learning tracking.
 
-Install the `azureml-mlflow` package to use MLflow Tracking with Azure Machine Learning on your experiments locally. Your experiments can run via a Jupyter Notebook or code editor.
-
-```shell
-pip install azureml-mlflow
-```
-
-Import the `mlflow` and [`Workspace`](/python/api/azureml-core/azureml.core.workspace%28class%29) classes to access MLflow's tracking URI and configure your workspace.
-
-```Python
-import mlflow
-from azureml.core import Workspace
-
-ws = Workspace.from_config()
-
-mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
-```
-
-Set the MLflow experiment name with `set_experiment()` and start your training run with `start_run()`. Then, use `log_metric()` to activate the MLflow logging API and begin logging your training run metrics.
-
-```Python
-experiment_name = 'train-mlflow-project-amlcompute'
-mlflow.set_experiment(experiment_name)
-```
-
 Create the backend configuration object to store necessary information for the integration, such as the compute target and which type of managed environment to use.
 
 The integration accepts "COMPUTE" and "USE_CONDA" as parameters. "COMPUTE" is set to the name of your remote compute cluster, and "USE_CONDA" creates a new environment for the project from the environment configuration file. If "COMPUTE" is present in the object, the project is automatically submitted to the remote compute and "USE_CONDA" is ignored. MLflow accepts a dictionary object or a JSON file.
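
A minimal sketch of the backend configuration object described in the last context line of the diff, written both as a dictionary and as a JSON file. Only the `COMPUTE` and `USE_CONDA` keys come from the article text; the cluster name `cpu-cluster`, the file name `backend_config.json`, and the helper `write_backend_config` are hypothetical placeholders introduced for illustration.

```python
import json
from pathlib import Path

# Keys are taken from the article text; the values are illustrative assumptions.
backend_config = {
    "COMPUTE": "cpu-cluster",  # hypothetical name of a remote compute cluster
    "USE_CONDA": False,        # per the article, ignored when "COMPUTE" is present
}


def write_backend_config(config: dict, path: str = "backend_config.json") -> Path:
    """Serialize the backend configuration to a JSON file and return its path."""
    target = Path(path)
    target.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return target
```

Either form can then be supplied through the `backend_config` argument of `mlflow.projects.run`, which takes a dictionary or a path to a `.json` file. Which backend name to pass alongside it is not shown in this diff, so it is left out of the sketch.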