Merge pull request #203824 from santiagxf/santigxf/aml-mlflow-review

ShannonLeavitt · web-flow · commit 306c427872a8 · 2022-07-06T11:47:00.000-06:00
AML mlflow review
diff --git a/articles/machine-learning/concept-mlflow.md b/articles/machine-learning/concept-mlflow.md
@@ -54,7 +54,7 @@ Azure Machine Learning uses MLflow Tracking for metric logging and artifact stor
 
 ## Model Registries with MLflow
 
-Azure Machine Learning supports MLflow for model management. This represents a convenient way to support the entire model lifecycle for users familiar with the MLFlow client. The following article describes the different capabilities and how it compares with other options.
+Azure Machine Learning supports MLflow for model management. This represents a convenient way to support the entire model lifecycle for users familiar with the MLflow client.
 
 To learn more about how you can manage models using the MLflow API in Azure Machine Learning, view [Manage models registries in Azure Machine Learning with MLflow](how-to-manage-models-mlflow.md).
 
diff --git a/articles/machine-learning/how-to-log-view-metrics.md b/articles/machine-learning/how-to-log-view-metrics.md
@@ -1,5 +1,5 @@
 ---
-title: Log & view metrics and log files
+title: Log & view parameters, metrics and files
 titleSuffix: Azure Machine Learning
 description: Enable logging on your ML training runs to monitor real-time run metrics with MLflow, and to help diagnose errors and warnings.
 services: machine-learning
@@ -17,12 +17,12 @@ ms.custom: sdkv1, event-tier1-build-2022
 
 > [!div class="op_single_selector" title1="Select the version of Azure Machine Learning Python SDK you are using:"]
 > * [v1](./v1/how-to-log-view-metrics.md)
-> * [v2 (preview)](how-to-log-view-metrics.md)
+> * [v2 (current)](how-to-log-view-metrics.md)
 
-Log real-time information using [MLflow Tracking](https://www.mlflow.org/docs/latest/tracking.html). You can log models, metrics, and artifacts with MLflow as it supports local mode to cloud portability.
+Azure Machine Learning supports logging and tracking experiments using [MLflow Tracking](https://www.mlflow.org/docs/latest/tracking.html). You can log models, metrics, parameters, and artifacts with MLflow as it supports local mode to cloud portability.
 
 > [!IMPORTANT]
-> Unlike the Azure Machine Learning SDK v1, there is no logging functionality in the SDK v2 preview.
+> Unlike the Azure Machine Learning SDK v1, there is no logging functionality in the Azure Machine Learning SDK for Python (v2). If you previously used the Azure Machine Learning SDK v1, we recommend that you start using MLflow to track experiments. See [Migrate logging from SDK v1 to MLflow](reference-migrate-sdk-v1-mlflow-tracking.md) for specific guidance.
 
 Logs can help you diagnose errors and warnings, or track performance metrics like parameters and model performance. In this article, you learn how to enable logging in the following scenarios:
 
@@ -40,43 +40,110 @@ Logs can help you diagnose errors and warnings, or track performance metrics lik
 
 * To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).
 * You must have an Azure Machine Learning workspace. A workspace is created in [Install, set up, and use the CLI (v2)](how-to-configure-cli.md).
-* You must have the `aureml-core`, `mlflow`, and `azure-mlflow` packages installed. If you don't, use the following command to install them in your development environment:
+* You must have the `mlflow` and `azureml-mlflow` packages installed. If you don't, use the following command to install them in your development environment:
 
     ```bash
-    pip install azureml-core mlflow azureml-mlflow
+    pip install mlflow azureml-mlflow
     ```
 
-## Data types
+> [!IMPORTANT]
+> If you run your code outside of Azure Machine Learning compute and want to track remotely (that is, train on other compute but track in Azure Machine Learning), you must configure MLflow to track to your workspace. See [Set up your tracking environment](how-to-use-mlflow-cli-runs.md?#set-up-tracking-environment) for more details.
+
+## Logging parameters
+
+MLflow supports logging the parameters used by your experiments. Parameters can be of any type, and can be logged using the following syntax:
 
-The following table describes how to log specific value types:
+```python
+mlflow.log_param("num_epochs", 20)
+```
+
+MLflow also offers a convenient way to log multiple parameters at once by passing them in a dictionary. Several frameworks also accept model parameters as dictionaries, so this is a convenient way to log them in the experiment.
+
+```python
+params = {
+    "num_epochs": 20,
+    "dropout_rate": 0.6,
+    "objective": "binary_crossentropy"
+}
+
+mlflow.log_params(params)
+```
+
+> [!NOTE] 
+> Azure ML SDK v1 logging can't log parameters. We recommend the use of MLflow for tracking experiments as it offers a superior set of features.
+
+## Logging metrics
+
+Metrics, as opposed to parameters, are always numeric. The following table describes how to log specific numeric types:
 
 |Logged Value|Example code| Notes|
 |----|----|----|
-|Log a numeric value (int or float) | `mlflow.log_metric('my_metric', 1)`| |
-|Log a boolean value | `mlflow.log_metric('my_metric', 0)`| 0 = True, 1 = False|
-|Log a string | `mlflow.log_text('foo', 'my_string')`| Logged as an artifact|
-|Log numpy metrics or PIL image objects|`mlflow.log_image(img, 'figure.png')`||
-|Log matlotlib plot or image file|` mlflow.log_figure(fig, "figure.png")`||
+|Log a numeric value (int or float) | `mlflow.log_metric("my_metric", 1)`| |
+|Log a numeric value (int or float) over time | `mlflow.log_metric("my_metric", 1, step=1)`| Use parameter `step` to indicate the step at which you are logging the metric value. It can be any integer number. It defaults to zero. |
+|Log a boolean value | `mlflow.log_metric("my_metric", 0)`| 0 = True, 1 = False|
 
-## Log a training job with MLflow
+> [!IMPORTANT]
+> __Performance considerations:__ If you need to log multiple metrics (or multiple values for the same metric), avoid calling `mlflow.log_metric` in loops. You can achieve better performance by logging a batch of metrics. Use the method `mlflow.log_metrics`, which accepts a dictionary with all the metrics you want to log at once, or use `mlflow.log_batch`, which accepts multiple types of elements for logging.
 
-To set up for logging with MLflow, import `mlflow` and set the tracking URI:
+### Logging curves or lists of values
 
-> [!TIP]
-> You do not need to set the tracking URI when using a notebook running on an Azure Machine Learning compute instance.
+Curves (or lists of numeric values) can be logged with MLflow by logging the same metric multiple times. The following example shows how to do it:
 
 ```python
-from azureml.core import Workspace
-import mlflow
+import time
+import mlflow
+from mlflow.entities import Metric
+
+list_to_log = [1, 2, 3, 2, 1, 2, 3, 2, 1]
-ws = Workspace.from_config()
-# Set the tracking URI to the Azure ML backend
-# Not needed if running on Azure ML compute instance
-# or compute cluster
-mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
+client = mlflow.tracking.MlflowClient()
+client.log_batch(mlflow.active_run().info.run_id,
+                 metrics=[Metric(key="sample_list", value=val, timestamp=int(time.time() * 1000), step=0) for val in list_to_log])
 ```
 
-### Interactive jobs
+## Logging images
+
+MLflow supports two ways of logging images:
+
+|Logged Value|Example code| Notes|
+|----|----|----|
+|Log NumPy arrays or PIL image objects|`mlflow.log_image(img, "figure.png")`| `img` should be an instance of `numpy.ndarray` or `PIL.Image.Image`. `figure.png` is the name of the artifact that will be generated inside of the run. It doesn't have to be an existing file.|
+|Log matplotlib plot or image file|`mlflow.log_figure(fig, "figure.png")`| `figure.png` is the name of the artifact that will be generated inside of the run. It doesn't have to be an existing file. |
+
+## Logging other types of data
+
+|Logged Value|Example code| Notes|
+|----|----|----|
+|Log text in a text file | `mlflow.log_text("text string", "notes.txt")`| Text is persisted inside of the run in a text file with name `notes.txt`. |
+|Log dictionaries as `JSON` and `YAML` files | `mlflow.log_dict(dictionary, "file.yaml")` | `dictionary` is a dictionary object containing all the structure that you want to persist as a `JSON` or `YAML` file. |
+|Log an existing file | `mlflow.log_artifact("path/to/file.pkl")`| By default, files are logged in the root of the run. If `artifact_path` is provided, the file is logged in the folder indicated by that parameter. |
+|Log all the artifacts in an existing folder | `mlflow.log_artifacts("path/to/folder")`| Folder structure is copied to the run, but the root folder indicated is not included. |
+
+## Logging models
+
+MLflow introduces the concept of "models" as a way to package all the artifacts required for a given model to function. Models in MLflow are always a folder with an arbitrary number of files, depending on the framework used to generate the model. Logging models has the advantage of tracking all the elements of the model as a single entity that can be __registered__ and then __deployed__. On top of that, MLflow models enjoy the benefit of [no-code deployment](how-to-deploy-mlflow-models.md) and can be used with the [Responsible AI dashboard](how-to-responsible-ai-dashboard.md) in studio.
+
+To save the model from a training run, use the `log_model()` API for the framework you're working with. For example, [mlflow.sklearn.log_model()](https://mlflow.org/docs/latest/python_api/mlflow.sklearn.html#mlflow.sklearn.log_model). For frameworks that MLflow doesn't support, see [Convert custom models to MLflow](how-to-convert-custom-model-to-mlflow.md).
+
+## Automatic logging
+
+With Azure Machine Learning and MLflow, users can log metrics, model parameters, and model artifacts automatically when training a model. A [variety of popular machine learning libraries](https://mlflow.org/docs/latest/tracking.html#automatic-logging) are supported.
+
+To enable [automatic logging](https://mlflow.org/docs/latest/tracking.html#automatic-logging) insert the following code before your training code:
+
+```python
+mlflow.autolog()
+```
+
+> [!TIP]
+> You can control what gets automatically logged with autolog. For instance, if you indicate `mlflow.autolog(log_models=False)`, MLflow will log everything but models for you. Such control is useful in cases where you want to log models manually but still enjoy automatic logging of metrics and parameters. Also notice that some frameworks may disable automatic logging of models if the trained model goes beyond specific boundaries. Such behavior depends on the flavor used, and we recommend that you check the flavor's documentation if this is your case.
+
+[Learn more about Automatic logging with MLflow](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.autolog). 
+
+## Configuring experiments and runs in Azure Machine Learning
+
+MLflow organizes the information in experiments and runs (in Azure Machine Learning, runs are called Jobs). There are some differences in how to configure them depending on how you are running your code:
+
+# [Training interactively](#tab/interactive)
 
 When training interactively, such as in a Jupyter Notebook, use the following pattern:
 
@@ -85,13 +152,11 @@ When training interactively, such as in a Jupyter Notebook, use the following pa
 1. Use logging methods to log metrics and other information.
 1. End the job.
 
-For example, the following code snippet demonstrates setting the tracking URI, creating an experiment, and then logging during a job
+For example, the following code snippet demonstrates configuring the experiment, and then logging during a job:
 
 ```python
-from mlflow.tracking import MlflowClient
-
-# Create a new experiment if one doesn't already exist
-mlflow.create_experiment("mlflow-experiment")
+import mlflow
+mlflow.set_experiment("mlflow-experiment")
 
 # Start the run, log metrics, end the run
 mlflow_run = mlflow.start_run()
@@ -105,10 +170,8 @@ mlflow.end_run()
 You can also use the context manager paradigm:
 
 ```python
-from mlflow.tracking import MlflowClient
-
-# Create a new experiment if one doesn't already exist
-mlflow.create_experiment("mlflow-experiment")
+import mlflow
+mlflow.set_experiment("mlflow-experiment")
 
 # Start the run, log metrics, end the run
 with mlflow.start_run() as run:
@@ -118,19 +181,35 @@ with mlflow.start_run() as run:
     pass
 ```
 
+When you start a new run with `mlflow.start_run`, consider setting the `run_name` parameter. Its value becomes the name of the run in the Azure Machine Learning user interface, which helps you identify the run more quickly:
+
+```python
+with mlflow.start_run(run_name="iris-classifier-random-forest") as run:
+    mlflow.log_metric('mymetric', 1)
+    mlflow.log_metric('anothermetric', 1)
+```
+
 For more information on MLflow logging APIs, see the [MLflow reference](https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.log_artifact).
 
-### Remote runs
+# [Training with jobs](#tab/jobs)
 
-For remote training runs, the tracking URI and experiment are set automatically. Otherwise, the options for logging the run are the same as for interactive logging:
+When running training jobs in Azure Machine Learning, you don't need to configure the MLflow tracking URI because it's already configured for you. You also don't need to call `mlflow.start_run`, because runs are started automatically. You can therefore use MLflow tracking capabilities directly in your training scripts:
 
-* Call `mlflow.start_run()`, log information, and then call `mlflow.end_run()`.
-* Use the context manager paradigm with `mlflow.start_run()`.
-* Call a logging API such as `mlflow.log_metric()`, which will start a run if one doesn't already exist.
+```python
+import mlflow
 
-## Log a model
+mlflow.set_experiment("my-experiment")
 
-To save the model from a training run, use the `log_model()` API for the framework you're working with. For example, [mlflow.sklearn.log_model()](https://mlflow.org/docs/latest/python_api/mlflow.sklearn.html#mlflow.sklearn.log_model). For frameworks that MLflow doesn't support, see [Convert custom models to MLflow](how-to-convert-custom-model-to-mlflow.md).
+mlflow.autolog()
+
+mlflow.log_metric('mymetric', 1)
+mlflow.log_metric('anothermetric', 1)
+```
+
+> [!TIP]
+> When submitting jobs using Azure ML CLI v2, you can set the experiment name using the property `experiment_name` in the YAML definition of the job. You don't have to configure it in your training script. See [YAML: display name, experiment name, description, and tags](reference-yaml-job-command.md#yaml-display-name-experiment-name-description-and-tags) for details.
+
+---
 
 ## View job information
 
@@ -148,8 +227,8 @@ You can view the metrics, parameters, and tags for the run in the data field of
 
 ```python
 metrics = finished_mlflow_run.data.metrics
-tags = finished_mlflow_run.data.tags
 params = finished_mlflow_run.data.params
+tags = finished_mlflow_run.data.tags
 ```
 
 >[!NOTE]
diff --git a/articles/machine-learning/how-to-track-monitor-analyze-runs.md b/articles/machine-learning/how-to-track-monitor-analyze-runs.md
@@ -1,5 +1,5 @@
 ---
-title: Track, monitor, and analyze jobs in studio
+title: Monitor and analyze jobs in studio
 titleSuffix: Azure Machine Learning 
 description: Learn how to start, monitor, and track your machine learning experiment jobs with the Azure Machine Learning studio. 
 services: machine-learning
@@ -13,7 +13,7 @@ ms.topic: how-to
 ms.custom: devx-track-python, devx-track-azurecli, event-tier1-build-2022
 ---
 
-# Start, monitor, and track job history in studio
+# Monitor and analyze jobs in studio
 
 You can use [Azure Machine Learning studio](https://ml.azure.com) to monitor, organize, and track your jobs for training and experimentation. Your ML job history is an important part of an explainable and repeatable ML development process.
 
diff --git a/articles/machine-learning/how-to-use-mlflow-cli-runs.md b/articles/machine-learning/how-to-use-mlflow-cli-runs.md
@@ -71,13 +71,13 @@ import mlflow
 #Enter details of your AzureML workspace
 subscription_id = '<SUBSCRIPTION_ID>'
 resource_group = '<RESOURCE_GROUP>'
-workspace = '<AZUREML_WORKSPACE_NAME>'
+workspace_name = '<AZUREML_WORKSPACE_NAME>'
 
 ml_client = MLClient(credential=DefaultAzureCredential(),
                      subscription_id=subscription_id, 
                      resource_group_name=resource_group)
                      
-azureml_mlflow_uri = ml_client.workspaces.get(workspace).mlflow_tracking_uri
+azureml_mlflow_uri = ml_client.workspaces.get(workspace_name).mlflow_tracking_uri
 mlflow.set_tracking_uri(azureml_mlflow_uri)
 ```
 
@@ -102,12 +102,12 @@ The Azure Machine Learning Tracking URI can be constructed using the subscriptio
 ```python
 import mlflow
 
-aml_region = ""
+region = ""
 subscription_id = ""
 resource_group = ""
-workspace = ""
+workspace_name = ""
 
-azureml_mlflow_uri = f"azureml://{aml_region}.api.azureml.ms/mlflow/v1.0/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.MachineLearningServices/workspaces/{workspace}"
+azureml_mlflow_uri = f"azureml://{region}.api.azureml.ms/mlflow/v1.0/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.MachineLearningServices/workspaces/{workspace_name}"
 mlflow.set_tracking_uri(azureml_mlflow_uri)
 ```
 
@@ -125,6 +125,9 @@ experiment_name = 'experiment_with_mlflow'
 mlflow.set_experiment(experiment_name)
 ```
 
+> [!TIP]
+> When submitting jobs using Azure ML CLI v2, you can set the experiment name using the property `experiment_name` in the YAML definition of the job. You don't have to configure it in your training script. See [YAML: display name, experiment name, description, and tags](reference-yaml-job-command.md#yaml-display-name-experiment-name-description-and-tags) for details.
+
 You can also set one of the MLflow environment variables [MLFLOW_EXPERIMENT_NAME or MLFLOW_EXPERIMENT_ID](https://mlflow.org/docs/latest/cli.html#cmdoption-mlflow-run-arg-uri) with the experiment name. 
 
 ```bash
@@ -186,17 +189,6 @@ Open your terminal and use the following to submit the job.
 az ml job create -f job.yml --web
 ```
 
-## Automatic logging
-With Azure Machine Learning and MLFlow, users can log metrics, model parameters and model artifacts automatically when training a model.  A [variety of popular machine learning libraries](https://mlflow.org/docs/latest/tracking.html#automatic-logging) are supported. 
-
-To enable [automatic logging](https://mlflow.org/docs/latest/tracking.html#automatic-logging) insert the following code before your training code:
-
-```Python
-mlflow.autolog()
-```
-
-[Learn more about Automatic logging with MLflow](https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.autolog). 
-
 
 ## View metrics and artifacts in your workspace
 
@@ -274,14 +266,7 @@ To register and view a model from a run, use the following steps:
 
 ## Limitations
 
-The following MLflow methods are not fully supported with Azure Machine Learning. 
-
-* `mlflow.tracking.MlflowClient.create_experiment() `
-* `mlflow.tracking.MlflowClient.rename_experiment()`
-* `mlflow.tracking.MlflowClient.search_runs()`
-* `mlflow.tracking.MlflowClient.download_artifacts()`
-* `mlflow.tracking.MlflowClient.rename_registered_model()`
-
+Some methods available in the MLflow API may not be available when connected to Azure Machine Learning. For details about supported and unsupported operations, see [Support matrix for querying runs and experiments](how-to-track-experiments-mlflow.md#support-matrix-for-querying-runs-and-experiments).
 
 ## Next steps
 
diff --git a/articles/machine-learning/toc.yml b/articles/machine-learning/toc.yml