Commit 47d05a0

1 parent 337a2fd commit 47d05a0

1 file changed: +51 −49 lines changed

articles/machine-learning/how-to-log-mlflow-models.md

Lines changed: 51 additions & 49 deletions
@@ -1,5 +1,5 @@
 ---
-title: Logging MLflow models
+title: Log MLflow models
 titleSuffix: Azure Machine Learning
 description: Logging MLflow models, instead of artifacts, with MLflow SDK in Azure Machine Learning
 services: machine-learning
@@ -13,31 +13,29 @@ ms.topic: conceptual
 ms.custom: cliv2, sdkv2
 ---
 
-# Logging MLflow models
+# Log MLflow models
 
-This article describes how to log your trained models (or artifacts) as MLflow models. It explores the different ways to customize how MLflow packages your models, and how it runs those models.
+This article describes how to log your trained models (or artifacts) as MLflow models. It explores various ways of customizing how MLflow packages and runs models.
 
-## Why logging models instead of artifacts?
+## Why log models instead of artifacts?
 
-[From artifacts to models in MLflow](concept-mlflow-models.md) describes the difference between logging artifacts or files, as compared to logging MLflow models.
+An MLflow model is a type of artifact. However, a model has a specific structure that serves as a contract between the person that creates the model and the person that intends to use it. This contract helps build a bridge between the artifacts themselves and their meanings.
 
-An MLflow model is also an artifact. However, that model has a specific structure that serves as a contract between the person that created the model and the person that intends to use it. This contract helps build a bridge between the artifacts themselves and their meanings.
+For the difference between logging artifacts, or files, and logging MLflow models, see [Artifacts and models in MLflow](concept-mlflow-models.md).
 
-Model logging has these advantages:
-> [!div class="checklist"]
-> * You can directly load models, for inference, with `mlflow.<flavor>.load_model`, and you can use the `predict` function
-> * Pipeline inputs can use models directly
-> * You can deploy models without indication of a scoring script or an environment
-> * Swagger is automatically enabled in deployed endpoints, and the Azure Machine Learning studio can use the __Test__ feature
-> * You can use the Responsible AI dashboard
+You can log your model's files as artifacts, but model logging offers the following advantages:
 
-This section describes how to use the model's concept in Azure Machine Learning with MLflow:
+* You can use `mlflow.<flavor>.load_model` to directly load models for inference, and you can use the `predict` function.
+* Pipeline inputs can use models directly.
+* You can deploy models without specifying a scoring script or an environment.
+* Swagger is automatically turned on in deployed endpoints. As a result, you can use the Azure Machine Learning studio test feature.
+* You can use the Responsible AI dashboard. For more information, see [Use the Responsible AI dashboard in Azure Machine Learning studio](how-to-responsible-ai-dashboard.md).
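
A minimal sketch of the first advantage in the new list: loading a logged model back for inference. The run ID, the `classifier` artifact path, and `X_test` are placeholders, not part of this commit.

```python
import mlflow

# Placeholder URI; substitute the run ID and artifact path from your own run.
model_uri = "runs:/<run_id>/classifier"

# Load the model with the flavor-specific API and run inference.
model = mlflow.xgboost.load_model(model_uri)
predictions = model.predict(X_test)

# Or load it as a generic Python function, independent of the training framework.
pyfunc_model = mlflow.pyfunc.load_model(model_uri)
predictions = pyfunc_model.predict(X_test)
```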

-## Logging models using autolog
+## Use automatic logging to log models
 
-You can use MLflow autolog functionality. Autolog allows MLflow to instruct the framework in use to log all the metrics, parameters, artifacts, and models that the framework considers relevant. By default, if autolog is enabled, most models are logged. In some situations, some flavors might not log a model. For instance, the PySpark flavor doesn't log models that exceed a certain size.
+You can use MLflow `autolog` functionality to automatically log models. When you use automatic logging, MLflow instructs the framework that's in use to log all the metrics, parameters, artifacts, and models that the framework considers relevant. By default, if automatic logging is turned on, most models are logged. In some situations, some flavors don't log models. For instance, the PySpark flavor doesn't log models that exceed a certain size.
 
-Use either `mlflow.autolog()` or `mlflow.<flavor>.autolog()` to activate autologging. This example uses `autolog()` to log a classifier model trained with XGBoost:
+Use either `mlflow.autolog` or `mlflow.<flavor>.autolog` to activate automatic logging. The following code uses `autolog` to log a classifier model that's trained with XGBoost:
 
 ```python
 import mlflow
@@ -54,21 +52,21 @@ accuracy = accuracy_score(y_test, y_pred)
 ```
 
 > [!TIP]
-> If use Machine Learning pipelines, for example [Scikit-Learn pipelines](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html), use the `autolog` functionality of that pipeline flavor to log models. Model logging automatically happens when the `fit()` method is called on the pipeline object. The [Training and tracking an XGBoost classifier with MLflow notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/using-mlflow/train-and-log/xgboost_classification_mlflow.ipynb) demonstrates how to log a model with preprocessing, using pipelines.
+> If you use machine learning pipelines, for example [Scikit-Learn pipelines](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html), use the `autolog` functionality of that pipeline flavor to log models. Model logging automatically happens when the `fit` method is called on the pipeline object. For a notebook that logs a model and that includes preprocessing and uses pipelines, see [Training and tracking an XGBoost classifier with MLflow](https://github.com/Azure/azureml-examples/blob/main/sdk/python/using-mlflow/train-and-log/xgboost_classification_mlflow.ipynb).
 
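A minimal sketch of the pipeline scenario that the tip describes, assuming a scikit-learn toy dataset and a simple two-step pipeline; none of this is part of the commit.

```python
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Turn on automatic logging for the Scikit-Learn flavor.
mlflow.sklearn.autolog()

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression(max_iter=200)),
])

with mlflow.start_run():
    # The whole pipeline, preprocessing included, is logged as one model here.
    pipeline.fit(X_train, y_train)
```
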
-## Logging models with a custom signature, environment or samples
+## Log models that use a custom signature, environment, or samples
 
-The MLflow `mlflow.<flavor>.log_model` method can manually log models. This workflow can control different aspects of the model logging.
+You can use the MLflow `mlflow.<flavor>.log_model` method to manually log models. This workflow can control various aspects of model logging.
 
 Use this method when:
-> [!div class="checklist"]
-> * You want to indicate pip packages or a conda environment that differ from those that are automatically detected
-> * You want to include input examples
-> * You want to include specific artifacts in the needed package
-> * `autolog` does not correctly infer your signature. This matters when you deal with tensor inputs, where the signature needs specific shapes
-> * The autolog behavior does not cover your purpose for some reason
 
-This code example logs a model for an XGBoost classifier:
+* You want to indicate pip packages or a Conda environment that differs from the automatically detected packages or environment.
+* You want to include input examples.
+* You want to include specific artifacts in the package that you need.
+* The `autolog` method doesn't correctly infer your signature. This case comes up when you work with tensor inputs, which require the signature to have a specific shape.
+* The `autolog` method doesn't meet all your needs.
+
+The following code logs an XGBoost classifier model:
 
 ```python
 import mlflow
@@ -85,20 +83,20 @@ y_pred = model.predict(X_test)
 
 accuracy = accuracy_score(y_test, y_pred)
 
-# Signature
+# Infer the signature.
 signature = infer_signature(X_test, y_test)
 
-# Conda environment
+# Set up a Conda environment.
 custom_env =_mlflow_conda_env(
 additional_conda_deps=None,
 additional_pip_deps=["xgboost==1.5.2"],
 additional_conda_channels=None,
 )
 
-# Sample
+# Sample the data.
 input_example = X_train.sample(n=1)
 
-# Log the model manually
+# Log the model manually.
 mlflow.xgboost.log_model(model,
 artifact_path="classifier",
 conda_env=custom_env,
@@ -107,19 +105,21 @@ mlflow.xgboost.log_model(model,
 ```
 
 > [!NOTE]
-> * `autolog` has the `log_models=False` configuration. This prevents automatic MLflow model logging. Automatic MLflow model logging happens later, as a manual process
-> * Use the `infer_signature` method to try to infer the signature directly from inputs and outputs
-> * The `mlflow.utils.environment._mlflow_conda_env` method is a private method in the MLflow SDK. In this example, it makes the code simpler, but use it with caution. It may change in the future. As an alternative, you can generate the YAML definition manually as a Python dictionary.
+> * The call to `autolog` uses a configuration of `log_models=False`. This setting turns off automatic MLflow model logging. The `log_model` method is used later to manually log the model.
+> * The `infer_signature` method is used to try to infer the signature directly from inputs and outputs.
+> * The `mlflow.utils.environment._mlflow_conda_env` method is a private method in the MLflow SDK. In this example, it streamlines the code. But use this method with caution, because it might change in the future. As an alternative, you can generate the YAML definition manually as a Python dictionary.
 
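A sketch of the alternative that the last note item mentions: passing the Conda environment as a plain Python dictionary instead of calling the private `_mlflow_conda_env` helper. The package pins are illustrative; `model`, `signature`, and `input_example` are assumed from the example above.

```python
# The YAML environment definition expressed as a Python dictionary.
custom_env = {
    "name": "mlflow-env",
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.8",
        {"pip": ["mlflow", "xgboost==1.5.2"]},
    ],
}

mlflow.xgboost.log_model(model,
                         artifact_path="classifier",
                         conda_env=custom_env,
                         signature=signature,
                         input_example=input_example)
```
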
-## Logging models with a different behavior in the predict method
+## Log models with a different behavior in the predict method
 
-When logging a model with either `mlflow.autolog` or `mlflow.<flavor>.log_model`, the model flavor determines how to execute the inference, and what the model returns. MLflow doesn't enforce any specific behavior about the generation of `predict` results. In some scenarios, you might want to do some preprocessing or post-processing before and after your model executes.
+When you use `mlflow.autolog` or `mlflow.<flavor>.log_model` to log a model, the model flavor determines how to perform the inference. The flavor also determines what the model returns. MLflow doesn't enforce any specific behavior about the generation of `predict` results. In some scenarios, you might want to do some preprocessing or post-processing before and after your model runs.
 
-In this situation, implement machine learning pipelines that directly move from inputs to outputs. Although this implementation is possible, and sometimes encouraged to improve performance, it might become challenging to achieve. In those cases, it can help to [customize how your model handles inference](#logging-custom-models) as explained in next section.
+In this situation, you can implement machine learning pipelines that directly move from inputs to outputs. Although this type of implementation can sometimes improve performance, it can be challenging to achieve.
 
 ## Logging custom models
 
-MLflow supports many [machine learning frameworks](https://mlflow.org/docs/latest/models.html#built-in-model-flavors), including
+In cases that involve implementing pipelines that directly move from inputs to outputs, it can help to customize the way your model handles inference.
+
+MLflow supports many machine learning frameworks, including the following flavors:
 
 - CatBoost
 - FastAI
@@ -138,25 +138,27 @@ MLflow supports many [machine learning frameworks](https://mlflow.org/docs/lates
 - TensorFlow
 - XGBoost
 
-However, you might need to change the way a flavor works, log a model not natively supported by MLflow or even log a model that uses multiple elements from different frameworks. In these cases, you might need to create a custom model flavor.
+For a complete list, see [Built-In Model Flavors](https://mlflow.org/docs/latest/models.html#built-in-model-flavors).
 
-To solve the problem, MLflow introduces the `pyfunc` flavor (starting from a Python function). This flavor can log any object as a model, as long as that object satisfies two conditions:
+However, you might need to change the way a flavor works or log a model that's not natively supported by MLflow. Or you might need to log a model that uses multiple elements from various frameworks. In these cases, you might need to create a custom model flavor.
 
-* You implement the method `predict` method, at least
-* The Python object inherits from `mlflow.pyfunc.PythonModel`
+To solve the problem, MLflow offers the `pyfunc` flavor, a default model interface for Python models. This flavor can log any object as a model, as long as that object satisfies two conditions:
+
+* You implement at least the `predict` method.
+* The Python object inherits from the `mlflow.pyfunc.PythonModel` class.
 
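A minimal sketch of the two conditions above, with a placeholder wrapped model; this block isn't part of the commit.

```python
import mlflow

class MyCustomModel(mlflow.pyfunc.PythonModel):
    """Any picklable object that meets the two conditions can be logged."""

    def __init__(self, model):
        self._model = model  # Placeholder for an existing trained model.

    # Condition 1: implement at least the predict method.
    # Condition 2: inherit from mlflow.pyfunc.PythonModel.
    def predict(self, context, model_input):
        # Custom preprocessing or post-processing can go here.
        return self._model.predict(model_input)

mlflow.pyfunc.log_model(artifact_path="classifier",
                        python_model=MyCustomModel(model))
```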

 > [!TIP]
-> Serializable models that implement the Scikit-learn API can use the Scikit-learn flavor to log the model, regardless of whether the model was built with Scikit-learn. If you can persist your model in Pickle format, and the object has the `predict()` and `predict_proba()` methods (at least), you can use `mlflow.sklearn.log_model()` to log the model inside a MLflow run.
+> Serializable models that implement the Scikit-learn API can use the `Scikit-learn` flavor to log the model, regardless of whether the model was built with `Scikit-learn`. If you can persist your model in Pickle format, and the object has at least the `predict` and `predict_proba` methods, you can use `mlflow.sklearn.log_model` to log the model inside an MLflow run.
 
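A sketch of the tip above, assuming a picklable estimator `model` that exposes the Scikit-learn prediction API; the wrapper class is hypothetical.

```python
import mlflow

class ProbabilisticClassifier:
    """Picklable object with at least predict and predict_proba."""

    def __init__(self, estimator):
        self._estimator = estimator

    def predict(self, X):
        return self._estimator.predict(X)

    def predict_proba(self, X):
        return self._estimator.predict_proba(X)

with mlflow.start_run():
    # The Scikit-learn flavor persists the object in Pickle format.
    mlflow.sklearn.log_model(ProbabilisticClassifier(model),
                             artifact_path="classifier")
```
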
-# [Using a model wrapper](#tab/wrapper)
+# [Use a model wrapper](#tab/wrapper)
 
-If you create a wrapper around your existing model object, it becomes the simplest to create a flavor for your custom model. MLflow serializes and packages it for you. Python objects are serializable when the object can be stored in the file system as a file, generally in Pickle format. At runtime, the object can materialize from that file. This restores all the values, properties, and methods available when it was saved.
+The easiest way to create a flavor for your custom model is to create a wrapper around your existing model object. MLflow serializes and packages it for you. Python objects are serializable when the object can be stored in the file system as a file, generally in Pickle format. At runtime, the object can materialize from that file. This restores all the values, properties, and methods available when it was saved.
 
 Use this method when:
-> [!div class="checklist"]
-> * You can serialize your model in Pickle format
-> * You want to retain the state of the model, as it was just after training
-> * You want to customize how the `predict` function works.
+
+* You can serialize your model in Pickle format.
+* You want to retain the state of the model just after training.
+* You want to customize how the `predict` function works.
 
 This code sample wraps a model created with XGBoost to make it behave differently from the XGBoost flavor's default implementation: it returns probabilities instead of classes:
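
The sample itself falls outside this diff's hunks. A minimal sketch of what the sentence describes: a `pyfunc` wrapper around an XGBoost classifier whose `predict` returns probabilities; the class name and training objects are assumptions.

```python
import mlflow

class ModelWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, model):
        self._model = model  # Assumed trained XGBoost classifier.

    def predict(self, context, data):
        # Return class probabilities instead of predicted labels.
        return self._model.predict_proba(data)

mlflow.pyfunc.log_model(artifact_path="classifier",
                        python_model=ModelWrapper(model))
```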
