You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/machine-learning/concept-mlflow-models.md
+55-41Lines changed: 55 additions & 41 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,31 +1,32 @@
1
1
---
2
2
title: From artifacts to models in MLflow
3
3
titleSuffix: Azure Machine Learning
4
-
description: Learn about how MLflow uses the concept of models instead of artifacts to represent your trained models and enable a streamlined path to deployment.
4
+
description: Learn how MLflow uses the concept of models instead of artifacts to represent your trained models and enable a streamlined path to deployment.
5
5
services: machine-learning
6
6
author: santiagxf
7
7
ms.author: fasantia
8
8
ms.reviewer: mopeakande
9
+
reviewer: msakande
9
10
ms.service: machine-learning
10
11
ms.subservice: mlops
11
-
ms.date: 11/04/2022
12
+
ms.date: 12/20/2023
12
13
ms.topic: conceptual
13
14
ms.custom: cliv2, sdkv2
14
15
---
15
16
16
17
# From artifacts to models in MLflow
17
18
18
-
The following article explains the differences between an artifact and a model in MLflow and how to transition from one to the other. It also explains how Azure Machine Learning uses the MLflow model's concept to enabled streamlined deployment workflows.
19
+
The following article explains the differences between an MLflow artifact and an MLflow model, and how to transition from one to the other. It also explains how Azure Machine Learning uses the concept of an MLflow modelto enable streamlined deployment workflows.
19
20
20
21
## What's the difference between an artifact and a model?
21
22
22
-
If you are not familiar with MLflow, you may not be aware of the difference between logging artifacts or files vs. logging MLflow models. There are some fundamental differences between the two:
23
+
If you're not familiar with MLflow, you might not be aware of the difference between logging artifacts or files vs. logging MLflow models. There are some fundamental differences between the two:
23
24
24
-
### Artifacts
25
+
### Artifact
25
26
26
-
Any file generated (and captured) from an experiment's run or job is an artifact. It may represent a model serialized as a Pickle file, the weights of a PyTorch or TensorFlow model, or even a text file containing the coefficients of a linear regression. Other artifacts can have nothing to do with the model itself, but they can contain configuration to run the model, pre-processing information, sample data, etc. As you can see, an artifact can come in any format.
27
+
An _artifact_ is any file that's generated (and captured) from an experiment's run or job. An artifact could represent a model serialized as a pickle file, the weights of a PyTorch or TensorFlow model, or even a text file containing the coefficients of a linear regression. Some artifacts could also have nothing to do with the model itself; rather, they could contain configurations to run the model, or preprocessing information, or sample data, and so on. Artifacts can come in various formats.
27
28
28
-
You may have been logging artifacts already:
29
+
You might have been logging artifacts already:
29
30
30
31
```python
31
32
filename ='model.pkl'
@@ -35,18 +36,18 @@ with open(filename, 'wb') as f:
35
36
mlflow.log_artifact(filename)
36
37
```
37
38
38
-
### Models
39
+
### Model
39
40
40
-
A model in MLflow is also an artifact. However, we make stronger assumptions about this type of artifacts. Such assumptions provide a clear contract between the saved files and what they mean. When you log your models as artifacts (simple files), you need to know what the model builder meant for each of them in order to know how to load the model for inference. On the contrary, MLflow models can be loaded using the contract specified in the [The MLModel format](concept-mlflow-models.md#the-mlmodel-format).
41
+
A _model_ in MLflow is also an artifact. However, we make stronger assumptions about this type of artifact. Such assumptions provide a clear contract between the saved files and what they mean. When you log your models as artifacts (simple files), you need to know what the model builder meant for each of those files so as to know how to load the model for inference. On the contrary, MLflow models can be loaded using the contract specified in the [The MLmodel format](concept-mlflow-models.md#the-mlmodel-format).
41
42
42
43
In Azure Machine Learning, logging models has the following advantages:
43
-
> [!div class="checklist"]
44
-
> * You can deploy them on real-time or batch endpoints without providing an scoring script nor an environment.
45
-
> * When deployed, Model's deployments have a Swagger generated automatically and the __Test__ feature can be used in Azure Machine Learning studio.
46
-
> * Models can be used as pipelines inputs directly.
47
-
> * You can use the [Responsible AI dashbord (preview)](how-to-responsible-ai-dashboard.md).
48
44
49
-
Models can get logged by using MLflow SDK:
45
+
* You can deploy them to real-time or batch endpoints without providing a scoring script or an environment.
46
+
* When you deploy models, the deployments automatically have a swagger generated, and the __Test__ feature can be used in Azure Machine Learning studio.
47
+
* You can use the models directly as pipeline inputs.
48
+
* You can use the [Responsible AI dashboard](how-to-responsible-ai-dashboard.md) with your models.
MLflow adopts the MLmodel format as a way to create a contract between the artifacts and what they represent. The MLmodel format stores assets in a folder. Among them, there is a particular file named MLmodel. This file is the single source of truth about how a model can be loaded and used.
60
+
MLflow adopts the MLmodel format as a way to create a contract between the artifacts and what they represent. The MLmodel format stores assets in a folder. Among these assets, there's a file named `MLmodel`. this file is the single source of truth about how a model can be loaded and used.
61
+
62
+
The following screenshot shows a sample MLflow model's folder in the Azure Machine Learning studio. The model is placed in a folder called `credit_defaults_model`. There is no specific requirement on the naming of this folder. The folder contains the `MLmodel` file among other model artifacts.
60
63
61
-

64
+
:::image type="content" source="media/concept-mlflow-models/mlflow-mlmodel.png" alt-text="A screenshot showing assets of a sample MLflow model, including the MLmodel file." lightbox="media/concept-mlflow-models/mlflow-mlmodel.png":::
62
65
63
-
The following example shows how the `MLmodel` file for a computer version model trained with `fastai`may look like:
66
+
The following code is an example of what the `MLmodel` file for a computer vision model trained with `fastai`might look like:
64
67
65
68
__MLmodel__
66
69
@@ -88,11 +91,11 @@ signature:
88
91
}]'
89
92
```
90
93
91
-
### The model's flavors
94
+
### Model flavors
92
95
93
-
Considering the variety of machine learning frameworks available to use, MLflow introduced the concept of flavor as a way to provide a unique contract to work across all of them. A flavor indicates what to expect for a given model created with a specific framework. For instance, TensorFlow has its own flavor, which specifies how a TensorFlow model should be persisted and loaded. Because each model flavor indicates how they want to persist and load models, the MLModel format doesn't enforce a single serialization mechanism that all the models need to support. Such decision allows each flavor to use the methods that provide the best performance or best support according to their best practices - without compromising compatibility with the MLModel standard.
96
+
Considering the large number of machine learning frameworks available to use, MLflow introduced the concept of _flavor_ as a way to provide a unique contract to work across all machine learning frameworks. A flavor indicates what to expect for a given model that's created with a specific framework. For instance, TensorFlow has its own flavor, which specifies how a TensorFlow model should be persisted and loaded. Because each model flavor indicates how to persist and load the model for a given framework, the MLmodel format doesn't enforce a single serialization mechanism that all models must support. This decision allows each flavor to use the methods that provide the best performance or best support according to their best practices—without compromising compatibility with the MLmodel standard.
94
97
95
-
The following is an example of the `flavors` section for an `fastai` model.
98
+
The following code is an example of the `flavors` section for an `fastai` model.
96
99
97
100
```yaml
98
101
flavors:
@@ -106,18 +109,18 @@ flavors:
106
109
python_version: 3.8.12
107
110
```
108
111
109
-
### Signatures
112
+
### Model signature
110
113
111
-
[Model signatures in MLflow](https://www.mlflow.org/docs/latest/models.html#model-signature) are an important part of the model specification, as they serve as a data contract between the model and the server running our models. They are also important for parsing and enforcing model's input's types at deployment time. [MLflow enforces types when data is submitted to your model if a signature is available](https://www.mlflow.org/docs/latest/models.html#signature-enforcement).
114
+
A [model signature in MLflow](https://www.mlflow.org/docs/latest/models.html#model-signature) is an important part of the model's specification, as it serves as a data contract between the model and the server running the model. A model signature is also important for parsing and enforcing a model's input types at deployment time. If a signature is available, MLflow enforces input types when data is submitted to your model. For more information, see [MLflow signature enforcement](https://www.mlflow.org/docs/latest/models.html#signature-enforcement).
112
115
113
-
Signatures are indicated when the model gets logged and persisted in the `MLmodel` file, in the `signature` section. **Autolog's** feature in MLflow automatically infers signatures in a best effort way. However, it may be required to [log the models manually if the signatures inferred are not the ones you need](https://www.mlflow.org/docs/latest/models.html#how-to-log-models-with-signatures).
116
+
Signatures are indicated when models get logged, and they're persisted in the `signature` section of the `MLmodel` file. The **Autolog** feature in MLflow automatically infers signatures in a best effort way. However, you might have to log the models manually if the inferred signatures aren't the ones you need. For more information, see [How to log models with signatures](https://www.mlflow.org/docs/latest/models.html#how-to-log-models-with-signatures).
114
117
115
118
There are two types of signatures:
116
119
117
-
* **Column-based signature** corresponding to signatures that operate to tabular data. For models with this signature, MLflow supplies `pandas.DataFrame` objects as inputs.
118
-
* **Tensor-based signature:** corresponding to signatures that operate with n-dimensional arrays or tensors. For models with this signature, MLflow supplies `numpy.ndarray` as inputs (or a dictionary of `numpy.ndarray` in the case of named-tensors).
120
+
* **Column-based signature**: This signature operates on tabular data. For models with this type of signature, MLflow supplies `pandas.DataFrame` objects as inputs.
121
+
* **Tensor-based signature**: This signature operates with n-dimensional arrays or tensors. For models with this signature, MLflow supplies `numpy.ndarray` as inputs (or a dictionary of `numpy.ndarray` in the case of named-tensors).
119
122
120
-
The following example corresponds to a computer vision model trained with `fastai`. This model receives a batch of images represented as tensors of shape `(300, 300, 3)` with the RGB representation of them (unsigned integers). It outputs batches of predictions (probabilities) for two classes.
123
+
The following example corresponds to a computer vision model trained with `fastai`. This model receives a batch of images represented as tensors of shape `(300, 300, 3)` with the RGB representation of them (unsigned integers). The model outputs batches of predictions (probabilities) for two classes.
121
124
122
125
__MLmodel__
123
126
@@ -134,13 +137,13 @@ signature:
134
137
```
135
138
136
139
> [!TIP]
137
-
> Azure Machine Learning generates Swagger for model's deployment in MLflow format with a signature available. This makes easier to test deployed endpoints using the Azure Machine Learning studio.
140
+
> Azure Machine Learning generates a swagger file for a deployment of an MLflow model with a signature available. This makes it easier to test deployments using the Azure Machine Learning studio.
138
141
139
-
### Model's environment
142
+
### Model environment
140
143
141
-
Requirements for the model to run are specified in the `conda.yaml` file. Dependencies can be automatically detected by MLflow or they can be manually indicated when you call `mlflow.<flavor>.log_model()` method. The latter can be needed in cases that the libraries included in your environment are not the ones you intended to use.
144
+
Requirements for the model to run are specified in the `conda.yaml` file. MLflow can automatically detect dependencies or you can manually indicate them by calling the `mlflow.<flavor>.log_model()` method. The latter can be useful if the libraries included in your environment aren't the ones you intended to use.
142
145
143
-
The following is an example of an environment used for a model created with `fastai` framework:
146
+
The following code is an example of an environment used for a model created with the `fastai` framework:
144
147
145
148
__conda.yaml__
146
149
@@ -164,23 +167,34 @@ name: mlflow-env
164
167
```
165
168
166
169
> [!NOTE]
167
-
> MLflow environments and Azure Machine Learning environments are different concepts. While the former opperates at the level of the model, the latter operates at the level of the workspace (for registered environments) or jobs/deployments (for annonymous environments). When you deploy MLflow models in Azure Machine Learning, the model's environment is built and used for deployment. Alternatively, you can override this behaviour with the [Azure Machine Learning CLI v2](concept-v2.md) and deploy MLflow models using a specific Azure Machine Learning environments.
170
+
> __What's the difference between an MLflow environment and an Azure Machine Learning environment?__
171
+
>
172
+
> While an _MLflow environment_ operates at the level of the model, an _Azure Machine Learning environment_ operates at the level of the workspace (for registered environments) or jobs/deployments (for anonymous environments). When you deploy MLflow models in Azure Machine Learning, the model's environment is built and used for deployment. Alternatively, you can override this behavior with the [Azure Machine Learning CLI v2](concept-v2.md) and deploy MLflow models using a specific Azure Machine Learning environment.
173
+
174
+
### Predict function
168
175
169
-
### Model's predict function
176
+
All MLflow models contain a `predict` function. **This function is called when a model is deployed using a no-code-deployment experience**. What the `predict` function returns (for example, classes, probabilities, or a forecast) depend on the framework (that is, the flavor) used for training. Read the documentation of each flavor to know what they return.
170
177
171
-
All MLflow models contain a `predict` function. **This function is the one that is called when a model is deployed using a no-code-deployment experience**. What the `predict` function returns (classes, probabilities, a forecast, etc.) depend on the framework (i.e. flavor) used for training. Read the documentation of each flavor to know what they return.
178
+
In same cases, you might need to customize this `predict` function to change the way inference is executed. In such cases, you need to [log models with a different behavior in the predict method](how-to-log-mlflow-models.md#logging-models-with-a-different-behavior-in-the-predict-method) or [log a custom model's flavor](how-to-log-mlflow-models.md#logging-custom-models).
172
179
173
-
In same cases, you may need to customize this function to change the way inference is executed. On those cases, you will need to [log models with a different behavior in the predict method](how-to-log-mlflow-models.md#logging-models-with-a-different-behavior-in-the-predict-method) or [log a custom model's flavor](how-to-log-mlflow-models.md#logging-custom-models).
180
+
## Workflows for loading MLflow models
174
181
175
-
## Loading MLflow models back
182
+
You can load models that were created as MLflow models from several locations, including:
176
183
177
-
Models created as MLflow models can be loaded back directly from the run where they were logged, from the file system where they are saved or from the model registry where they are registered. MLflow provides a consistent way to load those models regardless of the location.
184
+
- directly from the run where the models were logged
185
+
- from the file system where they models are saved
186
+
- from the model registry where the models are registered.
187
+
188
+
MLflow provides a consistent way to load these models regardless of the location.
178
189
179
190
There are two workflows available for loading models:
180
191
181
-
* **Loading back the same object and types that were logged:** You can load models using MLflow SDK and obtain an instance of the model with types belonging to the training library. For instance, an ONNX model will return a `ModelProto` while a decision tree trained with Scikit-Learn model will return a `DecisionTreeClassifier` object. Use `mlflow.<flavor>.load_model()` to do so.
182
-
* **Loading back a model for running inference:** You can load models using MLflow SDK and obtain a wrapper where MLflow warranties there will be a `predict` function. It doesn't matter which flavor you are using, every MLflow model needs to implement this contract. Furthermore, MLflow warranties that this function can be called using arguments of type `pandas.DataFrame`, `numpy.ndarray` or `dict[string, numpyndarray]` (depending on the signature of the model). MLflow handles the type conversion to the input type the model actually expects. Use `mlflow.pyfunc.load_model()` to do so.
192
+
* **Load back the same object and types that were logged:** You can load models using the MLflow SDK and obtain an instance of the model with types belonging to the training library. For example, an ONNX model returns a `ModelProto` while a decision tree model trained with scikit-learn returns a `DecisionTreeClassifier` object. Use `mlflow.<flavor>.load_model()` to load back the same model object and types that were logged.
193
+
194
+
* **Load back a model for running inference:** You can load models using the MLflow SDK and obtain a wrapper where MLflow guarantees that there will be a `predict` function. It doesn't matter which flavor you're using, every MLflow model has a `predict` function. Furthermore, MLflow guarantees that this function can be called by using arguments of type `pandas.DataFrame`, `numpy.ndarray`, or `dict[string, numpyndarray]` (depending on the signature of the model). MLflow handles the type conversion to the input type that the model expects. Use `mlflow.pyfunc.load_model()` to load back a model for running inference.
183
195
184
-
## Start logging models
196
+
## Related content
185
197
186
-
We recommend starting taking advantage of MLflow models in Azure Machine Learning. There are different ways to start using the model's concept with MLflow. Read [How to log MLFlow models](how-to-log-mlflow-models.md) to a comprehensive guide.
198
+
* [Configure MLflow for Azure Machine Learning](how-to-use-mlflow-configure-tracking.md)
199
+
* [How to log MLFlow models](how-to-log-mlflow-models.md)
200
+
* [Guidelines for deploying MLflow models](how-to-deploy-mlflow-models.md)
You can inspect your model's signature by opening the MLmodel file associated with your MLflow model. For more information on how signatures work in MLflow, see [Signatures in MLflow](concept-mlflow-models.md#signatures).
57
+
You can inspect your model's signature by opening the MLmodel file associated with your MLflow model. For more information on how signatures work in MLflow, see [Signatures in MLflow](concept-mlflow-models.md#model-signature).
58
58
59
59
> [!TIP]
60
60
> Signatures in MLflow models are optional but they are highly encouraged as they provide a convenient way to early detect data compatibility issues. For more information about how to log models with signatures read [Logging models with a custom signature, environment or samples](how-to-log-mlflow-models.md#logging-models-with-a-custom-signature-environment-or-samples).
0 commit comments