Commit 306cf31

Merge pull request #261723 from msakande/freshness-for-concept-mlflow-models-article
freshness review: mlflow models concept article
2 parents 61d63c6 + 9a9b42b commit 306cf31

5 files changed (+58, -44 lines)


articles/machine-learning/concept-mlflow-models.md

Lines changed: 55 additions & 41 deletions
@@ -1,31 +1,32 @@
 ---
 title: From artifacts to models in MLflow
 titleSuffix: Azure Machine Learning
-description: Learn about how MLflow uses the concept of models instead of artifacts to represent your trained models and enable a streamlined path to deployment.
+description: Learn how MLflow uses the concept of models instead of artifacts to represent your trained models and enable a streamlined path to deployment.
 services: machine-learning
 author: santiagxf
 ms.author: fasantia
 ms.reviewer: mopeakande
+reviewer: msakande
 ms.service: machine-learning
 ms.subservice: mlops
-ms.date: 11/04/2022
+ms.date: 12/20/2023
 ms.topic: conceptual
 ms.custom: cliv2, sdkv2
 ---

 # From artifacts to models in MLflow

-The following article explains the differences between an artifact and a model in MLflow and how to transition from one to the other. It also explains how Azure Machine Learning uses the MLflow model's concept to enabled streamlined deployment workflows.
+The following article explains the differences between an MLflow artifact and an MLflow model, and how to transition from one to the other. It also explains how Azure Machine Learning uses the concept of an MLflow model to enable streamlined deployment workflows.

 ## What's the difference between an artifact and a model?

-If you are not familiar with MLflow, you may not be aware of the difference between logging artifacts or files vs. logging MLflow models. There are some fundamental differences between the two:
+If you're not familiar with MLflow, you might not be aware of the difference between logging artifacts or files vs. logging MLflow models. There are some fundamental differences between the two:

-### Artifacts
+### Artifact

-Any file generated (and captured) from an experiment's run or job is an artifact. It may represent a model serialized as a Pickle file, the weights of a PyTorch or TensorFlow model, or even a text file containing the coefficients of a linear regression. Other artifacts can have nothing to do with the model itself, but they can contain configuration to run the model, pre-processing information, sample data, etc. As you can see, an artifact can come in any format.
+An _artifact_ is any file that's generated (and captured) from an experiment's run or job. An artifact could represent a model serialized as a pickle file, the weights of a PyTorch or TensorFlow model, or even a text file containing the coefficients of a linear regression. Some artifacts could also have nothing to do with the model itself; rather, they could contain configurations to run the model, or preprocessing information, or sample data, and so on. Artifacts can come in various formats.

-You may have been logging artifacts already:
+You might have been logging artifacts already:

 ```python
 filename = 'model.pkl'
@@ -35,18 +36,18 @@ with open(filename, 'wb') as f:
 mlflow.log_artifact(filename)
 ```

-### Models
+### Model

-A model in MLflow is also an artifact. However, we make stronger assumptions about this type of artifacts. Such assumptions provide a clear contract between the saved files and what they mean. When you log your models as artifacts (simple files), you need to know what the model builder meant for each of them in order to know how to load the model for inference. On the contrary, MLflow models can be loaded using the contract specified in the [The MLModel format](concept-mlflow-models.md#the-mlmodel-format).
+A _model_ in MLflow is also an artifact. However, we make stronger assumptions about this type of artifact. Such assumptions provide a clear contract between the saved files and what they mean. When you log your models as artifacts (simple files), you need to know what the model builder meant for each of those files so as to know how to load the model for inference. In contrast, MLflow models can be loaded by using the contract specified in [The MLmodel format](concept-mlflow-models.md#the-mlmodel-format).

 In Azure Machine Learning, logging models has the following advantages:
-> [!div class="checklist"]
-> * You can deploy them on real-time or batch endpoints without providing an scoring script nor an environment.
-> * When deployed, Model's deployments have a Swagger generated automatically and the __Test__ feature can be used in Azure Machine Learning studio.
-> * Models can be used as pipelines inputs directly.
-> * You can use the [Responsible AI dashbord (preview)](how-to-responsible-ai-dashboard.md).

-Models can get logged by using MLflow SDK:
+* You can deploy them to real-time or batch endpoints without providing a scoring script or an environment.
+* When you deploy models, the deployments automatically have a Swagger file generated, and the __Test__ feature can be used in Azure Machine Learning studio.
+* You can use the models directly as pipeline inputs.
+* You can use the [Responsible AI dashboard](how-to-responsible-ai-dashboard.md) with your models.
+
+You can log models by using the MLflow SDK:

 ```python
 import mlflow
@@ -56,11 +57,13 @@ mlflow.sklearn.log_model(sklearn_estimator, "classifier")
 
 ## The MLmodel format
 
-MLflow adopts the MLmodel format as a way to create a contract between the artifacts and what they represent. The MLmodel format stores assets in a folder. Among them, there is a particular file named MLmodel. This file is the single source of truth about how a model can be loaded and used.
+MLflow adopts the MLmodel format as a way to create a contract between the artifacts and what they represent. The MLmodel format stores assets in a folder. Among these assets, there's a file named `MLmodel`. This file is the single source of truth about how a model can be loaded and used.
+
+The following screenshot shows a sample MLflow model's folder in Azure Machine Learning studio. The model is placed in a folder called `credit_defaults_model`. There is no specific requirement on the naming of this folder. The folder contains the `MLmodel` file among other model artifacts.
 
-![a sample MLflow model in MLmodel format](media/concept-mlflow-models/mlflow-mlmodel.png)
+:::image type="content" source="media/concept-mlflow-models/mlflow-mlmodel.png" alt-text="A screenshot showing assets of a sample MLflow model, including the MLmodel file." lightbox="media/concept-mlflow-models/mlflow-mlmodel.png":::
 
-The following example shows how the `MLmodel` file for a computer version model trained with `fastai` may look like:
+The following code is an example of what the `MLmodel` file for a computer vision model trained with `fastai` might look like:
 
 __MLmodel__
 
@@ -88,11 +91,11 @@ signature:
 }]'
 ```
 
-### The model's flavors
+### Model flavors

-Considering the variety of machine learning frameworks available to use, MLflow introduced the concept of flavor as a way to provide a unique contract to work across all of them. A flavor indicates what to expect for a given model created with a specific framework. For instance, TensorFlow has its own flavor, which specifies how a TensorFlow model should be persisted and loaded. Because each model flavor indicates how they want to persist and load models, the MLModel format doesn't enforce a single serialization mechanism that all the models need to support. Such decision allows each flavor to use the methods that provide the best performance or best support according to their best practices - without compromising compatibility with the MLModel standard.
+Considering the large number of machine learning frameworks available to use, MLflow introduced the concept of _flavor_ as a way to provide a unique contract to work across all machine learning frameworks. A flavor indicates what to expect for a given model that's created with a specific framework. For instance, TensorFlow has its own flavor, which specifies how a TensorFlow model should be persisted and loaded. Because each model flavor indicates how to persist and load the model for a given framework, the MLmodel format doesn't enforce a single serialization mechanism that all models must support. This decision allows each flavor to use the methods that provide the best performance or best support according to their best practices, without compromising compatibility with the MLmodel standard.

-The following is an example of the `flavors` section for an `fastai` model.
+The following code is an example of the `flavors` section for a `fastai` model.

 ```yaml
 flavors:
@@ -106,18 +109,18 @@ flavors:
 python_version: 3.8.12
 ```

-### Signatures
+### Model signature

-[Model signatures in MLflow](https://www.mlflow.org/docs/latest/models.html#model-signature) are an important part of the model specification, as they serve as a data contract between the model and the server running our models. They are also important for parsing and enforcing model's input's types at deployment time. [MLflow enforces types when data is submitted to your model if a signature is available](https://www.mlflow.org/docs/latest/models.html#signature-enforcement).
+A [model signature in MLflow](https://www.mlflow.org/docs/latest/models.html#model-signature) is an important part of the model's specification, as it serves as a data contract between the model and the server running the model. A model signature is also important for parsing and enforcing a model's input types at deployment time. If a signature is available, MLflow enforces input types when data is submitted to your model. For more information, see [MLflow signature enforcement](https://www.mlflow.org/docs/latest/models.html#signature-enforcement).

-Signatures are indicated when the model gets logged and persisted in the `MLmodel` file, in the `signature` section. **Autolog's** feature in MLflow automatically infers signatures in a best effort way. However, it may be required to [log the models manually if the signatures inferred are not the ones you need](https://www.mlflow.org/docs/latest/models.html#how-to-log-models-with-signatures).
+Signatures are indicated when models get logged, and they're persisted in the `signature` section of the `MLmodel` file. The **Autolog** feature in MLflow automatically infers signatures in a best-effort way. However, you might have to log the models manually if the inferred signatures aren't the ones you need. For more information, see [How to log models with signatures](https://www.mlflow.org/docs/latest/models.html#how-to-log-models-with-signatures).

 There are two types of signatures:

-* **Column-based signature** corresponding to signatures that operate to tabular data. For models with this signature, MLflow supplies `pandas.DataFrame` objects as inputs.
-* **Tensor-based signature:** corresponding to signatures that operate with n-dimensional arrays or tensors. For models with this signature, MLflow supplies `numpy.ndarray` as inputs (or a dictionary of `numpy.ndarray` in the case of named-tensors).
+* **Column-based signature**: This signature operates on tabular data. For models with this type of signature, MLflow supplies `pandas.DataFrame` objects as inputs.
+* **Tensor-based signature**: This signature operates on n-dimensional arrays or tensors. For models with this signature, MLflow supplies `numpy.ndarray` objects as inputs (or a dictionary of `numpy.ndarray` objects in the case of named tensors).

-The following example corresponds to a computer vision model trained with `fastai`. This model receives a batch of images represented as tensors of shape `(300, 300, 3)` with the RGB representation of them (unsigned integers). It outputs batches of predictions (probabilities) for two classes.
+The following example corresponds to a computer vision model trained with `fastai`. This model receives a batch of images represented as tensors of shape `(300, 300, 3)` with their RGB representation (unsigned integers). The model outputs batches of predictions (probabilities) for two classes.

 __MLmodel__

@@ -134,13 +137,13 @@ signature:
 ```
 
 > [!TIP]
-> Azure Machine Learning generates Swagger for model's deployment in MLflow format with a signature available. This makes easier to test deployed endpoints using the Azure Machine Learning studio.
+> Azure Machine Learning generates a Swagger file for a deployment of an MLflow model that has a signature available. This file makes it easier to test deployments by using Azure Machine Learning studio.

-### Model's environment
+### Model environment

-Requirements for the model to run are specified in the `conda.yaml` file. Dependencies can be automatically detected by MLflow or they can be manually indicated when you call `mlflow.<flavor>.log_model()` method. The latter can be needed in cases that the libraries included in your environment are not the ones you intended to use.
+Requirements for the model to run are specified in the `conda.yaml` file. MLflow can automatically detect dependencies, or you can indicate them manually when you call the `mlflow.<flavor>.log_model()` method. The latter can be useful if the libraries included in your environment aren't the ones you intended to use.

-The following is an example of an environment used for a model created with `fastai` framework:
+The following code is an example of an environment used for a model created with the `fastai` framework:

 __conda.yaml__

@@ -164,23 +167,34 @@ name: mlflow-env
 ```
 
 > [!NOTE]
-> MLflow environments and Azure Machine Learning environments are different concepts. While the former opperates at the level of the model, the latter operates at the level of the workspace (for registered environments) or jobs/deployments (for annonymous environments). When you deploy MLflow models in Azure Machine Learning, the model's environment is built and used for deployment. Alternatively, you can override this behaviour with the [Azure Machine Learning CLI v2](concept-v2.md) and deploy MLflow models using a specific Azure Machine Learning environments.
+> __What's the difference between an MLflow environment and an Azure Machine Learning environment?__
+>
+> While an _MLflow environment_ operates at the level of the model, an _Azure Machine Learning environment_ operates at the level of the workspace (for registered environments) or jobs/deployments (for anonymous environments). When you deploy MLflow models in Azure Machine Learning, the model's environment is built and used for deployment. Alternatively, you can override this behavior with the [Azure Machine Learning CLI v2](concept-v2.md) and deploy MLflow models by using a specific Azure Machine Learning environment.
+
+### Predict function

-### Model's predict function
+All MLflow models contain a `predict` function. **This function is called when a model is deployed by using a no-code-deployment experience**. What the `predict` function returns (for example, classes, probabilities, or a forecast) depends on the framework (that is, the flavor) used for training. Read the documentation of each flavor to know what it returns.

-All MLflow models contain a `predict` function. **This function is the one that is called when a model is deployed using a no-code-deployment experience**. What the `predict` function returns (classes, probabilities, a forecast, etc.) depend on the framework (i.e. flavor) used for training. Read the documentation of each flavor to know what they return.
+In some cases, you might need to customize this `predict` function to change the way inference is executed. In such cases, you need to [log models with a different behavior in the predict method](how-to-log-mlflow-models.md#logging-models-with-a-different-behavior-in-the-predict-method) or [log a custom model's flavor](how-to-log-mlflow-models.md#logging-custom-models).
+## Workflows for loading MLflow models
 
-## Loading MLflow models back
+You can load models that were created as MLflow models from several locations, including:
 
-Models created as MLflow models can be loaded back directly from the run where they were logged, from the file system where they are saved or from the model registry where they are registered. MLflow provides a consistent way to load those models regardless of the location.
+- directly from the run where the models were logged
+- from the file system where the models are saved
+- from the model registry where the models are registered
+
+MLflow provides a consistent way to load these models regardless of the location.
 
 There are two workflows available for loading models:
 
-* **Loading back the same object and types that were logged:** You can load models using MLflow SDK and obtain an instance of the model with types belonging to the training library. For instance, an ONNX model will return a `ModelProto` while a decision tree trained with Scikit-Learn model will return a `DecisionTreeClassifier` object. Use `mlflow.<flavor>.load_model()` to do so.
-* **Loading back a model for running inference:** You can load models using MLflow SDK and obtain a wrapper where MLflow warranties there will be a `predict` function. It doesn't matter which flavor you are using, every MLflow model needs to implement this contract. Furthermore, MLflow warranties that this function can be called using arguments of type `pandas.DataFrame`, `numpy.ndarray` or `dict[string, numpyndarray]` (depending on the signature of the model). MLflow handles the type conversion to the input type the model actually expects. Use `mlflow.pyfunc.load_model()` to do so.
+* **Load back the same object and types that were logged:** You can load models by using the MLflow SDK and obtain an instance of the model with types belonging to the training library. For example, an ONNX model returns a `ModelProto`, while a decision tree model trained with scikit-learn returns a `DecisionTreeClassifier` object. Use `mlflow.<flavor>.load_model()` to load back the same model object and types that were logged.
+
+* **Load back a model for running inference:** You can load models by using the MLflow SDK and obtain a wrapper where MLflow guarantees that there will be a `predict` function. It doesn't matter which flavor you're using, because every MLflow model has a `predict` function. Furthermore, MLflow guarantees that this function can be called by using arguments of type `pandas.DataFrame`, `numpy.ndarray`, or `dict[string, numpy.ndarray]` (depending on the signature of the model). MLflow handles the type conversion to the input type that the model expects. Use `mlflow.pyfunc.load_model()` to load back a model for running inference.
 
-## Start logging models
+## Related content
 
-We recommend starting taking advantage of MLflow models in Azure Machine Learning. There are different ways to start using the model's concept with MLflow. Read [How to log MLFlow models](how-to-log-mlflow-models.md) to a comprehensive guide.
+* [Configure MLflow for Azure Machine Learning](how-to-use-mlflow-configure-tracking.md)
+* [How to log MLflow models](how-to-log-mlflow-models.md)
+* [Guidelines for deploying MLflow models](how-to-deploy-mlflow-models.md)

articles/machine-learning/how-to-deploy-mlflow-models.md

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ __MLmodel__
 
 :::code language="yaml" source="~/azureml-examples-main/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/MLmodel" highlight="13-19":::
 
-You can inspect your model's signature by opening the MLmodel file associated with your MLflow model. For more information on how signatures work in MLflow, see [Signatures in MLflow](concept-mlflow-models.md#signatures).
+You can inspect your model's signature by opening the MLmodel file associated with your MLflow model. For more information on how signatures work in MLflow, see [Signatures in MLflow](concept-mlflow-models.md#model-signature).
 
 > [!TIP]
 > Signatures in MLflow models are optional, but they're highly encouraged, as they provide a convenient way to detect data compatibility issues early. For more information about how to log models with signatures, read [Logging models with a custom signature, environment or samples](how-to-log-mlflow-models.md#logging-models-with-a-custom-signature-environment-or-samples).
