articles/machine-learning/how-to-collect-production-data.md
Lines changed: 9 additions & 3 deletions
@@ -94,15 +94,15 @@ First, you'll need to add custom logging code to your scoring script (`score.py`
> [!NOTE]
> Currently, only pandas DataFrames can be logged with the `collect()` API. If the data isn't in a DataFrame when it's passed to `collect()`, it won't be logged to storage and an error will be reported.
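
As a minimal sketch of what that requirement looks like inside a scoring script's `run()` function: the request shape, the `data` key, and the empty return value below are illustrative, and `inputs_collector` is assumed to have been created in `init()` as in the full example that follows.

```python
import json

import pandas as pd


def run(raw_data):
    payload = json.loads(raw_data)            # hypothetical JSON request body
    input_df = pd.DataFrame(payload["data"])  # collect() accepts only pandas DataFrames

    # Log the model inputs; the returned context can be used to correlate inputs with outputs.
    context = inputs_collector.collect(input_df)

    # ... run inference here, then log outputs with outputs_collector.collect(output_df, context) ...
    return {}
```
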
-The following code is an example of a full scoring script (`score.py`) that uses the custom logging Python SDK:
+The following code is an example of a full scoring script (`score.py`) that uses the custom logging Python SDK. In this example, a third `Collector` called `inputs_outputs_collector` logs a joined DataFrame of the `model_inputs` and the `model_outputs`. This joined DataFrame enables additional monitoring signals, such as feature attribution drift. If you're not interested in those monitoring signals, you can remove this `Collector`.
```python
import pandas as pd
import json
from azureml.ai.monitoring import Collector
def init():
-    global inputs_collector, outputs_collector
+    global inputs_collector, outputs_collector, inputs_outputs_collector
    # Instantiate collectors with appropriate names; make sure they align with the deployment spec.
    inputs_collector = Collector(name='model_inputs')
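    outputs_collector = Collector(name='model_outputs')
    # Sketch only: the third collector's name is assumed to match the
    # model_inputs_outputs collection enabled in the deployment YAML below.
    inputs_outputs_collector = Collector(name='model_inputs_outputs')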
@@ -172,9 +172,11 @@ data_collector:
      enabled: 'True'
    model_outputs:
      enabled: 'True'
    model_inputs_outputs:
      enabled: 'True'
```
-The following code is an example of a comprehensive deployment YAML for a managed online endpoint deployment. You should update the deployment YAML according to your scenario.
+The following code is an example of a comprehensive deployment YAML for a managed online endpoint deployment. You should update the deployment YAML according to your scenario. For more examples of how to format your deployment YAML for inference data logging, see [https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/data-collector](https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/data-collector).
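
For orientation before the full example, here's a minimal sketch of the shape such a deployment YAML can take. The deployment name, endpoint, model, environment, code folder, and instance settings are placeholders to replace with your own values.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: my-endpoint
model: azureml:my-model:1
code_configuration:
  code: scripts/              # folder that contains score.py
  scoring_script: score.py
environment: azureml:my-environment:1
instance_type: Standard_F2s_v2
instance_count: 1
data_collector:
  collections:
    model_inputs:
      enabled: 'True'
    model_outputs:
      enabled: 'True'
    model_inputs_outputs:
      enabled: 'True'
```

The collection names under `data_collector.collections` must align with the `Collector` names used in `score.py`.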
## Set up model monitoring for models deployed outside of Azure Machine Learning
## Set up model monitoring by bringing your own production data to Azure Machine Learning
-You can also set up model monitoring for models deployed to Azure Machine Learning batch endpoints or deployed outside of Azure Machine Learning. To monitor these models, you must meet the following requirements:
+You can also set up model monitoring for models deployed to Azure Machine Learning batch endpoints or deployed outside of Azure Machine Learning. If you have production data but no deployment, you can use the data to perform continuous model monitoring. To monitor these models, you must meet the following requirements:
* You have a way to collect production inference data from models deployed in production.
* You can register the collected production inference data as an Azure Machine Learning data asset, and ensure continuous updates of the data. One way to do the registration is shown in the CLI sketch after this list.
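
For instance, you can register the collected data as a versioned `uri_folder` data asset with the Azure CLI. This is a sketch only; the asset name, version, and datastore path are placeholders.

```azurecli
az ml data create --name myproduction_inference_data --version 1 \
    --type uri_folder \
    --path azureml://datastores/workspaceblobstore/paths/production-inference-data/
```
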
@@ -516,7 +516,6 @@ You can also set up model monitoring for models deployed to Azure Machine Learni
| input | input_data | uri_folder | The collected production inference data, which is registered as an Azure Machine Learning data asset. | azureml:myproduction_inference_data:1 |
| output | preprocessed_data | mltable | A tabular dataset, which matches a subset of the baseline data schema. ||
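
The preprocessing component itself isn't shown here, but as a rough sketch (the function name, file pattern, and paths are illustrative, and it assumes the collected data is in the JSONL format produced by the data collector), it can read the `uri_folder` input and save an MLTable to the output folder with the `mltable` Python library:

```python
import glob
import os

import mltable


def preprocess(input_data_dir: str, preprocessed_data_dir: str) -> None:
    # Gather the collected JSONL inference files from the uri_folder input.
    jsonl_paths = [
        {"file": path}
        for path in glob.glob(os.path.join(input_data_dir, "**", "*.jsonl"), recursive=True)
    ]

    # Build a tabular MLTable from the JSONL files and save its definition
    # to the output folder so it can be consumed as an mltable output.
    tbl = mltable.from_json_lines_files(paths=jsonl_paths)
    tbl.save(preprocessed_data_dir)
```
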
# [Azure CLI](#tab/azure-cli)
Once you've satisfied the previous requirements, you can set up model monitoring with the following CLI command and YAML definition:
The studio currently doesn't support monitoring for models that are deployed outside of Azure Machine Learning. See the Azure CLI or Python tabs instead.
---
## Set up model monitoring with custom signals and metrics
With Azure Machine Learning model monitoring, you can define your own custom signal and implement any metric of your choice to monitor your model. You can register this signal as an Azure Machine Learning component. When your Azure Machine Learning model monitoring job runs on the specified schedule, it computes the metrics you've defined within your custom signal, just as it does for the prebuilt signals (data drift, prediction drift, data quality, and feature attribution drift). To get started with defining your own custom signal, you must meet the following requirement:
* You must define your custom signal and register it as an Azure Machine Learning component. The Azure Machine Learning component must have these input and output signatures:
### Component input signature
The component input DataFrame should contain an `mltable` with the processed data from the preprocessing component and any number of literals, each representing an implemented metric as part of the custom signal component. For example, if you have implemented one metric, `std_deviation`, then you'll need an input for `std_deviation_threshold`. Generally, there should be one input per metric with the name `{metric_name}_threshold`.
| signature name | type | description | example value |
|---|---|---|---|
| production_data | mltable | A tabular dataset, which matches a subset of the baseline data schema. ||
| std_deviation_threshold | literal, string | Respective threshold for the implemented metric. | 2 |
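
For illustration, the inputs section of a matching component spec could look like the following sketch. The file name and the omitted `command`, `code`, `environment`, and `outputs` sections are placeholders; the component name and version match the `azureml:my_custom_signal:1.0.0` ID referenced in the schedule YAML later in this section.

```yaml
# custom_signal_spec.yaml (sketch; command, code, environment, and outputs omitted)
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
name: my_custom_signal
version: 1.0.0
type: command
inputs:
  production_data:
    type: mltable
  std_deviation_threshold:
    type: string
```
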
### Component output signature
The component output DataFrame should contain four columns: `group`, `metric_name`, `metric_value`, and `threshold_value`:
| signature name | type | description | example value |
|---|---|---|---|
| group | literal, string | Top-level logical grouping to be applied to this custom metric. | TRANSACTIONAMOUNT |
| metric_name | literal, string | The name of the custom metric. | std_deviation |
| metric_value | mltable | The value of the custom metric. | 44,896.082 |
| threshold_value || The threshold for the custom metric. | 2 |
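
As a sketch of how the component's code might assemble that output, assuming a hypothetical `TRANSACTIONAMOUNT` feature column and a single `std_deviation` metric (the function name and computation are illustrative):

```python
import pandas as pd


def build_signal_output(production_df: pd.DataFrame, std_deviation_threshold: float) -> pd.DataFrame:
    # Compute the custom metric over the hypothetical feature column.
    std_deviation = production_df["TRANSACTIONAMOUNT"].std()

    # Emit one row per group/metric with the four required columns.
    return pd.DataFrame(
        [
            {
                "group": "TRANSACTIONAMOUNT",
                "metric_name": "std_deviation",
                "metric_value": std_deviation,
                "threshold_value": std_deviation_threshold,
            }
        ]
    )
```
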
Here's an example output from a custom signal component that computes the `std_deviation` metric:
| group | metric_value | metric_name | threshold_value |
An example custom signal component definition and metric computation code can be found in our GitHub repo at [https://github.com/Azure/azureml-examples/tree/main/cli/monitoring/components/custom_signal](https://github.com/Azure/azureml-examples/tree/main/cli/monitoring/components/custom_signal).
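
Once the component spec and metric code are in place, one way to register the component is with the Azure CLI; this is a sketch, and the spec file name is a placeholder.

```azurecli
az ml component create -f ./custom_signal_spec.yaml
```
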
# [Azure CLI](#tab/azure-cli)
Once you've satisfied the previous requirements, you can set up model monitoring with the following CLI command and YAML definition:
```azurecli
az ml schedule create -f ./custom-monitoring.yaml
```
The following YAML contains the definition for model monitoring with a custom signal. It's assumed that you've already created your component with the custom signal definition and registered it in Azure Machine Learning. In this example, the `component_id` of the registered custom signal component is `azureml:my_custom_signal:1.0.0`:
```yaml
# custom-monitoring.yaml
$schema: http://azureml/sdk-2-0/Schedule.json
name: my-custom-signal
trigger:
  type: recurrence
  frequency: day # can be minute, hour, day, week, month