articles/machine-learning/concept-model-monitoring.md (4 additions, 0 deletions)
@@ -165,6 +165,10 @@ You can use events generated by Azure Machine Learning model monitoring runs to
When your model monitor detects drift, data quality issues, or model performance degradation, you can track these events with Event Grid and take action programmatically. For example, if the accuracy of your classification model in production dips below a certain threshold, you can use Event Grid to begin a retraining job that uses collected ground truth data. To learn how to integrate Azure Machine Learning with Event Grid, see [Perform continuous model monitoring in Azure Machine Learning](how-to-monitor-model-performance.md).
## Model monitoring limitations
Azure Machine Learning model monitoring only supports accessing data stored in datastores that use credential-based authentication (for example, a SAS token). To learn more about datastores and authentication modes, see [Data administration](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-administrate-data-authentication?view=azureml-api-2).
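As a rough illustration, a blob datastore that satisfies this requirement could be defined with a CLI v2 datastore YAML file along these lines (the datastore name, storage account, container, and token value are hypothetical placeholders):

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/azureBlob.schema.json
# Hypothetical blob datastore that uses credential-based (SAS token) authentication.
name: monitoring_blob_datastore
type: azure_blob
description: Datastore for model monitoring data, using credential-based authentication.
account_name: mystorageaccount          # placeholder storage account name
container_name: model-monitoring-data   # placeholder container name
credentials:
  sas_token: "<sas-token>"              # placeholder; supply a valid SAS token
```

You could then register it with `az ml datastore create --file blob-datastore.yml`.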
## Related content
- [Model data collection](concept-data-collection.md)
articles/machine-learning/how-to-collect-production-data.md (2 additions, 2 deletions)
@@ -279,7 +279,7 @@ The collected data follows the following JSON schema. The collected data is avai
#### Store large payloads
If your data payload is greater than 4 MB, the `{instance_id}.jsonl` file at the `{endpoint_name}/{deployment_name}/request/.../{instance_id}.jsonl` path contains an event that points to a raw file path of the form `blob_url/{blob_container}/{blob_path}/{endpoint_name}/{deployment_name}/{rolled_time}/{instance_id}.jsonl`. The collected data exists at that path.
#### Store binary data
@@ -305,7 +305,7 @@ With collected binary data, we show the raw file directly, with `instance_id` as
To view the collected data in Blob Storage from the studio UI:
1. Go to the **Data** tab in your Azure Machine Learning workspace:
:::image type="content" source="./media/how-to-collect-production-data/datastores.png" alt-text="Screenshot highlights Data page in Azure Machine Learning workspace" lightbox="media/how-to-collect-production-data/datastores.png":::
articles/machine-learning/how-to-monitor-model-performance.md (65 additions, 6 deletions)
@@ -417,17 +417,40 @@ To set up advanced monitoring:
## Set up model performance monitoring
Azure Machine Learning model monitoring enables you to track the performance of your models in production by calculating model performance metrics. The following model performance metrics are currently supported (standard definitions are given after the list):

Classification models:

- Precision
- Accuracy
- Recall

Regression models:

- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
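For reference, the standard definitions of these metrics are shown below, where $TP$, $TN$, $FP$, and $FN$ are the counts of true positives, true negatives, false positives, and false negatives, $y_i$ is the ground truth value, $\hat{y}_i$ is the model's prediction, and $n$ is the number of scored rows:

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN} \\[4pt]
\text{MAE} &= \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i \rvert, \qquad
\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad
\text{RMSE} = \sqrt{\text{MSE}}
\end{aligned}
$$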
Before you can configure your model performance signal, you need to satisfy the following requirements:
* Have production model output data (your model's predictions) with a unique ID for each row. If you collect your data with the [Azure Machine Learning data collector](how-to-collect-production-data.md), a `correlation_id` is provided for each inference request for you. With the data collector, you also have the option to log your own unique ID from your application.
> [!NOTE]
>
> For Azure Machine Learning model performance monitoring, we recommend logging your unique ID as its own column using the [Azure Machine Learning data collector](how-to-collect-production-data.md). This ensures that each collected row has a unique ID.
* Have ground truth data (actuals) with a unique ID for each row. The unique ID for a given row should match the unique ID for the model outputs for that particular inference request. This unique ID is used to join your ground truth dataset with the model outputs, as illustrated in the sketch after this list.
* (Optional) Have a pre-joined tabular dataset with model outputs and ground truth data.
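To make the join requirement concrete, here's a minimal, purely illustrative sketch of a matching pair of records. The field names and values are hypothetical, and the data collector itself writes JSONL rather than YAML; the only point is that both records carry the same unique ID so the two datasets can be joined:

```yaml
# Hypothetical model output record collected in production.
model_output_record:
  correlation_id: "abc-123"   # unique ID logged with the prediction (placeholder value)
  prediction: 1               # the model's output for this inference request

# Hypothetical ground truth record collected later at the application level.
ground_truth_record:
  correlation_id: "abc-123"   # same unique ID, so this row joins to the model output above
  actual: 0                   # the observed outcome (ground truth)
```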
The key requirement for enabling model performance monitoring is that you have already collected ground truth data. Since ground truth data is encountered at the application level, it's your responsibility to collect it as it becomes available. You should also maintain a data asset in Azure Machine Learning with this ground truth data.
### Example scenario
Here is an example scenario to help you understand the concepts associated with model performance monitoring:
Suppose you have deployed a model to predict whether credit card transactions are fraudulent. As part of your deployment, you use the [data collector](how-to-collect-production-data.md) to collect production model input and model output data. You log a unique ID for each row (this unique ID can come from your application, or you can use the generated `correlationid`, which is unique for each logged JSON object). When ground truth data becomes available, it is logged and mapped to the same unique ID that was logged with the model outputs. This `is_fraud` data is collected, maintained, and registered to Azure Machine Learning as a data asset. Then, a model performance monitoring signal can be created to join the model outputs and ground truth datasets on the logged unique ID. Lastly, model performance metrics are computed.
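As a sketch of the registration step in this scenario, the ground truth data could be registered as a data asset with a CLI v2 data YAML file along these lines (the asset name, type, and path are hypothetical placeholders):

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
# Hypothetical data asset holding the collected is_fraud ground truth records.
name: credit-card-fraud-ground-truth
version: "1"
type: uri_folder   # placeholder; the asset could also be registered as an mltable
path: azureml://datastores/workspaceblobstore/paths/ground-truth/   # placeholder path
description: Ground truth (is_fraud) labels, joined to model outputs by unique ID.
```

You could then create the asset with `az ml data create --file ground-truth.yml` and register new versions as more ground truth data arrives.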
The studio currently doesn't support model performance monitoring.
To set up model performance monitoring:
1. Complete the entries on the **Basic settings** page as described earlier in the [Set up out-of-box model monitoring](#set-up-out-of-box-model-monitoring) section.
1. Select **Next** to open the **Configure data asset** page of the **Advanced settings** section.
1. **Add** a dataset to be used as the ground truth dataset. Ensure that your model outputs dataset is also included. The ground truth dataset you add should have a unique ID for each row that matches a unique ID for each row in the model outputs. This is necessary for the two datasets to be joined prior to metric computation.
<!-- TODO: This screenshot should show adding model_outputs and the ground truth data asset. -->
:::image type="content" source="media/how-to-monitor-models/model-monitoring-advanced-config-data.png" alt-text="Screenshot showing how to add datasets for the monitoring signals to use." lightbox="media/how-to-monitor-models/model-monitoring-advanced-config-data.png":::
1. Select **Next** to go to the **Select monitoring signals** page. On this page, you will see some monitoring signals already added (if you selected an Azure Machine Learning online deployment earlier).
1. Select **Add** to open the **Edit Signal** window.
1. Select **Model performance (preview)** to configure the model performance signal as follows:
1. Select the production data asset with your model outputs and the desired lookback window size and lookback window offset. Select the appropriate target column (for example, `is_fraud`).
1. Select the reference data asset, which should be the ground truth data asset you added earlier. Select the appropriate target column. Then select the column to join with the model outputs. This column should be common to both datasets and contain a unique ID for each row (for example, `correlationid`).
1. Select your desired performance metrics and the respective thresholds.
<!-- TODO: This screenshot should show a fully configured model performance view. -->
:::image type="content" source="media/how-to-monitor-models/model-monitoring-configure-feature-attribution-drift.png" alt-text="Screenshot showing how to configure feature attribution drift signal." lightbox="media/how-to-monitor-models/model-monitoring-configure-feature-attribution-drift.png":::
1. Select **Save** to return to the **Select monitoring signals** page.
<!-- TODO: This screenshot should show the model performance signal configured. -->
:::image type="content" source="media/how-to-monitor-models/model-monitoring-configured-signals.png" alt-text="Screenshot showing the configured signals." lightbox="media/how-to-monitor-models/model-monitoring-configured-signals.png":::
1. When you're finished with your monitoring signals configuration, select **Next** to go to the **Notifications** page.
1. On the **Notifications** page, enable alert notifications for each signal and select **Next**.
1. Review your settings on the **Review monitoring settings** page.
<!-- TODO: This screenshot should show the review page with model performance added. -->
:::image type="content" source="media/how-to-monitor-models/model-monitoring-advanced-config-review.png" alt-text="Screenshot showing review page of the advanced configuration for model monitoring." lightbox="media/how-to-monitor-models/model-monitoring-advanced-config-review.png":::
1. Select **Create** to create your model performance monitor.
As the data used to train the model evolves in production, the distribution of the data can shift, resulting in a mismatch between the training data and the real-world data that the model makes predictions on. Data drift is a phenomenon that occurs in machine learning when the statistical properties of the input data change over time.
| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
|`type`| String |**Required**. Type of monitoring signal. Prebuilt monitoring signal processing component is automatically loaded according to the `type` specified here. |`data_drift`|`data_drift`|
@@ -112,7 +111,6 @@ As the data used to train the model evolves in production, the distribution of t
|`metric_thresholds.numerical`| Object | Optional. List of metrics and thresholds in `key:value` format, `key` is the metric name, `value` is the threshold. | Allowed numerical metric names: `jensen_shannon_distance`, `normalized_wasserstein_distance`, `population_stability_index`, `two_sample_kolmogorov_smirnov_test`||
|`metric_thresholds.categorical`| Object | Optional. List of metrics and thresholds in `key:value` format, `key` is the metric name, `value` is the threshold. | Allowed `categorical` metric names: `jensen_shannon_distance`, `chi_squared_test`, `population_stability_index`||
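For illustration, here's an abridged sketch of a `data_drift` signal entry that uses only the keys and metric names documented in this table. The signal name and threshold values are hypothetical, and a complete signal also needs production and reference data settings that aren't shown here:

```yaml
# Abridged, hypothetical data drift signal; only keys documented above are shown.
my_data_drift_signal:
  type: data_drift
  metric_thresholds:
    numerical:
      jensen_shannon_distance: 0.01      # placeholder threshold
    categorical:
      population_stability_index: 0.02   # placeholder threshold
```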
#### Prediction drift
Prediction drift tracks changes in the distribution of a model's prediction outputs by comparing it to validation or test labeled data or recent past production data.
@@ -136,7 +134,6 @@ Prediction drift tracks changes in the distribution of a model's prediction outp
|`metric_thresholds.numerical`| Object | Optional. List of metrics and thresholds in `key:value` format, `key` is the metric name, `value` is the threshold. | Allowed numerical metric names: `jensen_shannon_distance`, `normalized_wasserstein_distance`, `population_stability_index`, `two_sample_kolmogorov_smirnov_test`||
|`metric_thresholds.categorical`| Object | Optional. List of metrics and thresholds in `key:value` format, `key` is the metric name, `value` is the threshold. | Allowed `categorical` metric names: `jensen_shannon_distance`, `chi_squared_test`, `population_stability_index`||
#### Data quality
The data quality signal tracks data quality issues in production by comparing production data to training data or recent past production data.
@@ -161,7 +158,7 @@ Data quality signal tracks data quality issues in production by comparing to tra
|`metric_thresholds.numerical`| Object |**Optional**. List of metrics and thresholds in `key:value` format, `key` is the metric name, `value` is the threshold. | Allowed numerical metric names: `data_type_error_rate`, `null_value_rate`, `out_of_bounds_rate`||
|`metric_thresholds.categorical`| Object |**Optional**. List of metrics and thresholds in `key:value` format, `key` is the metric name, `value` is the threshold. | Allowed `categorical` metric names: `data_type_error_rate`, `null_value_rate`, `out_of_bounds_rate`||
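Similarly, an abridged, hypothetical sketch of a `data_quality` signal entry that uses only the metric names listed in this table (threshold values are placeholders, and the signal's other required keys are omitted):

```yaml
# Abridged, hypothetical data quality signal; thresholds are placeholder rates.
my_data_quality_signal:
  type: data_quality
  metric_thresholds:
    numerical:
      null_value_rate: 0.05        # placeholder threshold
      out_of_bounds_rate: 0.05     # placeholder threshold
    categorical:
      data_type_error_rate: 0.05   # placeholder threshold
```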
#### Feature attribution drift (preview)
The feature attribution of a model may change over time due to changes in the distribution of data, changes in the relationships between features, or changes in the underlying problem being solved. Feature attribution drift is a phenomenon that occurs in machine learning models when the importance or contribution of features to the prediction output changes over time.
@@ -182,27 +179,29 @@ The feature attribution of a model may change over time due to changes in the di
|`alert_enabled`| Boolean | Turn on/off alert notification for the monitoring signal. `True` or `False`|||
|`metric_thresholds`| Object | Metric name and threshold for feature attribution drift in `key:value` format, where `key` is the metric name, and `value` is the threshold. When threshold is exceeded and `alert_enabled` is on, user will receive alert notification. | Allowed metric name: `normalized_discounted_cumulative_gain`||
#### Model performance (preview)
Model performance tracks the objective performance of a model in production by computing model performance metrics, such as accuracy, precision, and recall for classification models, and MAE, MSE, and RMSE for regression models. Computing these metrics requires joining collected model output data with ground truth data on a unique ID.
## Remarks
The `az ml schedule` command can be used for managing Azure Machine Learning model monitoring schedules.
## Examples
Monitoring CLI examples are available in the [examples GitHub repository](https://github.com/Azure/azureml-examples/tree/main/cli/monitoring). A couple are as follows: