articles/machine-learning/how-to-collect-production-data.md
You can enable data collection for new or existing online endpoint deployments.

If you're interested in collecting production inference data for an MLflow model that is deployed to a real-time endpoint, see [Data collection for MLflow models](#collect-data-for-mlflow-models).
## Prerequisites
# [Azure CLI](#tab/azure-cli)
```python
def predict(input_df):
    # ... run model inference on the input DataFrame ...
    return output_df
```
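The scoring-script pattern above can be sketched end to end with a hypothetical stand-in for the data collector. The real SDK is the `azureml-ai-monitoring` package; the `MockCollector` class below is an assumption for illustration only, showing how inputs and outputs are logged under a shared correlation context:

```python
import uuid
from typing import Any, Optional

class MockCollector:
    """Hypothetical stand-in for a data collector: logs rows with a shared correlation ID."""
    def __init__(self, name: str):
        self.name = name
        self.logged = []  # list of {"correlationid": ..., "data": ...} records

    def collect(self, data: Any, context: Optional[str] = None) -> str:
        # Reuse the caller's correlation context, or autogenerate one per row
        correlationid = context or str(uuid.uuid4())
        self.logged.append({"correlationid": correlationid, "data": data})
        return correlationid

inputs_collector = MockCollector("model_inputs")
outputs_collector = MockCollector("model_outputs")

def predict(input_df):
    context = inputs_collector.collect(input_df)   # log inputs, capture the context
    output_df = [row * 2 for row in input_df]      # placeholder for real inference
    outputs_collector.collect(output_df, context)  # log outputs under the same ID
    return output_df
```

Passing the context returned by the first `collect` call into the second is what lets a monitor later join each output row back to the input row that produced it.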
#### Collect data for model performance monitoring
If you want to use your collected data for model performance monitoring, it's important that each logged row has a unique `correlationid` that can be used to correlate the data with ground truth data, when such data becomes available. The data collector autogenerates a unique `correlationid` for each logged row and includes this autogenerated ID in the `correlationid` field of the JSON object. For more information on the JSON schema, see [store collected data in a blob](#store-collected-data-in-a-blob).
For more information on how to format your deployment YAML for data collection with managed online endpoints, see [CLI (v2) managed online deployment YAML schema](reference-yaml-deployment-managed-online.md).
## Perform payload logging
In addition to custom logging with the provided Python SDK, you can collect request and response HTTP payload data directly without the need to augment your scoring script (`score.py`).
1. To enable payload logging, in your deployment YAML, use the names `request` and `response`:
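A minimal sketch of what that `data_collector` section might look like, based on the managed online deployment YAML schema; verify the exact field names against the schema reference for your CLI version:

```yaml
data_collector:
  collections:
    request:
      enabled: 'True'
    response:
      enabled: 'True'
```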
1. Create your deployment with the YAML file:

    ```azurecli
    az ml online-deployment create -f deployment.yaml
    ```
With payload logging, the collected data is not guaranteed to be in tabular format. Therefore, if you want to use collected payload data with model monitoring, you need to provide a preprocessing component to make the data tabular. If you're interested in a seamless model monitoring experience, we recommend using the [custom logging Python SDK](#perform-custom-logging-for-model-monitoring).
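A minimal sketch of such a preprocessing step, assuming each collected line is a JSON object whose `data` field holds a list of records; the field names here are illustrative, not the documented schema:

```python
import json

def payload_to_rows(jsonl_text: str) -> list:
    """Flatten collected payload events (one JSON object per line) into tabular rows."""
    rows = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        # Promote each field of the nested payload to a column, keep the correlation ID
        for record in event.get("data", []):
            rows.append({"correlationid": event.get("correlationid"), **record})
    return rows
```

A real preprocessing component would also need to handle malformed lines and schema drift between deployments.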
As your deployment is used, the collected data flows to your workspace Blob storage.
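For illustration only, a collected HTTP request event might have roughly the following shape. The field names below (other than `correlationid` and `data`, which the surrounding text describes) are assumptions, not the documented schema:

```json
{
  "specversion": "1.0",
  "type": "azureml.inference.request",
  "datacontenttype": "application/json",
  "time": "2024-05-01T08:27:14Z",
  "correlationid": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "data": [{"f0": 1.0, "f1": 2.0}]
}
```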
### Store collected data in a blob

Data collection allows you to log production inference data to a Blob storage destination of your choice. The data destination settings are configurable at the `collection_name` level.

__Blob storage output/format__:

The collected data follows the following JSON schema.
> [!TIP]
> Line breaks are shown only for readability. In your collected .jsonl files, there won't be any line breaks.
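Because each line in a collected `.jsonl` file is a standalone JSON object, reading the files back is straightforward. This sketch uses a made-up two-line file in place of a real blob download:

```python
import io
import json

# Stand-in for a downloaded .jsonl blob: one JSON object per line, no line
# breaks inside an object (the record contents are made up for illustration)
jsonl = io.StringIO(
    '{"correlationid": "id-1", "data": {"x": 1}}\n'
    '{"correlationid": "id-2", "data": {"x": 2}}\n'
)

# Parse line by line; skip any blank lines
events = [json.loads(line) for line in jsonl if line.strip()]
```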
#### Store large payloads
If your data payload is greater than 4 MB, the event in the `{instance_id}.jsonl` file contained within the `{endpoint_name}/{deployment_name}/request/.../{instance_id}.jsonl` path points to a raw file path instead, which has the following format: `blob_url/{blob_container}/{blob_path}/{endpoint_name}/{deployment_name}/{rolled_time}/{instance_id}.jsonl`. The collected data exists at that path.
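To see whether a given payload would cross that threshold before sending it, you can check its serialized size. This is a sketch; the 4 MB limit comes from the text above:

```python
import json

PAYLOAD_LIMIT_BYTES = 4 * 1024 * 1024  # 4 MB threshold described above

def exceeds_payload_limit(payload: dict) -> bool:
    """Return True if the JSON-serialized payload is larger than 4 MB."""
    return len(json.dumps(payload).encode("utf-8")) > PAYLOAD_LIMIT_BYTES
```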
To view the collected data in Blob Storage from the studio UI:
:::image type="content" source="./media/how-to-collect-production-data/data-view.png" alt-text="Screenshot highlights tree structure of data in Datastore" lightbox="media/how-to-collect-production-data/data-view.png":::
## Collect data for MLflow models

If you're deploying an MLflow model to an Azure Machine Learning online endpoint, you can enable production inference data collection with a single toggle in the studio UI. If data collection is toggled on, Azure Machine Learning auto-instruments your scoring script with custom logging code to ensure that the production data is logged to your workspace Blob Storage. Your model monitors can then use the data to monitor the performance of your MLflow model in production.