Use the Parse JSON processor to parse the `message` field so that its attributes are available as a nested object.
{{< img src="observability_pipelines/processors/parse-json-example.png" alt="The parse json processor with message as the field to parse on" style="width:60%;" >}}
This output contains the `message` field with the parsed JSON:
```json
{
  "foo": "bar",
  "team": "my-team",
  "message": {
    "action": "login",
    "ip_address": "192.168.1.100",
    "level": "info",
    "service": "user-service",
    "success": true,
    "timestamp": "2024-01-15T10:30:00Z",
    "user_id": "12345"
  },
  "app_id": "streaming-services",
  "ddtags": [
    "kube_service:my-service",
    "k8_deployment:your-host"
  ]
}
```
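For comparison, the incoming log might look something like this before the processor runs, with the JSON payload stored as an escaped string in `message`. This is a hedged sketch; the field values simply mirror the parsed output above.

```json
{
  "foo": "bar",
  "team": "my-team",
  "message": "{\"action\":\"login\",\"ip_address\":\"192.168.1.100\",\"level\":\"info\",\"service\":\"user-service\",\"success\":true,\"timestamp\":\"2024-01-15T10:30:00Z\",\"user_id\":\"12345\"}",
  "app_id": "streaming-services",
  "ddtags": [
    "kube_service:my-service",
    "k8_deployment:your-host"
  ]
}
```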
## Setup

To set up this processor:
1. Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
2. Enter the name of the field you want to parse JSON on.<br>**Note**: The parsed JSON overwrites what was originally contained in the field.
## Overview

This processor parses Extensible Markup Language (XML) so the data can be processed and sent to different destinations. XML is a log format used to store and transport structured data. It is organized in a tree-like structure to represent nested information and uses tags and attributes to define the data. For example, this is XML data using only tags (`<recipe>`, `<type>`, and `<name>`) and no attributes:
```xml
<recipe>
  <type>pasta</type>
  <name>Carbonara</name>
</recipe>
```
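For reference, parsing the tags-only example above would produce a nested object along these lines. This is an illustrative sketch; the exact output depends on the options described in the setup steps below.

```json
{
  "recipe": {
    "type": "pasta",
    "name": "Carbonara"
  }
}
```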
This is an XML example where the tag `recipe` has the attribute `type`:

```xml
<recipe type="pasta">
  <name>Carbonara</name>
</recipe>
```
The following image shows a Windows Event 4625 log in XML, next to the same log parsed and output in JSON. By parsing the XML log, the size of the log event was reduced by approximately 30%.
{{< img src="observability_pipelines/processors/xml-side-by-side.png" alt="The XML log and the resulting parsed log in JSON" style="width:80%;" >}}
## Setup
To set up this processor:
1. Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
1. Enter the path to the log field on which you want to parse XML. Use the path notation `<OUTER_FIELD>.<INNER_FIELD>` to match subfields. See the [Path notation example](#path-notation-example-parse-xml) below.
1. Optionally, in the `Enter text key` field, input the key name to use for the text node when XML attributes are appended. See the [text key example](#text-key-example). If the field is left empty, `value` is used as the key name.
1. Optionally, select `Always use text key` if you want to store text inside an object using the text key even when no attributes exist.
1. Optionally, toggle `Include XML attributes` on if you want to include XML attributes in the parsed output. You can then choose the attribute prefix you want to use. See the [attribute prefix example](#attribute-prefix-example). If the field is left empty, the original attribute key is used.
1. Optionally, select if you want to convert data types into numbers, Booleans, or nulls. See the [data type conversion example](#data-type-conversion-example) below.
   - If **Numbers** is selected, numbers are parsed as integers and floats.
   - If **Booleans** is selected, `true` and `false` are parsed as Booleans.
   - If **Nulls** is selected, the string `null` is parsed as null.
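##### Data type conversion example

A minimal sketch, assuming **Numbers**, **Booleans**, and **Nulls** are all selected and using made-up field names. XML like this:

```xml
<order>
  <count>3</count>
  <shipped>true</shipped>
  <coupon>null</coupon>
</order>
```

would be converted to typed JSON values along these lines:

```json
{
  "order": {
    "count": 3,
    "shipped": true,
    "coupon": null
  }
}
```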
##### Path notation example {#path-notation-example-parse-xml}

For the following message structure:

```json
{
  "outer_key": {
    "inner_key": "inner_value",
    "a": {
      "double_inner_key": "double_inner_value",
      "b": "b value"
    },
    "c": "c value"
  },
  "d": "d value"
}
```

- Use `outer_key.inner_key` to see the key with the value `inner_value`.
- Use `outer_key.a.double_inner_key` to see the key with the value `double_inner_value`.
##### Always use text key example
If **Always use text key** is selected, the text key is left as the default (`value`), and you have the following XML:

```xml
<recipe type="pasta">Carbonara</recipe>
```

The XML is converted to:

```json
{
  "recipe": {
    "type": "pasta",
    "value": "Carbonara"
  }
}
```
##### Text key example
If the text key is set to `text` and you have the following XML:

```xml
<recipe type="pasta">Carbonara</recipe>
```

The XML is converted to:

```json
{
  "recipe": {
    "type": "pasta",
    "text": "Carbonara"
  }
}
```
##### Attribute prefix example
If you enable **Include XML attributes**, the prefix you set is added to each XML attribute key. For example, if the attribute prefix is `@`, the attribute `type` becomes the key `@type` in the output.
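As an illustrative sketch (following the pattern of the earlier examples rather than the original snippet), XML such as:

```xml
<recipe type="pasta">Carbonara</recipe>
```

would be converted to:

```json
{
  "recipe": {
    "@type": "pasta",
    "value": "Carbonara"
  }
}
```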
`content/en/observability_pipelines/processors/quota.md`

{{< product-availability >}}
## Overview
The quota processor measures the logging traffic for logs that match the filter you specify. When the configured daily quota is met inside the 24-hour rolling window, the processor can either keep or drop additional logs, or send them to a storage bucket. For example, you can configure this processor to drop new logs or trigger an alert without dropping logs after the processor has received 10 million events from a certain service in the last 24 hours.
You can also use field-based partitioning, for example by `service`, `env`, or `status`. Each unique field value uses a separate quota bucket with its own daily quota limit. See [Partition example](#partition-example) for more information.
**Note**: The pipeline uses the name of the quota to identify the same quota across multiple Remote Configuration deployments of the Worker.
### Limits
- Each pipeline can have up to 1000 buckets. If you need to increase the bucket limit, [contact support][5].
- The quota processor is synchronized across all Workers in a Datadog organization. For the synchronization, there is a default rate limit of 50 Workers per organization. When there are more than 50 Workers for an organization:
  - The processor continues to run, but does not sync correctly with the other Workers, which can result in logs being sent after the quota limit has been reached.
  - The Worker prints `Failed to sync quota state` errors.
  - [Contact support][5] if you want to increase the default number of Workers per organization.
- The quota processor periodically synchronizes counts across Workers a few times per minute. The limit set on the processor can therefore be overshot, depending on the number of Workers and the logs throughput. Datadog recommends setting a limit that is at least one order of magnitude higher than the volume of logs that the processor is expected to receive per minute. You can use a throttle processor with the quota processor to control these short bursts by limiting the number of logs allowed per minute.
## Setup
To set up the quota processor:
1. Enter a name for the quota processor.
1. Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are counted towards the daily limit.
   - Logs that match the quota filter and are within the daily quota are sent to the next step in the pipeline.
   - Logs that do not match the quota filter are sent to the next step of the pipeline.
1. In the **Unit for quota** dropdown menu, select if you want to measure the quota by the number of `Events` or by the `Volume` in bytes.
1. Set the daily quota limit and select the unit of magnitude for your desired quota.
1. Optional: Click **Add Field** if you want to set a quota on a specific service or region field.
1. Enter the field name you want to partition by. See the [Partition example](#partition-example) for more information.
1. Select the **Ignore when missing** option if you want the quota applied only to events that match the partition. See the [Ignore when missing example](#example-for-the-ignore-when-missing-option) for more information.
1. Optional: Click **Overrides** if you want to set different quotas for the partitioned field.
   - Click **Download as CSV** for an example of how to structure the CSV.
   - Drag and drop your overrides CSV to upload it. You can also click **Browse** to select the file.
1. Click **Add Field** if you want to add another partition.
1. In the **When quota is met** dropdown menu, select whether you want to **drop events**, **keep events**, or **send events to overflow destination** when the quota has been met.
1. If you select **send events to overflow destination**, an overflow destination is added with the following cloud storage options: **Amazon S3**, **Azure Blob**, and **Google Cloud**.
1. Select the cloud storage you want to send overflow logs to. See the setup instructions for your cloud storage: [Amazon S3][2], [Azure Blob Storage][3], or [Google Cloud Storage][4].
#### Examples
##### Partition example
Use **Partition by** if you want to set a quota on a specific service or region. For example, if you want to set a quota for 10 events per day and group the events by the `service` field, enter `service` into the **Partition by** field.
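As an illustration (the events and service names here are hypothetical), these two logs would count against two separate 10-event buckets, one for `service:checkout` and one for `service:auth`:

```
{"service":"checkout", "message": "..."}
{"service":"auth", "message": "..."}
```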
##### Example for the "ignore when missing" option
Select **Ignore when missing** if you want the quota applied only to events that match the partition. For example, if the Worker receives the following set of events:
```
{"service":"a", "source":"foo", "message": "..."}
{"service":"b", "source":"bar", "message": "..."}
{"service":"b", "message": "..."}
{"source":"redis", "message": "..."}
{"message": "..."}
```
With **Ignore when missing** selected, the Worker:
- creates a set for logs with `service:a` and `source:foo`
- creates a set for logs with `service:b` and `source:bar`
- ignores the last three events
The quota is applied to the two sets of logs and not to the last three events.
If **Ignore when missing** is not selected, the quota is applied to all five events.
##### Overrides example
If you are partitioning by `service` and have two services, `a` and `b`, you can use overrides to apply different quotas to them. For example, if you want `service:a` to have a quota limit of 5,000 bytes and `service:b` to have a limit of 50 events, you can set a separate override rule for each service.
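As an illustration (a conceptual table, not the exact CSV layout; use **Download as CSV** in the UI for the actual structure), the overrides would map each partition value to its own unit and limit:

| Partition value | Unit   | Limit |
|-----------------|--------|-------|
| `a`             | Bytes  | 5,000 |
| `b`             | Events | 50    |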
`content/en/observability_pipelines/processors/reduce.md`

{{< product-availability >}}
## Overview
The reduce processor groups multiple log events into a single log, based on the fields specified and the merge strategies selected. Logs are grouped at 10-second intervals. After the interval has elapsed for the group, the reduced log for that group is sent to the next step in the pipeline.
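As a hedged illustration (the fields and strategy here are hypothetical), grouping by `host` and applying a sum-style merge strategy to `bytes` would collapse events like these:

```
{"host":"web-1", "bytes": 120, "message": "..."}
{"host":"web-1", "bytes": 80, "message": "..."}
{"host":"web-2", "bytes": 50, "message": "..."}
```

into one reduced log per `host` value received in the 10-second window, for example:

```
{"host":"web-1", "bytes": 200, "message": "..."}
{"host":"web-2", "bytes": 50, "message": "..."}
```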
## Setup
To set up the reduce processor:
1. Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. Reduced logs and logs that do not match the filter query are sent to the next step in the pipeline.
2. In the **Group By** section, enter the field you want to group the logs by.
3. Click **Add Group by Field** to add additional fields.
4. In the **Merge Strategy** section:
   - In **On Field**, enter the name of the field you want to merge the logs on.
   - Select the merge strategy in the **Apply** dropdown menu. This is the strategy used to combine events. See the following [Merge strategies](#merge-strategies) section for descriptions of the available strategies.
   - Click **Add Merge Strategy** to add additional strategies.
##### Merge strategies
These are the available merge strategies for combining log events.