articles/iot-operations/manage-mqtt-broker/howto-broker-diagnostics.md
+5 -7 (5 additions, 7 deletions)
@@ -16,35 +16,33 @@ ms.date: 11/14/2024
16
16
> [!IMPORTANT]
17
17
> This setting requires modifying the Broker resource and can only be configured at initial deployment time using the Azure CLI or Azure portal. A new deployment is required if Broker configuration changes are needed. To learn more, see [Customize default Broker](./overview-broker.md#customize-default-broker).
18
18
19
-
Diagnostic settings allow you to configure metrics, tracing, logging, and self-check for the MQTT broker.
19
+
Diagnostic settings allow you to configure metrics, logs, and self-check for the MQTT broker.
20
20
21
21
## Metrics
22
22
23
23
Metrics provide information about the current and past health and status of the MQTT broker. These metrics are emitted in OpenTelemetry format (OTLP). They can be converted to Prometheus format using an OpenTelemetry Collector and routed to Azure Managed Grafana Dashboards using Azure Monitor Managed Service for Prometheus. To learn more, see [Configure observability and monitoring](../configure-observability-monitoring/howto-configure-observability.md).
24
24
25
25
For the full list of metrics available, see [MQTT broker metrics](../reference/observability-metrics-mqtt-broker.md).
26
26
27
-
## Logs and traces
27
+
## Logs
28
28
29
29
Logs provide information about the operations performed by MQTT broker. These logs are available in the Kubernetes cluster as container logs. They can be configured to be sent to Azure Monitor Logs with Container Insights.
30
30
31
-
Traces are for [distributed tracing](https://opentelemetry.io/docs/concepts/signals/traces/) and provide detailed information about the requests and responses handled by MQTT broker. These traces can be sent to Azure Monitor through OpenTelemetry Collector.
32
-
33
31
To learn more, see [Configure observability and monitoring](../configure-observability-monitoring/howto-configure-observability.md).
34
32
35
33
## Self-check
36
34
37
-
The MQTT Broker's self-check mechanism is enabled by default. It uses the Diagnostics Probe and OpenTelemetry (OTel) traces to monitor the broker. The probe sends test messages to check the system's behavior and timing.
35
+
The MQTT broker's self-check mechanism is enabled by default. It uses a diagnostics probe and OpenTelemetry (OTel) traces to monitor the broker. The probe sends test messages to check the system's behavior and timing.
38
36
39
37
The validation process checks if the system works correctly by comparing the test results with expected outcomes. These outcomes include:
40
38
41
39
1. The paths messages take through the system.
42
40
2. The system's timing behavior.
43
41
44
-
The Diagnostics Probe periodically executes MQTT operations (PING, CONNECT, PUBLISH, SUBSCRIBE, UNSUBSCRIBE) on the AIO Broker and monitors the corresponding ACKs and traces to check for latency, message loss, and correctness of the replication protocol.
42
+
The diagnostics probe periodically executes MQTT operations (PING, CONNECT, PUBLISH, SUBSCRIBE, UNSUBSCRIBE) on the MQTT broker and monitors the corresponding ACKs and traces to check for latency, message loss, and correctness of the replication protocol.
45
43
46
44
> [!IMPORTANT]
47
-
> The self-check Diagnostics Probe publishes messages to the `azedge/dmqtt/selftest` topic. Don't publish or subscribe to diagnostic probe topics that start with `azedge/dmqtt/selftest`. Publishing or subscribing to these topics might affect the probe or self-test checks resulting in invalid results. Invalid results might be listed in diagnostic probe logs, metrics, or dashboards. For example, you might see the issue *Path verification failed for probe event with operation type 'Publish'* in the diagnostics-probe logs. For more information, see [Known Issues](../troubleshoot/known-issues.md#mqtt-broker).
45
+
> The self-check diagnostics probe publishes messages to the `azedge/dmqtt/selftest` topic. Don't publish or subscribe to diagnostic probe topics that start with `azedge/dmqtt/selftest`. Publishing or subscribing to these topics might affect the probe or self-test checks resulting in invalid results. Invalid results might be listed in diagnostic probe logs, metrics, or dashboards. For example, you might see the issue *Path verification failed for probe event with operation type 'Publish'* in the diagnostics-probe logs. For more information, see [Known Issues](../troubleshoot/known-issues.md#mqtt-broker).
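
The restriction in the note above can be enforced on the client side before publishing or subscribing. The following Python snippet is a minimal sketch, not part of the broker or any SDK: the function name `is_reserved_probe_topic` is hypothetical, and the check is a plain string prefix match that doesn't account for wildcard subscription filters such as `#`.

```python
# Prefix taken from the note above; topics that start with it belong to the probe.
SELFTEST_PREFIX = "azedge/dmqtt/selftest"


def is_reserved_probe_topic(topic: str) -> bool:
    """Return True if the topic starts with the self-check probe prefix.

    Hypothetical helper: a plain prefix match only. Wildcard filters
    (for example "#" or "azedge/#") are not evaluated here.
    """
    return topic.startswith(SELFTEST_PREFIX)


if __name__ == "__main__":
    for candidate in ("azedge/dmqtt/selftest/probe-1", "factory/line1/temperature"):
        verdict = "reserved" if is_reserved_probe_topic(candidate) else "ok"
        print(f"{candidate}: {verdict}")
```

An application could call a check like this before each publish or subscribe and reject reserved topics early, rather than discovering invalid probe results later in logs or dashboards.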
articles/iot-operations/reference/observability-metrics-mqtt-broker.md
+39 -4 (39 additions, 4 deletions)
@@ -6,7 +6,7 @@ ms.author: kgremban
6
6
ms.topic: reference
7
7
ms.custom:
8
8
- ignite-2023
9
-
ms.date: 11/14/2024
9
+
ms.date: 11/15/2024
10
10
11
11
# CustomerIntent: As an IT admin or operator, I want to be able to monitor and visualize data
12
12
# on the health of my industrial assets and edge environment.
@@ -26,7 +26,7 @@ For example, if the self-check probe connects with `metriccategory=broker_selfte
26
26
27
27
This feature helps dashboards show traffic sources without the high cardinality issues of tagging metrics with topics.
28
28
29
-
Sessions without a `metriccategory` are tagged as "category=uncategorized".
29
+
Sessions without a `metriccategory` are tagged as "category=uncategorized."
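
The fallback rule above can be illustrated with a short Python sketch. It assumes the session's properties are available as a key/value mapping; the function name `category_tag` and the `session_properties` parameter are hypothetical and don't reflect the broker's actual implementation.

```python
def category_tag(session_properties: dict[str, str]) -> str:
    """Build the metric tag for a session.

    Hypothetical sketch: uses the session's `metriccategory` value when it
    is present and non-empty, and falls back to "uncategorized" otherwise.
    """
    category = session_properties.get("metriccategory") or "uncategorized"
    return f"category={category}"


if __name__ == "__main__":
    print(category_tag({"metriccategory": "broker_selftest"}))
    print(category_tag({}))
```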
30
30
31
31
## Messaging metrics
32
32
@@ -49,12 +49,47 @@ All metrics include the `hostname` tag to identify the pod that generated the me
49
49
| aio_broker_store_retained_bytes | This metric counts how many bytes are stored via retained messages on the broker. ||
50
50
| aio_broker_store_will_messages | This metric counts how many will messages are stored on the broker. ||
51
51
| aio_broker_store_will_bytes | This metric counts how many bytes are stored via will messages on the broker. ||
52
+
| aio_broker_number_of_routes | Counts the number of routes. ||
53
+
| aio_broker_connect_route_replication_correctness | Describes if a connection request from a self test client is replicated correctly along a specific route. ||
54
+
| aio_broker_connect_latency_route_ms | Describes the time interval between a self test client sending a CONNECT packet and receiving a CONNACK packet. This metric is generated per route. The metric is generated only if a CONNECT is successful. ||
55
+
| aio_broker_connect_latency_last_value_ms | An estimated p99 of `connect_latency_route_ms`. ||
56
+
| aio_broker_connect_latency_mu_ms | The mean value of `connect_latency_route_ms`. ||
57
+
| aio_broker_connect_latency_sigma_ms | The standard deviation of `connect_latency_route_ms`. ||
58
+
| aio_broker_subscribe_route_replication_correctness | Describes if a subscribe request from a self test client is replicated correctly along a specific route. ||
59
+
| aio_broker_subscribe_latency_route_ms | Describes the time interval between a self test client sending a SUBSCRIBE packet and receiving a SUBACK packet. This metric is generated per route. The metric is generated only if a SUBSCRIBE is successful. ||
60
+
| aio_broker_subscribe_latency_last_value_ms | An estimated p99 of `subscribe_latency_route_ms`. ||
61
+
| aio_broker_subscribe_latency_mu_ms | The mean value of `subscribe_latency_route_ms`. ||
62
+
| aio_broker_subscribe_latency_sigma_ms | The standard deviation of `subscribe_latency_route_ms`. ||
63
+
| aio_broker_unsubscribe_route_replication_correctness | Describes if an unsubscribe request from a self test client is replicated correctly along a specific route. ||
64
+
| aio_broker_unsubscribe_latency_route_ms | Describes the time interval between a self test client sending an UNSUBSCRIBE packet and receiving an UNSUBACK packet. This metric is generated per route. The metric is generated only if an UNSUBSCRIBE is successful. ||
65
+
| aio_broker_unsubscribe_latency_last_value_ms | An estimated p99 of `unsubscribe_latency_route_ms`. ||
66
+
| aio_broker_unsubscribe_latency_mu_ms | The mean value of `unsubscribe_latency_route_ms`. ||
67
+
| aio_broker_unsubscribe_latency_sigma_ms | The standard deviation of `unsubscribe_latency_route_ms`. ||
68
+
| aio_broker_publish_route_replication_correctness | Describes if a publish request from a self test client is replicated correctly along a specific route. ||
69
+
| aio_broker_publish_latency_route_ms | Describes the time interval between a self test client sending a PUBLISH packet and receiving a PUBACK packet. This metric is generated per route. The metric is generated only if a PUBLISH is successful. ||
70
+
| aio_broker_publish_latency_last_value_ms | An estimated p99 of `publish_latency_route_ms`. ||
71
+
| aio_broker_publish_latency_mu_ms | The mean value of `publish_latency_route_ms`. ||
72
+
| aio_broker_publish_latency_sigma_ms | The standard deviation of `publish_latency_route_ms`. ||
73
+
| aio_broker_payload_check_latency_last_value_ms | An estimated p99 of the payload check latency. ||
74
+
| aio_broker_payload_check_latency_mu_ms | The mean value of the payload check latency. ||
75
+
| aio_broker_payload_check_latency_sigma_ms | The standard deviation of the payload check latency. ||
76
+
| aio_broker_payload_check_total_messages_lost | The total number of messages lost during the payload check. ||
77
+
| aio_broker_payload_check_total_messages_received | The total number of messages received during the payload check. ||
78
+
| aio_broker_payload_check_total_messages_sent | The total number of messages sent during the payload check. ||
79
+
| aio_broker_ping_correctness | Describes whether the ping from the self-test client works correctly. ||
80
+
| aio_broker_ping_latency_last_value_ms | An estimated p99 of the ping latency. ||
81
+
| aio_broker_ping_latency_mu_ms | The mean value of the ping latency. ||
82
+
| aio_broker_ping_latency_route_ms | The ping latency in milliseconds for a specific route. ||
83
+
| aio_broker_ping_latency_sigma_ms | The standard deviation of the ping latency. ||
84
+
| aio_broker_publishes_processed_count | Counts the number of published messages processed. ||
85
+
| aio_broker_publishes_received_per_second | Counts the number of published messages received per second. ||
86
+
| aio_broker_publishes_sent_per_second | Counts the number of published messages sent per second. ||
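
Several rows above describe a mean (`_mu_ms`), a standard deviation (`_sigma_ms`), and an estimated p99 derived from per-route latency samples. The following Python sketch shows one way such summary values relate to raw samples. It's an illustration under stated assumptions: the function name `summarize_latency` is hypothetical, the p99 uses the nearest-rank method, and the standard deviation is the population form. The broker's actual estimation method isn't described here and might differ.

```python
import math
import statistics


def summarize_latency(samples_ms: list[float]) -> dict[str, float]:
    """Summarize latency samples (in milliseconds) as mean, sigma, and p99.

    Hypothetical sketch: nearest-rank p99 and population standard deviation
    are assumptions, not the broker's documented method.
    """
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    # Nearest-rank percentile: the smallest sample with at least 99% of
    # all samples at or below it.
    rank = math.ceil(0.99 * len(ordered))
    return {
        "mu_ms": statistics.fmean(ordered),
        "sigma_ms": statistics.pstdev(ordered),
        "p99_ms": ordered[rank - 1],
    }


if __name__ == "__main__":
    print(summarize_latency([10.0, 12.0, 14.0]))
```

With only a few samples, the nearest-rank p99 is simply the largest sample, so it becomes meaningful only once many probe results have accumulated.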
52
87
53
88
## Broker operator health metrics
54
89
55
90
This set of metrics tracks the [cardinality state of the broker](../manage-mqtt-broker/howto-configure-availability-scale.md). Each desired metric is paired with a reported metric to show the current state. These metrics indicate the number of healthy pods from the broker's perspective, which might differ from Kubernetes' reports.
56
91
57
-
For example, if a backend node restarts but doesn't reconnect to its chain, there can be a discrepancy in health reports. Kubernetes might report the pod as healthy, while the broker reports it as down because it is not functioning properly.
92
+
For example, if a backend node restarts but doesn't reconnect to its chain, there can be a discrepancy in health reports. Kubernetes might report the pod as healthy, while the broker reports it as down because it isn't functioning properly.
58
93
59
94
| Desired Metric | Reported Metric |
60
95
|----------------|-----------------|
@@ -88,7 +123,7 @@ This set of metrics tracks the overall state of the [state store](../create-edge
88
123
| aio_broker_state_store_insertions | This metric counts the number of new key insert requests received, including both successful insertions and errors. ||
89
124
| aio_broker_state_store_keynotify_requests | This metric counts the number of requests to monitor key changes (KEYNOTIFY) received, including both successful modifications and errors. ||
90
125
| aio_broker_state_store_modifications | This metric counts the number of modify key requests received, including both successful modifications and errors. ||
91
-
| aio_broker_state_store_notifications_sent | This metric counts the number of notification messages the state store sends when a key value changes and a client is registered via KEYNOTIFY. ||
126
+
| aio_broker_state_store_notifications_sent | This metric counts the number of notification messages the state store sends when a key value changes and a client is registered via KEYNOTIFY. ||
92
127
| aio_broker_state_store_retrievals | This metric counts the number of key value retrieval requests received, including both successful retrievals and errors. ||