File: articles/iot/iot-overview-scalability-high-availability.md
Lines changed: 56 additions & 13 deletions
@@ -3,25 +3,49 @@ title: IoT solution scalability and high availability
3
3
description: An overview of the scalability, high availability, and disaster recovery options for an IoT solution.
4
4
ms.service: azure-iot
5
5
services: iot
6
-
author: dominicbetts
7
-
ms.author: dobett
6
+
author: asergaz
7
+
ms.author: sergaz
8
8
ms.topic: overview
9
-
ms.date: 06/20/2024
10
-
ms.custom: template-overview
9
+
ms.date: 03/13/2025
11
10
# Customer intent: As a solution builder, I want a high-level overview of the options for scalability, high availability, and disaster recovery in an IoT solution so that I can easily find relevant content for my scenario.
12
11
---
13
12
14
13
# IoT solution scalability, high availability, and disaster recovery
15
14
16
15
This overview introduces the key concepts around the options for scalability, high availability, and disaster recovery in an Azure IoT solution. Each section includes links to content that provides further detail and guidance.
17
16
18
-
The following diagram shows a high-level view of the components in a typical IoT solution. This article focuses on the areas relevant to scalability, high availability, and disaster recover in an IoT solution.
17
+
# [Edge-based solution](#tab/edge)
19
18
20
-
:::image type="content" source="media/iot-overview-scalability-high-availability/iot-architecture.svg" alt-text="Diagram that shows the high-level IoT solution architecture highlighting solution extensibility areas." border="false" lightbox="media/iot-overview-scalability-high-availability/iot-architecture.svg":::
19
+
The following diagram shows a high-level view of the components in a typical edge-based IoT solution. This article focuses on the areas relevant to scalability, high availability, and disaster recovery in an edge-based IoT solution:
21
20
22
-
## IoT solution scalability
21
+
<!-- Art Library Source# ConceptArt-0-000-032 -->
22
+
:::image type="content" source="media/iot-overview-scalability-high-availability/iot-edge-scalability-architecture.svg" alt-text="Diagram that shows the high-level IoT edge-based solution architecture highlighting scalability, high availability, and disaster recovery." border="false":::
23
23
24
-
An IoT solution might need to support millions of connected devices. You need to ensure that the components in your solution can scale to meet the demands.
24
+
# [Cloud-based solution](#tab/cloud)
25
+
26
+
The following diagram shows a high-level view of the components in a typical cloud-based IoT solution. This article focuses on the areas relevant to scalability, high availability, and disaster recovery in a cloud-based IoT solution:
27
+
28
+
<!-- Art Library Source# ConceptArt-0-000-032 -->
29
+
:::image type="content" source="media/iot-overview-scalability-high-availability/iot-cloud-scalability-architecture.svg" alt-text="Diagram that shows the high-level IoT cloud-based solution architecture highlighting scalability, high availability, and disaster recovery." border="false":::
30
+
31
+
---
32
+
33
+
## Scalability
34
+
35
+
An IoT solution might need to support millions of connected assets and devices. You need to ensure that the components in your solution can scale to meet the demands.
36
+
37
+
# [Edge-based solution](#tab/edge)
38
+
39
+
Deploy Azure IoT Operations on a multi-node cluster to ensure that you can handle increased traffic or workload demands. When Azure IoT Operations runs on a multi-node cluster, it can process more data and take advantage of the scalability and high-availability capabilities of Kubernetes.
40
+
41
+
You can horizontally scale the MQTT broker of Azure IoT Operations by adding more frontend replicas and backend partitions. The frontend replicas are responsible for accepting MQTT connections from clients and forwarding them to the backend partitions. The backend partitions are responsible for storing and delivering messages to the clients. The frontend pods distribute message traffic across the backend pods. The backend redundancy factor determines the number of data copies to provide resiliency against node failures in the cluster. To learn more, see [Configure broker settings for high availability, scaling, and memory usage](../iot-operations/manage-mqtt-broker/howto-configure-availability-scale.md).
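
As a rough illustration of how these settings interact, the following Python sketch computes the pod counts implied by a given set of values. The names `BrokerCardinality`, `frontend_replicas`, `backend_partitions`, and `redundancy_factor` are hypothetical labels chosen for this example, not the broker's configuration schema. The arithmetic assumes that each partition keeps `redundancy_factor` copies of its data, one per backend pod.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrokerCardinality:
    """Hypothetical container for the scaling settings described above."""

    frontend_replicas: int
    backend_partitions: int
    redundancy_factor: int

    def __post_init__(self) -> None:
        values = (self.frontend_replicas, self.backend_partitions, self.redundancy_factor)
        if min(values) < 1:
            raise ValueError("all cardinality values must be at least 1")

    def backend_pods(self) -> int:
        # Assumption: every partition holds redundancy_factor copies, one per pod.
        return self.backend_partitions * self.redundancy_factor

    def tolerated_failures_per_partition(self) -> int:
        # With N copies of the data, a partition can lose N - 1 of them.
        return self.redundancy_factor - 1


cardinality = BrokerCardinality(frontend_replicas=2, backend_partitions=3, redundancy_factor=2)
print("backend pods:", cardinality.backend_pods())
print("failures tolerated per partition:", cardinality.tolerated_failures_per_partition())
```

Raising the partition count spreads message traffic more widely, while raising the redundancy factor adds copies of the data. Both increase the number of backend pods that the cluster must schedule.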
42
+
43
+
Azure Device Registry is a backend service that enables the cloud and edge management of assets. Device Registry projects assets defined in your edge environment as Azure resources in the cloud. It provides a single unified registry so that all apps and services that interact with your assets can connect to a single source. Device Registry also manages the synchronization between assets in the cloud and assets as custom resources in Kubernetes on the edge, allowing you to scale your solution to millions of connected assets.
44
+
45
+
You can scale the data flow profile to adjust the number of instances that run the data flows. Increasing the instance count can improve the throughput of the data flows by creating multiple clients to process the data. When using data flows with cloud services that have rate limits per client, increasing the instance count can help you stay within the rate limits. Scaling can also improve the resiliency of the data flows by providing redundancy in case of failures. To learn more, see [Scaling data flow profiles](../iot-operations/connect-to-cloud/howto-configure-dataflow-profile.md).
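
To make the rate-limit reasoning concrete, the following Python sketch estimates how many instances are needed to keep each client under a per-client limit. The function name `instances_needed` and its parameters are hypothetical, and the calculation assumes that load is split evenly across instances, which a real deployment might not guarantee.

```python
import math


def instances_needed(target_msgs_per_sec: float, per_client_limit: float) -> int:
    """Return the smallest instance count that keeps each client within its limit.

    Assumes the total load is divided evenly across instances.
    """
    if target_msgs_per_sec < 0:
        raise ValueError("target_msgs_per_sec must not be negative")
    if per_client_limit <= 0:
        raise ValueError("per_client_limit must be positive")
    # At least one instance is always required, even for an idle data flow.
    return max(1, math.ceil(target_msgs_per_sec / per_client_limit))


# Example: 2,500 messages per second against a limit of 1,000 per client.
print(instances_needed(2500, 1000))
```

Treat the result as a starting point for testing, because uneven load or bursts can push an individual client over its limit.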
46
+
47
+
48
+
# [Cloud-based solution](#tab/cloud)
25
49
26
50
Use the Device Provisioning Service (DPS) to provision devices at scale. DPS is a helper service for IoT Hub and IoT Central that enables zero-touch device provisioning at scale. To learn more, see [Best practices for large-scale IoT device deployments](../iot-dps/concepts-deploy-at-scale.md).
27
51
@@ -33,7 +57,7 @@ For a guide to scalability in an IoT Central solution, see [IoT Central scalabil
33
57
34
58
For devices that connect to an IoT hub directly or to an IoT hub in an IoT Central application, make sure that the devices continue to connect as your solution scales. To learn more, see [Manage device reconnections after autoscale](./concepts-manage-device-reconnections.md) and [Handle connection failures](../iot-central/core/concepts-device-implementation.md#best-practices).
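
One common way to keep devices reconnecting without overwhelming the service is exponential backoff with jitter. The following Python sketch illustrates that general pattern only. It's an assumption for this example rather than something this article prescribes, and the names `backoff_delays`, `connect_with_retry`, and the `connect` callable are hypothetical placeholders for your device client's own connection logic.

```python
import random
import time
from typing import Callable, Iterator, Optional


def backoff_delays(
    base: float = 1.0,
    cap: float = 60.0,
    attempts: int = 6,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield one randomized delay per attempt, growing exponentially up to cap."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        # "Full jitter": pick a random delay between 0 and the current ceiling
        # so that many devices don't retry at the same moment.
        yield rng.uniform(0, ceiling)


def connect_with_retry(
    connect: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
    **backoff_options: float,
) -> bool:
    """Call connect() until it succeeds or the attempts run out."""
    for delay in backoff_delays(**backoff_options):
        if connect():
            return True
        sleep(delay)
    return False
```

Passing `sleep` as a parameter keeps the retry loop easy to test, because a test can substitute a function that records delays instead of waiting.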
35
59
36
-
IoT Edge can help to help scale your solution. IoT Edge lets you move cloud analytics and custom business logic from the cloud to your devices. This approach lets your cloud solution focus on business insights instead of data management. Scale out your IoT solution by packaging your business logic into standard containers, deploy those containers to your devices, and monitor them from the cloud. For more information, see [Azure IoT Edge](../iot-edge/about-iot-edge.md).
60
+
IoT Edge can help scale your solution. IoT Edge lets you move cloud analytics and custom business logic from the cloud to your devices. This approach lets your cloud solution focus on business insights instead of data management. Scale out your IoT solution by packaging your business logic into standard containers, deploy those containers to your devices, and monitor them from the cloud. For more information, see [Azure IoT Edge](../iot-edge/about-iot-edge.md).
- [IoT Hub Device Provisioning Service limits](../azure-resource-manager/management/azure-subscription-service-limits.md#azure-iot-hub-device-provisioning-service-limits)
50
74
75
+
---
76
+
77
+
51
78
## High availability and disaster recovery
52
79
53
80
IoT solutions are often business-critical. You need to ensure that your solution can continue to operate if a failure occurs. You also need to ensure that you can recover your solution following a disaster.
54
81
55
-
To learn more about the high availability and disaster recovery capabilities the IoT services in your solution, see the following articles:
82
+
# [Edge-based solution](#tab/edge)
83
+
84
+
Azure IoT Operations features an enterprise-grade, standards-compliant MQTT broker. The MQTT broker is scalable, highly available, and Kubernetes-native. It provides the messaging plane for IoT Operations, enables bidirectional edge/cloud communication, and powers [event-driven applications](/azure/architecture/guide/architecture-styles/event-driven) at the edge. To ensure zero data loss and high availability during deployment upgrades, the MQTT broker implements rolling updates across the MQTT broker pods.
85
+
86
+
The state store is a distributed storage system that's deployed as part of Azure IoT Operations. Using the state store, applications can get, set, and delete key-value pairs without needing to install additional services, such as Redis. The state store also provides data versioning and the primitives for building distributed locks, which makes it well suited to highly available applications. To learn more, see [Persisting data in the state store](../iot-operations/create-edge-apps/overview-state-store.md).
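
To show how versioned writes and a conditional set can serve as lock primitives, the following Python sketch uses an in-memory stand-in. It isn't the state store's API. The class `VersionedStore` and the helpers `try_acquire` and `release` are hypothetical, and the sketch runs in a single process with no thread safety or network behavior.

```python
from typing import Dict, Optional, Tuple


class VersionedStore:
    """In-memory stand-in for a key-value store that versions every write."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[bytes, int]] = {}
        self._clock = 0  # Monotonic counter used as the version number.

    def get(self, key: str) -> Optional[Tuple[bytes, int]]:
        """Return (value, version), or None if the key doesn't exist."""
        return self._data.get(key)

    def set(self, key: str, value: bytes, only_if_absent: bool = False) -> Optional[int]:
        """Write the value and return its new version, or None if the write is refused."""
        if only_if_absent and key in self._data:
            return None
        self._clock += 1
        self._data[key] = (value, self._clock)
        return self._clock

    def delete(self, key: str, expected_value: Optional[bytes] = None) -> bool:
        """Delete the key, optionally only when it still holds expected_value."""
        current = self._data.get(key)
        if current is None:
            return False
        if expected_value is not None and current[0] != expected_value:
            return False
        del self._data[key]
        return True


def try_acquire(store: VersionedStore, lock_name: str, owner: bytes) -> Optional[int]:
    # The lock is held by whoever first creates the key.
    return store.set(lock_name, owner, only_if_absent=True)


def release(store: VersionedStore, lock_name: str, owner: bytes) -> bool:
    # Only the current owner can release the lock.
    return store.delete(lock_name, expected_value=owner)
```

The version returned by a successful `try_acquire` increases with each new holder, so a downstream resource could use it to reject writes from a holder that has since lost the lock.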
- [Azure Digital Twins](../digital-twins/concepts-high-availability-disaster-recovery.md)
88
+
On multi-node clusters with at least three nodes, you have the option of enabling fault tolerance for storage with [Azure Container Storage enabled by Azure Arc](/azure/azure-arc/container-storage/overview) when you deploy Azure IoT Operations.
89
+
90
+
[Dapr is offered as part of MQTT broker](../iot-operations/create-edge-apps/howto-develop-dapr-apps.md), abstracting away details of MQTT session management, message QoS and acknowledgment, and built-in key-value stores, making it a practical choice for developing a highly available application.
91
+
92
+
For information on high availability across availability zones and regions for Azure Device Registry, see [Reliability in Azure Device Registry](../reliability/reliability-device-registry.md).
93
+
94
+
# [Cloud-based solution](#tab/cloud)
95
+
96
+
To learn more about the high availability and disaster recovery capabilities of the cloud-based IoT services in your solution, see the following articles:
97
+
98
+
- [Azure IoT Hub high availability and disaster recovery](../iot-hub/iot-hub-ha-dr.md)
99
+
- [IoT Hub Device Provisioning Service high availability and disaster recovery](../iot-dps/iot-dps-ha-dr.md)
100
+
- [Azure Digital Twins high availability and disaster recovery](../digital-twins/concepts-high-availability-disaster-recovery.md)
60
101
- [Azure IoT Central high availability and disaster recovery](../iot-central/core/concepts-architecture.md#high-availability-and-disaster-recovery)
61
102
62
103
The following tutorials and guides provide more detail and guidance:
@@ -65,3 +106,5 @@ The following tutorials and guides provide more detail and guidance:
65
106
- [How to manually migrate an Azure IoT hub to a new Azure region](../iot-hub/migrate-hub-arm.md)
66
107
- [Manage device reconnections to create resilient applications (IoT Hub and IoT Central)](./concepts-manage-device-reconnections.md)
67
108
- [IoT Central device best practices](../iot-central/core/concepts-device-implementation.md#best-practices)