articles/iot-dps/concepts-deploy-at-scale.md
Lines changed: 38 additions & 38 deletions
Original file line number
Diff line number
Diff line change
@@ -11,7 +11,7 @@ ms.custom: template-concept
11
11
12
12
# Best practices for large-scale IoT device deployments
13
13
14
-
Scaling an IoT solution to millions of devices can be challenging. Large-scale solutions often need to be designed in accordance with service and subscription limits. When customers use Azure IoT Device Provisioning Service, they use it in combination with other Azure IoT platform services and components, such as IoT Hub and many times with an Azure IoT device SDK. This article describes best practices, patterns, and sample code you can incorporate in your design to take advantage of these services and allow your deployments to scale out. By following these simple patterns and practices right from the design phase of the project, you can maximize the performance of your IoT devices.
14
+
Scaling an IoT solution to millions of devices can be challenging. Large-scale solutions often need to be designed in accordance with service and subscription limits. Azure IoT Device Provisioning Service (DPS) is typically used in combination with other Azure IoT platform services and components, such as IoT Hub and the Azure IoT device SDKs. This article describes best practices, patterns, and sample code you can incorporate in your design to take advantage of these services and allow your deployments to scale out. Following these patterns and practices from the design phase of the project helps you maximize the performance of your IoT devices.
15
15
16
16
## First-time device provisioning
17
17
@@ -37,53 +37,30 @@ Where `<load>` is a configurable factor with values > 0 (indicates that the loa
37
37
38
38
For more information on the timing of retry operations, see [Retry timing](https://github.com/Azure/azure-sdk-for-c/blob/main/sdk/docs/iot/mqtt_state_machine.md#retry-timing).
39
39
40
-
## Hub connectivity considerations when using DPS
41
-
42
-
- If you plan to have more than a million devices, the recommended path to scaling is to cap the number of devices to 1 million per hub and add hubs as needed when increasing the scale of your deployment.
43
-
- If you have plans for more than a million devices and you need to support them in a specific region (such as in an EU region for data residency requirements), you can [contact us](../iot-fundamentals/iot-support-help.md) to ensure that the region you're deploying to has the capacity to support your current and future scale.
44
-
45
-
Recommended device logic when connecting to IoT Hub via DPS:
46
-
47
-
- On first boot, devices should go use the [DPS registration API](/rest/api/iot-dps/device/runtime-registration/register-device) to register.
48
-
- On subsequent boots, devices should:
49
-
- If possible, cache their provisioning details and connect using this information from this cache.
50
-
- If they can't cache IoT hub connection information, use the [Device Registration Status Lookup API](/rest/api/iot-dps/device/runtime-registration/device-registration-status-lookup) to return connection information once registration has been done. This API call is a much lighter weight operation for DPS than a full device registration operation.
51
-
- For devices in either case described above, devices should use the following logic in response to error codes when connecting:
52
-
- When receiving any of the 500-series of server error responses, retry the connection using either cached credentials or the results of a Device Registration Status Lookup API call.
53
-
- When receiving `401, Unauthorized` or `403, Forbidden` or `404, Not Found`, perform a full re-registration by calling the [DPS registration API](/rest/api/iot-dps/device/runtime-registration/register-device).
54
-
- At any time, devices should be capable of responding to a user-initiated reprovisioning command.
55
-
56
-
Other IoT Hub scenarios when using DPS:
57
-
58
-
- IoT Hub failover: Devices should continue to work as connection information shouldn't change and logic is in place to retry the connection once the hub is available again.
59
-
- Change of IoT Hub: Assigning devices to a different IoT Hub should be done by using a [custom allocation policy](tutorial-custom-allocation-policies.md).
60
-
- Retry IoT Hub connection: You shouldn't use an aggressive retry strategy, instead allowing a gap of at least a minute before a retry.
61
-
- IoT Hub partitions: If your device strategy leans heavily on telemetry, the number of device-to-cloud partitions should be increased.
62
-
63
40
## Reprovisioning devices
64
41
65
42
Reprovisioning is the process where a device needs to be provisioned to an IoT Hub after having been successfully connected previously. There can be many reasons for a device to need to reconnect to an IoT Hub, such as:
66
43
67
-
- A device reboot could happen due to reasons like power outage, loss in network connectivity, geo-relocation, firmware updates, factory reset, and certificate key rotation.
68
-
- The Hub instance could be unavailable due to an unplanned Hub outage.
44
+
- A device could reboot due to power outage, loss in network connectivity, geo-relocation, firmware updates, factory reset, or certificate key rotation.
45
+
- The IoT Hub instance could be unavailable due to an unplanned IoT Hub outage.
69
46
70
-
You shouldn't need to provision every time the device reboots. Upon successful reboot and provisioning, the device would be connected to the same IoT Hub in most scenarios and so fresh provisioning isn't necessary. The information about the IoT Hub that has been cached from a previous successful connection must be used to directly connect to the hub as opposed to going through the extensive reprovisioning process.
47
+
You shouldn't need to provision every time the device reboots. In most scenarios, a device that is reprovisioned ends up connected to the same IoT hub. Instead of reprovisioning, the device should first attempt to connect directly to its IoT hub using the information that was cached from a previous successful connection.
71
48
72
49
### Devices that can store a connection string
73
50
74
-
If the devices have the ability to store the connection string to the previously provisioned and connected hub, use the same string to skip the entire reprovisioning process and directly connect to the hub. This reduces the latency in successfully connecting to the appropriate hub. There are two possible cases here:
51
+
If a device can store the connection string for the IoT Hub it was previously provisioned and connected to, it can reuse that string to skip the entire reprovisioning process and connect directly to the IoT Hub. This reduces the latency in successfully connecting to the appropriate IoT Hub. There are two possible cases here:
75
52
76
-
- The IoT Hub to connect upon device reboot is the same as the previously connected hub.
53
+
- The IoT Hub to connect to upon device reboot is the same as the previously connected IoT Hub.
77
54
78
55
The connection string retrieved from the cache should work, and the device should attempt to reconnect to the same endpoint. There's no need to restart the provisioning process.
79
56
80
-
- The IoT Hub to connect upon device reboot is different from the previously connected hub.
57
+
- The IoT Hub to connect to upon device reboot is different from the previously connected IoT Hub.
81
58
82
-
The connection string stored in memory is inaccurate, attempting to connect to the same endpoint won't be successful, and then the retry mechanism for the Hub connection is triggered. Once the threshold for the hub connection failure is reached, the retry mechanism automatically triggers a fresh start to the provisioning process.
59
+
The stored connection string is out of date. Attempts to connect to the same endpoint fail, which triggers the retry mechanism for the IoT Hub connection. Once the failure threshold for the IoT Hub connection is reached, the retry mechanism automatically restarts the provisioning process.
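The two cases above can be sketched as a single connection routine. This is a minimal illustration in Python, not SDK code: `connect`, `provision`, and `max_failures` are hypothetical names standing in for the device's own connection call, its DPS provisioning call, and its failure threshold. A real device would also wait between attempts.

```python
from typing import Callable, Optional


def connect_with_cache(
    cached_conn: Optional[str],
    connect: Callable[[str], bool],   # hypothetical: returns True when the hub accepts the connection
    provision: Callable[[], str],     # hypothetical: runs DPS provisioning, returns a new connection string
    max_failures: int = 3,            # assumed failure threshold
) -> str:
    """Return the connection string that was used to connect successfully."""
    if cached_conn is not None:
        # Case 1: the hub is unchanged, so the cached string works on some attempt.
        for _ in range(max_failures):
            if connect(cached_conn):
                return cached_conn
    # Case 2 (or no cache): the threshold was reached, so start provisioning afresh.
    fresh_conn = provision()
    if not connect(fresh_conn):
        raise ConnectionError("could not connect after reprovisioning")
    return fresh_conn
```

Keeping the threshold as a parameter makes it easy to tune how long a device keeps trying a stale endpoint before it falls back to provisioning.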
83
60
84
61
### Devices that can't store a connection string
85
62
86
-
In certain scenarios, devices don't have a large enough footprint or memory to accommodate caching of the connection string from a past successful IoT Hub connection. You can use the [Device Registration Status Lookup API](/rest/api/iot-dps/device/runtime-registration/device-registration-status-lookup) to retrieve the connection string from the previous time the device was provisioned and then attempt a connection to that IoT Hub. At every device reboot, that API needs to be invoked to get the device registration status. If data related to a previously connected hub was returned by the API call, you can connect to the same hub. If the API returns a null payload, then there's no previous connection available and the reprovisioning process through DPS is automatically triggered.
63
+
In certain scenarios, devices don't have enough storage or memory to cache the connection string from a past successful IoT Hub connection. In that case, you can use the [Device Registration Status Lookup API](/rest/api/iot-dps/device/runtime-registration/device-registration-status-lookup) to retrieve the connection string from the previous time the device was provisioned and then attempt a connection to that IoT Hub. The API needs to be invoked at every device reboot to get the device registration status. If the API call returns data for a previously connected IoT Hub, the device can connect to the same IoT Hub. If the API returns a null payload, no previous connection is available and the reprovisioning process through DPS is automatically triggered.
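The boot-time decision described above can be sketched as follows. This is an illustrative Python outline under stated assumptions: `lookup_status` and `register` are hypothetical callables that wrap the Device Registration Status Lookup API and the full DPS registration, and the shape of the returned dictionary is not specified here.

```python
from typing import Callable, Optional


def resolve_assignment_on_boot(
    lookup_status: Callable[[], Optional[dict]],  # hypothetical wrapper around the status lookup call
    register: Callable[[], dict],                 # hypothetical wrapper around full DPS registration
) -> dict:
    """Choose the hub assignment for a device that keeps no local cache."""
    previous = lookup_status()  # lightweight lookup, run at every reboot
    if previous:
        # Data about a previously connected hub came back: reuse it.
        return previous
    # Null (or empty) payload: nothing is on record, so run full provisioning.
    return register()
```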
// If there was Hub info from previous provisioning in the cache, try connecting to the hub directly
160
-
// If trying to connect to the Hub returns status 429, make sure to retry operation honoring
136
+
// If there was IoT Hub info from previous provisioning in the cache, try connecting to the IoT Hub directly
137
+
// If trying to connect to the IoT Hub returns status 429, make sure to retry operation honoring
161
138
// the retry-after header
162
-
// If trying to connect to the Hub returns a 500-series server error, have an exponential backoff with
139
+
// If trying to connect to the IoT Hub returns a 500-series server error, have an exponential backoff with
163
140
// at least 5 seconds of wait-time
164
141
// For all response codes 429 and 5xx, reprovision through DPS
165
142
// Ideally, you should also support a method to manually trigger provisioning on demand
@@ -188,14 +165,37 @@ if (provisioningDetails != null)
188
165
}
189
166
```
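The wait-time rules in the comments of the sample above (honor the retry-after header on 429, back off exponentially from at least 5 seconds on 5xx) could be computed along these lines. This is a hedged Python sketch: the function name, the doubling factor, and the 300-second cap are assumptions chosen for illustration, not values from the article.

```python
from typing import Optional

MIN_BACKOFF_S = 5.0    # "at least 5 seconds of wait-time" from the sample's comments
MAX_BACKOFF_S = 300.0  # assumed upper bound so the delay can't grow without limit


def retry_delay(status: int, attempt: int, retry_after: Optional[float] = None) -> Optional[float]:
    """Return seconds to wait before retrying, or None if this rule set doesn't apply."""
    if status == 429:
        # Throttled: honor the server's retry-after value when one was sent.
        return retry_after if retry_after is not None else MIN_BACKOFF_S
    if 500 <= status <= 599:
        # Server error: double the wait on each attempt, starting at the minimum.
        return min(MAX_BACKOFF_S, MIN_BACKOFF_S * (2 ** attempt))
    return None
```

For example, `retry_delay(503, 3)` gives 40.0 seconds, and `retry_delay(429, 0, retry_after=12)` gives 12.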
190
167
168
+
## IoT Hub connectivity considerations
169
+
170
+
- Any single IoT hub is limited to 1 million devices plus modules. If you plan to have more than a million devices, cap the number of devices to 1 million per hub and add hubs as needed when increasing the scale of your deployment. For more information, see [IoT Hub quotas](../iot-hub/iot-hub-devguide-quotas-and-throttling.md).
171
+
- If you have plans for more than a million devices and you need to support them in a specific region (such as in an EU region for data residency requirements), you can [contact us](../iot-fundamentals/iot-support-help.md) to ensure that the region you're deploying to has the capacity to support your current and future scale.
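As a quick sizing aid for the 1 million per hub cap, the number of hubs can be estimated with a ceiling division. The Python helper below is a hypothetical planning sketch, not part of any Azure tooling.

```python
DEVICES_PER_HUB = 1_000_000  # per-hub cap cited above (devices plus modules)


def hubs_needed(total_devices: int, per_hub: int = DEVICES_PER_HUB) -> int:
    """Return the minimum number of hubs that keeps every hub at or under the cap."""
    if total_devices < 0 or per_hub <= 0:
        raise ValueError("total_devices must be >= 0 and per_hub must be > 0")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (total_devices + per_hub - 1) // per_hub
```

For example, a fleet of 2,500,000 devices needs at least three hubs under this cap.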
172
+
173
+
Recommended device logic when connecting to IoT Hub via DPS:
174
+
175
+
- On first boot, devices should use the [DPS registration API](/rest/api/iot-dps/device/runtime-registration/register-device) to register.
176
+
- On subsequent boots, devices should:
177
+
- If possible, cache their provisioning details and connect using the cached information.
178
+
- If they can't cache IoT hub connection information, use the [Device Registration Status Lookup API](/rest/api/iot-dps/device/runtime-registration/device-registration-status-lookup) to return connection information once registration has been done. This API call is a much lighter weight operation for DPS than a full device registration operation.
179
+
- In either case described above, devices should use the following logic in response to error codes when connecting:
180
+
- When receiving any of the 500-series of server error responses, retry the connection using either cached credentials or the results of a Device Registration Status Lookup API call.
181
+
- When receiving `401, Unauthorized`, `403, Forbidden`, or `404, Not Found`, perform a full re-registration by calling the [DPS registration API](/rest/api/iot-dps/device/runtime-registration/register-device).
182
+
- At any time, devices should be capable of responding to a user-initiated reprovisioning command.
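The error-code guidance in the list above maps naturally onto a small dispatch function. The Python sketch below is illustrative only: the `Action` enum and `action_for_status` are hypothetical names, and status codes the list doesn't mention are reported as not covered rather than guessed at.

```python
from enum import Enum


class Action(Enum):
    RETRY_CONNECTION = "retry using cached details or a status lookup result"
    REREGISTER = "perform a full re-registration through DPS"
    NOT_COVERED = "no guidance in the list above"


def action_for_status(status: int) -> Action:
    """Map an error status received while connecting to the recommended response."""
    if 500 <= status <= 599:
        return Action.RETRY_CONNECTION
    if status in (401, 403, 404):
        return Action.REREGISTER
    return Action.NOT_COVERED
```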
183
+
184
+
Other IoT Hub scenarios when using DPS:
185
+
186
+
- IoT Hub failover: Devices should continue to work because connection information shouldn't change and logic is in place to retry the connection once the hub is available again.
187
+
- Change of IoT Hub: Assigning devices to a different IoT Hub should be done by using a [custom allocation policy](tutorial-custom-allocation-policies.md).
188
+
- Retry IoT Hub connection: Don't use an aggressive retry strategy. Instead, allow a gap of at least a minute before a retry.
189
+
- IoT Hub partitions: If your device strategy leans heavily on telemetry, increase the number of device-to-cloud partitions.
190
+
191
191
## Monitoring devices
192
192
193
193
An important part of the overall deployment is monitoring the solution end-to-end to make sure that the system is performing appropriately. There are several ways to monitor the health of a service for large-scale deployment of IoT devices. The following patterns have proven effective in monitoring the service:
194
194
195
-
- Create an application to query each enrollment group on a DPS, get the total devices registered to that group, and then aggregate the numbers from across various enrollment groups. This number provides an exact count of the devices that are currently registered via a DPS and can be used to monitor the state of the service.
196
-
- Monitor device registrations over a specific period. For instance, monitor registration rates for a DPS over the prior five days. Note that this approach only provides an approximate figure and is also capped to a time period.
195
+
- Create an application to query each enrollment group on a DPS instance, get the total devices registered to that group, and then aggregate the numbers from across various enrollment groups. This number provides an exact count of the devices that are currently registered via DPS and can be used to monitor the state of the service.
196
+
- Monitor device registrations over a specific period. For instance, monitor registration rates for a DPS instance over the prior five days. Note that this approach provides only an approximate figure and is limited to the chosen time period.
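The first monitoring pattern, aggregating registration counts across enrollment groups, can be outlined as below. This Python sketch assumes a hypothetical `count_for_group` callable that returns the number of devices registered to one enrollment group; how that count is obtained from the service is left out.

```python
from typing import Callable, Dict, Iterable, Tuple


def aggregate_registrations(
    group_ids: Iterable[str],
    count_for_group: Callable[[str], int],  # hypothetical: registered-device count for one group
) -> Tuple[Dict[str, int], int]:
    """Return per-group counts and the total across all enrollment groups."""
    per_group = {group_id: count_for_group(group_id) for group_id in group_ids}
    return per_group, sum(per_group.values())
```

Keeping the per-group breakdown alongside the total makes it easier to see which enrollment group is responsible when the overall number moves unexpectedly.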
197
197
198
198
## Next steps
199
199
200
-
- [Provision devices across load-balanced IoT hubs](tutorial-provision-multiple-hubs.md)
200
+
- [Provision devices across load-balanced IoT Hubs](tutorial-provision-multiple-hubs.md)
201
201
- [Retry timing](https://github.com/Azure/azure-sdk-for-c/blob/main/sdk/docs/iot/mqtt_state_machine.md#retry-timing) when retrying operations