Commit d8e55a8

Merge pull request #46241 from yzhong94/master
Add reliability doc for IoT Hub SDK
2 parents 870ff8c + 7d346b5 commit d8e55a8

2 files changed

+147
-0
lines changed

articles/iot-hub/TOC.md

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,44 @@
6767
### [X.509 CA certificate security concepts](iot-hub-x509ca-concept.md)
6868

6969
# How-to guides
70+
## Plan
71+
### [Compare IoT Hub and Event Hubs](iot-hub-compare-event-hubs.md)
72+
### [Choose the right tier](iot-hub-scaling.md)
73+
### [High availability and disaster recovery](iot-hub-ha-dr.md)
74+
### [Supporting additional protocols](iot-hub-protocol-gateway.md)
75+
### [Compare message and event routing](iot-hub-event-grid-routing-comparison.md)
76+
## [Develop](iot-hub-how-to.md)
77+
### [Developer guide](iot-hub-devguide.md)
78+
#### [Device-to-cloud feature guide](iot-hub-devguide-d2c-guidance.md)
79+
#### [Cloud-to-device feature guide](iot-hub-devguide-c2d-guidance.md)
80+
#### [Send and receive messages](iot-hub-devguide-messaging.md)
81+
##### [Send device-to-cloud messages to IoT Hub](iot-hub-devguide-messages-d2c.md)
82+
##### [Read device-to-cloud messages from the built-in endpoint](iot-hub-devguide-messages-read-builtin.md)
83+
##### [React to IoT Hub events](iot-hub-event-grid.md)
84+
##### [Use custom endpoints and routing rules for device-to-cloud messages](iot-hub-devguide-messages-read-custom.md)
85+
##### [Send cloud-to-device messages from IoT Hub](iot-hub-devguide-messages-c2d.md)
86+
##### [Create and read IoT Hub messages](iot-hub-devguide-messages-construct.md)
87+
##### [Choose a communication protocol](iot-hub-devguide-protocols.md)
88+
#### [Upload files from a device](iot-hub-devguide-file-upload.md)
89+
#### [Manage device identities](iot-hub-devguide-identity-registry.md)
90+
#### [Control access to IoT Hub](iot-hub-devguide-security.md)
91+
#### [Understand device twins](iot-hub-devguide-device-twins.md)
92+
#### [Understand module twins](iot-hub-devguide-module-twins.md)
93+
#### [Invoke direct methods on a device](iot-hub-devguide-direct-methods.md)
94+
#### [Schedule jobs on multiple devices](iot-hub-devguide-jobs.md)
95+
#### [IoT Hub endpoints](iot-hub-devguide-endpoints.md)
96+
#### [Query language](iot-hub-devguide-query-language.md)
97+
#### [Quotas and throttling](iot-hub-devguide-quotas-throttling.md)
98+
#### [Pricing examples](iot-hub-devguide-pricing.md)
99+
#### [MQTT support](iot-hub-mqtt-support.md)
100+
#### [Glossary](iot-hub-devguide-glossary.md)
101+
### [Use device and service SDKs](iot-hub-devguide-sdks.md)
102+
#### [Use the IoT device SDK for C](iot-hub-device-sdk-c-intro.md)
103+
##### [Use the IoTHubClient](iot-hub-device-sdk-c-iothubclient.md)
104+
##### [Use the serializer](iot-hub-device-sdk-c-serializer.md)
105+
#### [Develop for constrained devices](iot-hub-devguide-develop-for-constrained-devices.md)
106+
#### [Develop for mobile devices](iot-hub-how-to-develop-for-mobile-devices.md)
107+
#### [Manage connectivity and reliable messaging](iot-hub-reliability-features-in-sdks.md)
70108

71109
## Develop
72110
### [Use the IoT device SDK for C](iot-hub-device-sdk-c-intro.md)
Lines changed: 109 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
1+
---
2+
title: How to manage connectivity and reliable messaging using Azure IoT Hub device SDKs
3+
description: Learn how to improve your device connectivity and messaging when using the Azure IoT Hub device SDKs
4+
services: iot-hub
5+
keywords:
6+
author: yzhong94
7+
ms.author: yizhon
8+
ms.date: 07/07/2018
9+
ms.topic: article
10+
ms.service: iot-hub
11+
12+
documentationcenter: ''
13+
manager: timlt
14+
ms.devlang: na
15+
ms.custom: mvc
16+
---
17+
18+
# How to manage connectivity and reliable messaging using Azure IoT Hub device SDKs
19+
20+
This guide provides high-level guidance for designing resilient device applications by taking advantage of the connectivity and reliable messaging features in the Azure IoT device SDKs. The goal of this article is to help you handle the following scenarios:
21+
22+
- managing a dropped network connection
23+
- managing switching between different network connections
24+
- managing reconnection due to transient service connection errors
25+
26+
Implementation details may vary by language. See the linked API documentation or the specific SDK for more details.
27+
28+
- [C/Python/iOS SDK](https://github.com/azure/azure-iot-sdk-c)
29+
- [.NET SDK](https://github.com/Azure/azure-iot-sdk-csharp/blob/master/iothub/device/devdoc/requirements/retrypolicy.md)
30+
- [Java SDK](https://github.com/Azure/azure-iot-sdk-java/blob/master/device/iot-device-client/devdoc/requirement_docs/com/microsoft/azure/iothub/retryPolicy.md)
31+
- [Node SDK](https://github.com/Azure/azure-iot-sdk-node/wiki/Connectivity-and-Retries#types-of-errors-and-how-to-detect-them)
32+
33+
34+
## Designing for resiliency
35+
36+
IoT devices often rely on non-continuous or unstable network connections such as GSM or satellite. In addition, when interacting with cloud-based services, errors can occur due to temporary conditions such as intermittent service availability and infrastructure-level faults (commonly referred to as transient faults). An application running on a device needs to manage the connection and reconnection mechanisms, as well as the retry logic for sending and receiving messages. Furthermore, the retry strategy requirements depend heavily on the IoT scenario the device participates in, and on the device’s context and capabilities.
37+
38+
The Azure IoT Hub device SDKs aim to simplify cloud-to-device and device-to-cloud communication by providing a robust and comprehensive way of connecting to Azure IoT Hub and sending and receiving messages. Developers can also modify the existing implementation to develop the right retry strategy for a given scenario.
39+
40+
The relevant SDK features that support connectivity and reliable messaging are covered in the following sections.
41+
42+
## Connection and retry
43+
44+
This section provides an overview of the reconnection and retry patterns available when managing connections, implementation guidance for using different retry policies in your device application, and the relevant APIs for the device SDKs.
45+
46+
### Error patterns
47+
Connection failures can happen at many levels:
48+
49+
- Network errors such as a disconnected socket and name resolution errors
50+
- Protocol-level errors for the HTTP, AMQP, and MQTT transports, such as detached links or expired sessions
51+
- Application-level errors that result from either local mistakes such as invalid credentials or service behavior such as exceeding quota or throttling
52+
53+
The device SDKs detect errors at all three levels. OS-related errors and hardware errors are not detected or handled by the device SDKs. The design is based on [The Transient Fault Handling Guidance](https://docs.microsoft.com/azure/architecture/best-practices/transient-faults#general-guidelines) from Azure Architecture Center.
54+
55+
### Retry patterns
56+
57+
The overall process for retry when connection errors are detected is:
58+
1. The SDK detects the error and the level at which it occurred: network, protocol, or application.
59+
2. Based on the error type, the SDK uses the error filter to decide whether a retry needs to be performed. If the SDK identifies an **unrecoverable error**, operations (connection and send/receive) are stopped and the SDK notifies the user. An unrecoverable error is one that the SDK can identify and determine cannot be recovered from, for example, an authentication or bad endpoint error.
60+
3. If a **recoverable error** is identified, the SDK begins to retry using the retry policy specified until a defined timeout expires.
61+
4. When the defined timeout expires, the SDK stops trying to connect or send, and notifies the user.
62+
5. The SDK allows the user to attach a callback to receive connection status changes.
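
The steps above can be sketched as a small, language-neutral retry loop. This is an illustrative sketch only, not the SDKs' actual implementation: the names `run_with_retry`, `RecoverableError`, `UnrecoverableError`, and the status strings are hypothetical, and the clock and sleep functions are injectable so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


class UnrecoverableError(Exception):
    """Hypothetical error type: retrying cannot help (e.g. invalid credentials)."""


class RecoverableError(Exception):
    """Hypothetical error type: a transient fault that may succeed on retry."""


def run_with_retry(
    operation: Callable[[], object],
    next_delay: Callable[[int], float],
    timeout_s: float,
    on_status: Callable[[str], None],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
):
    """Run `operation`, retrying recoverable errors until `timeout_s` elapses."""
    deadline = clock() + timeout_s
    attempt = 0
    while True:
        try:
            result = operation()
            on_status("connected")  # step 5: report status changes via callback
            return result
        except UnrecoverableError:
            # Step 2: stop immediately and notify the user.
            on_status("disconnected: unrecoverable error")
            raise
        except RecoverableError:
            # Step 3: ask the retry policy how long to wait before retrying.
            attempt += 1
            delay = next_delay(attempt)
            if clock() + delay > deadline:
                # Step 4: the defined timeout would expire, so give up.
                on_status("disconnected: retry timeout expired")
                raise
            on_status("retrying")
            sleep(delay)
```

A caller would pass its send or connect function as `operation` and a delay function derived from the chosen retry policy as `next_delay`.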
63+
64+
Three retry policies are provided:
65+
- **Exponential back-off with jitter**: This is the default retry policy applied. It tends to be aggressive at the start, slows down, and then hits a maximum delay that is not exceeded. The design is based on [Retry guidance from Azure Architecture Center](https://docs.microsoft.com/azure/architecture/best-practices/retry-service-specific).
66+
- **Custom retry**: For some languages, you can implement a custom retry policy suited to your scenario and inject it through the SDK's retry policy API. This is not available in the C SDK.
67+
- **No retry**: There is an option to set retry policy to "no retry," which disables the retry logic. The SDK tries to connect once and send a message once, assuming the connection is established. This policy would typically be used in cases where there are bandwidth or cost concerns. If this option is chosen, messages that fail to send are lost and cannot be recovered.
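
To make the three policies concrete, here is a minimal sketch of how they might be modeled behind one interface. It is an assumption-laden illustration rather than any SDK's code: the class names mirror the policy names above, but `next_delay`, `FixedInterval`, the 25% jitter factor, and the default values are all hypothetical choices.

```python
import random
from typing import Optional, Protocol


class RetryPolicy(Protocol):
    def next_delay(self, attempt: int) -> Optional[float]:
        """Seconds to wait before retry number `attempt` (1-based), or None to stop."""


class ExponentialBackoffWithJitter:
    """Short delays at first, then longer ones, never exceeding `max_s`."""

    def __init__(self, base_s=0.1, max_s=10.0, jitter=0.25, rng=None):
        self.base_s = base_s
        self.max_s = max_s
        self.jitter = jitter  # hypothetical: +/- 25% randomization
        self.rng = rng or random.Random()

    def next_delay(self, attempt):
        # Double the delay on each attempt; cap the exponent to avoid overflow.
        exponent = min(attempt - 1, 30)
        capped = min(self.max_s, self.base_s * 2 ** exponent)
        # Randomize so that many devices do not retry at the same instant.
        factor = 1 + self.rng.uniform(-self.jitter, self.jitter)
        return min(self.max_s, capped * factor)


class NoRetry:
    """Try once; never retry."""

    def next_delay(self, attempt):
        return None


class FixedInterval:
    """Example of a custom policy: constant delay, limited number of retries."""

    def __init__(self, interval_s, max_attempts):
        self.interval_s = interval_s
        self.max_attempts = max_attempts

    def next_delay(self, attempt):
        return self.interval_s if attempt <= self.max_attempts else None
```

Returning `None` to mean "stop retrying" is a design choice of this sketch; the real SDKs express the same decision through their own retry policy interfaces listed in the table below.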
68+
69+
### Retry policy APIs
70+
71+
| SDK | SetRetryPolicy method | Policy implementations | Implementation guidance |
72+
|-----|----------------------|--|--|
73+
| C/Python/iOS | [IOTHUB_CLIENT_RESULT IoTHubClient_SetRetryPolicy](https://github.com/Azure/azure-iot-sdk-c/blob/2018-05-04/iothub_client/inc/iothub_client.h#L188) | **Default**: [IOTHUB_CLIENT_RETRY_EXPONENTIAL_BACKOFF](https://github.com/Azure/azure-iot-sdk-c/blob/master/doc/connection_and_messaging_reliability.md#connection-retry-policies)<BR>**Custom:** use available [retryPolicy](https://github.com/Azure/azure-iot-sdk-c/blob/master/doc/connection_and_messaging_reliability.md#connection-retry-policies)<BR>**No retry:** [IOTHUB_CLIENT_RETRY_NONE](https://github.com/Azure/azure-iot-sdk-c/blob/master/doc/connection_and_messaging_reliability.md#connection-retry-policies) | [C/Python/iOS implementation](https://github.com/Azure/azure-iot-sdk-c/blob/master/doc/connection_and_messaging_reliability.md#) |
74+
| Java| [SetRetryPolicy](https://docs.microsoft.com/en-us/java/api/com.microsoft.azure.sdk.iot.device._device_client_config.setretrypolicy?view=azure-java-stable) | **Default**: [ExponentialBackoffWithJitter class](https://github.com/Azure/azure-iot-sdk-java/blob/master/device/iot-device-client/src/main/java/com/microsoft/azure/sdk/iot/device/transport/ExponentialBackoffWithJitter.java)<BR>**Custom:** implement [RetryPolicy interface](https://github.com/Azure/azure-iot-sdk-java/blob/master/device/iot-device-client/src/main/java/com/microsoft/azure/sdk/iot/device/transport/RetryPolicy.java)<BR>**No retry:** [NoRetry class](https://github.com/Azure/azure-iot-sdk-java/blob/master/device/iot-device-client/src/main/java/com/microsoft/azure/sdk/iot/device/transport/NoRetry.java) | [Java implementation](https://github.com/Azure/azure-iot-sdk-java/blob/master/device/iot-device-client/devdoc/requirement_docs/com/microsoft/azure/iothub/retryPolicy.md) |
75+
| .NET| [DeviceClient.SetRetryPolicy](/dotnet/api/microsoft.azure.devices.client.deviceclient.setretrypolicy?view=azure-dotnet#Microsoft_Azure_Devices_Client_DeviceClient_SetRetryPolicy_Microsoft_Azure_Devices_Client_IRetryPolicy) | **Default**: [ExponentialBackoff class](/dotnet/api/microsoft.azure.devices.client.exponentialbackoff?view=azure-dotnet)<BR>**Custom:** implement [IRetryPolicy interface](https://docs.microsoft.com/dotnet/api/microsoft.azure.devices.client.iretrypolicy?view=azure-dotnet)<BR>**No retry:** [NoRetry class](/dotnet/api/microsoft.azure.devices.client.noretry?view=azure-dotnet) | [C# implementation](https://github.com/Azure/azure-iot-sdk-csharp/blob/master/iothub/device/devdoc/requirements/retrypolicy.md) |
76+
| Node| [setRetryPolicy](/javascript/api/azure-iot-device/client?view=azure-iot-typescript-latest#azure_iot_device_Client_setRetryPolicy) | **Default**: [ExponentialBackoffWithJitter class](/javascript/api/azure-iot-common/exponentialbackoffwithjitter?view=azure-iot-typescript-latest)<BR>**Custom:** implement [RetryPolicy interface](/javascript/api/azure-iot-common/retrypolicy?view=azure-iot-typescript-latest)<BR>**No retry:** [NoRetry class](/javascript/api/azure-iot-common/noretry?view=azure-iot-typescript-latest) | [Node implementation](https://github.com/Azure/azure-iot-sdk-node/wiki/Connectivity-and-Retries#types-of-errors-and-how-to-detect-them) |
77+
78+
79+
Below are code samples that illustrate this flow.
80+
81+
#### .NET implementation guidance
82+
83+
The code sample below shows how to define and set the default retry policy:
84+
85+
```csharp
// Define and set the default retry policy.
// `deviceClient` is assumed to be an existing DeviceClient instance.
IRetryPolicy retryPolicy = new ExponentialBackoff(int.MaxValue, TimeSpan.FromMilliseconds(100), TimeSpan.FromSeconds(10), TimeSpan.FromMilliseconds(100));
deviceClient.SetRetryPolicy(retryPolicy);
```
90+
91+
To avoid high CPU usage, the retries are throttled if the code fails immediately (for example, when there is no network or route to destination) so that the minimum time to execute the next retry is 1 second.
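
One way to read this throttling rule is as a floor on the time between the start of one attempt and the start of the next. The sketch below shows that reading under stated assumptions: `effective_delay` and its parameters are hypothetical names, and only the 1-second minimum comes from the text above.

```python
MIN_RETRY_INTERVAL_S = 1.0  # minimum time before the next retry, per the text above


def effective_delay(policy_delay_s: float, attempt_duration_s: float,
                    min_interval_s: float = MIN_RETRY_INTERVAL_S) -> float:
    """Return how long to wait before the next retry.

    If the attempt failed almost immediately (for example, no network),
    pad the wait so that at least `min_interval_s` passes between attempts.
    """
    remaining = max(0.0, min_interval_s - attempt_duration_s)
    return max(policy_delay_s, remaining)
```

With a 100 ms policy delay and an attempt that fails instantly, this returns a full second. If the failed attempt itself took longer than a second, the policy delay is used unchanged.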
92+
93+
If the service is responding with a throttling error, the retry policy is different and cannot be changed via public API:
94+
95+
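```csharp
// Retry policy the SDK applies internally when the service returns a throttling error.
// Shown for reference only; it cannot be changed via the public API.
IRetryPolicy retryPolicy = new ExponentialBackoff(RetryCount, TimeSpan.FromSeconds(10), TimeSpan.FromSeconds(60), TimeSpan.FromSeconds(5));
SetRetryPolicy(retryPolicy);
```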
100+
101+
The retry mechanism will stop after `DefaultOperationTimeoutInMilliseconds`, which is currently set at 4 minutes.
102+
103+
#### Other languages implementation guidance
104+
For other languages, review the implementation documentation below. Samples demonstrating the use of retry policy APIs are provided in the repository.
105+
- [C/Python/iOS SDK](https://github.com/azure/azure-iot-sdk-c)
106+
- [.NET SDK](https://github.com/Azure/azure-iot-sdk-csharp/blob/master/iothub/device/devdoc/requirements/retrypolicy.md)
107+
- [Java SDK](https://github.com/Azure/azure-iot-sdk-java/blob/master/device/iot-device-client/devdoc/requirement_docs/com/microsoft/azure/iothub/retryPolicy.md)
108+
- [Node SDK](https://github.com/Azure/azure-iot-sdk-node/wiki/Connectivity-and-Retries#types-of-errors-and-how-to-detect-them)
109+
