| 1 | +# ADR 20: Large Message Chunking in MQTT Protocol |
| 2 | + |
| 3 | +## Status |
| 4 | +Proposed |
| 5 | + |
| 6 | +## Context |
| 7 | +MQTT message sizes are limited by broker configuration and network constraints. Azure IoT Operations scenarios often require transmitting payloads that exceed these limits (e.g., firmware updates, large telemetry batches, complex configurations). Without a standardized chunking mechanism, each application must implement its own fragmentation strategy, which leads to inconsistent implementations and interoperability issues. |
| 8 | + |
| 9 | +## Decision |
| 10 | +We will implement protocol-level message chunking as part of the MQTT protocol layer by using MQTT user properties to carry chunk metadata. This approach will make the chunking mechanism explicit in the protocol rather than hiding it in higher or lower layers. |
| 11 | + |
| 12 | +The chunking mechanism will: |
| 13 | +1. Be applied only to MQTT PUBLISH packets |
| 14 | +2. Use standardized user properties for chunk metadata: |
| 15 | + - `__ci`: Chunk index in the sequence (0-based) |
| 16 | + - `__tc`: Total number of chunks in the complete message |
| 17 | + - `__mid`: Original message identifier for reassembly |
| 18 | + - `__cs`: Optional hash/checksum of complete payload |
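
The metadata properties listed above could be encoded and parsed roughly as follows. This is a minimal sketch, not the ADR's implementation: it assumes user properties are available as a list of string pairs (MQTT 5 user properties are UTF-8 string pairs), and the names `ChunkMetadata` and `parse_chunk_metadata` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

UserProperties = List[Tuple[str, str]]


@dataclass(frozen=True)
class ChunkMetadata:
    """Hypothetical container for the four chunk properties."""
    message_id: str                 # __mid
    chunk_index: int                # __ci (0-based)
    total_chunks: int               # __tc
    checksum: Optional[str] = None  # __cs (optional)

    def to_user_properties(self) -> UserProperties:
        # User property values are strings, so integers are rendered as decimal text.
        props = [
            ("__mid", self.message_id),
            ("__ci", str(self.chunk_index)),
            ("__tc", str(self.total_chunks)),
        ]
        if self.checksum is not None:
            props.append(("__cs", self.checksum))
        return props


def parse_chunk_metadata(props: UserProperties) -> Optional[ChunkMetadata]:
    """Return the chunk metadata, or None if the message is not chunked."""
    lookup = dict(props)
    if not all(key in lookup for key in ("__mid", "__ci", "__tc")):
        return None
    meta = ChunkMetadata(
        message_id=lookup["__mid"],
        chunk_index=int(lookup["__ci"]),
        total_chunks=int(lookup["__tc"]),
        checksum=lookup.get("__cs"),
    )
    if not 0 <= meta.chunk_index < meta.total_chunks:
        raise ValueError("chunk index outside the range [0, total chunks)")
    return meta
```

Treating a message without the three required properties as unchunked keeps ordinary messages on their normal path.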
| 19 | + |
| 20 | +### Protocol Flow |
| 21 | +**Sending Process:** |
| 22 | +- When a payload exceeds the maximum packet size, the client intercepts it before transmission. |
| 23 | +- The message is split into fixed-size chunks; the last chunk may be smaller. |
| 24 | +- Each chunk is sent as a separate MQTT message on the same topic, carrying the chunk metadata described above. |
| 25 | +- User properties and other metadata set on the original PUBLISH packet are included only in the first chunk, unless the MQTT protocol requires them in every message. |
| 26 | +- The QoS level of the original message is used for every chunk. |
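
The sending steps above can be sketched as a pure splitting function. Several details here are assumptions the ADR does not fix: the function name `split_into_chunks`, the use of a random UUID for `__mid`, SHA-256 for `__cs`, and sending `__cs` only with the first chunk. Publishing each chunk on the original topic and QoS is left to the caller.

```python
import hashlib
import uuid
from typing import List, Optional, Tuple

UserProperties = List[Tuple[str, str]]


def split_into_chunks(
    payload: bytes,
    chunk_size: int,
    original_props: Optional[UserProperties] = None,
) -> List[Tuple[bytes, UserProperties]]:
    """Split a payload into (chunk body, user properties) pairs."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    # Ceiling division; an empty payload still produces one (empty) chunk.
    total = max(1, (len(payload) + chunk_size - 1) // chunk_size)
    message_id = uuid.uuid4().hex                   # assumed __mid format
    checksum = hashlib.sha256(payload).hexdigest()  # assumed __cs algorithm

    chunks = []
    for index in range(total):
        body = payload[index * chunk_size:(index + 1) * chunk_size]
        props = [
            ("__mid", message_id),
            ("__ci", str(index)),
            ("__tc", str(total)),
        ]
        if index == 0:
            # Application properties travel only with the first chunk.
            props.append(("__cs", checksum))
            props.extend(original_props or [])
        chunks.append((body, props))
    return chunks
```

For example, a 10-byte payload with `chunk_size=4` yields three chunks of 4, 4, and 2 bytes.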
| 27 | + |
| 28 | +**Receiving Process:** |
| 29 | +- The MQTT client receives messages and identifies chunked messages by the presence of chunk metadata. |
| 30 | +- Chunks are stored in a temporary buffer, indexed by message ID (`__mid`) and chunk index (`__ci`). |
| 31 | +- When all chunks for a message ID are received, they are reassembled in order. |
| 32 | +- The reconstructed message is then processed as a single message by the application callback. |
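
A receive-side buffer along these lines might look like the sketch below. The class name `ChunkReassembler` is hypothetical, and the sketch assumes every chunk of a message carries the same `__tc` value; it omits timeouts and integrity checks.

```python
from typing import Dict, Iterable, Optional, Tuple


class ChunkReassembler:
    """Buffers chunks by message ID and chunk index until a message is complete."""

    def __init__(self) -> None:
        # message ID -> {chunk index -> chunk body}
        self._buffers: Dict[str, Dict[int, bytes]] = {}

    def add(self, props: Iterable[Tuple[str, str]], body: bytes) -> Optional[bytes]:
        """Return the full payload once complete, otherwise None."""
        lookup = dict(props)
        if "__mid" not in lookup:
            return body  # no chunk metadata: deliver the message unchanged

        message_id = lookup["__mid"]
        index = int(lookup["__ci"])
        total = int(lookup["__tc"])
        if not 0 <= index < total:
            raise ValueError("chunk index outside the range [0, total chunks)")

        parts = self._buffers.setdefault(message_id, {})
        parts[index] = body  # a redelivered duplicate simply overwrites itself
        if len(parts) < total:
            return None

        # All indices 0..total-1 are present: join in order and free the buffer.
        del self._buffers[message_id]
        return b"".join(parts[i] for i in range(total))
```

Keying the inner dictionary by chunk index means chunks can arrive in any order, which matters if chunks are ever transmitted concurrently.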
| 33 | + |
| 34 | +## Consequences |
| 35 | + |
| 36 | +### Benefits |
| 37 | +- **Standards-Based:** Uses existing MQTT features rather than custom transport mechanisms |
| 38 | +- **Protocol Transparent:** Makes chunking behavior explicit in the protocol |
| 39 | +- **Property Preservation:** Maintains topic, QoS, and other message properties consistently |
| 40 | +- **Network Optimized:** Allows efficient transmission of large payloads over constrained networks |
| 41 | + |
| 42 | +### Implementation Considerations |
| 43 | +- **Error Handling:** |
| 44 | + - Chunk timeout mechanisms |
| 45 | + - Missing chunk detection |
| 46 | + - Error propagation to application code |
| 47 | +- **Performance Optimization:** |
| 48 | + - Dynamic chunk sizing based on broker limitations |
| 49 | + - Concurrent chunk transmission |
| 50 | + - Efficient memory usage during reassembly |
| 51 | +- **Security:** |
| 52 | + - Validate message integrity across chunks |
| 53 | + - Prevent chunk injection attacks |
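
Two of these considerations, chunk timeouts and integrity validation, can be illustrated with a small sketch. The names `PendingMessages` and `checksum_matches` are hypothetical, and SHA-256 is an assumed choice for `__cs`. Note that a plain checksum only detects corruption or incomplete reassembly; it does not authenticate the sender, so it does not by itself prevent chunk injection.

```python
import hashlib
import hmac
import time
from typing import Callable, Dict, List, Optional


class PendingMessages:
    """Tracks when the first chunk of each message arrived."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_seconds
        self._clock = clock
        self._first_seen: Dict[str, float] = {}

    def touch(self, message_id: str) -> None:
        # Record the arrival time of the first chunk only.
        self._first_seen.setdefault(message_id, self._clock())

    def complete(self, message_id: str) -> None:
        self._first_seen.pop(message_id, None)

    def expire(self) -> List[str]:
        """Drop and return the IDs of messages that have waited too long."""
        now = self._clock()
        expired = [mid for mid, seen in self._first_seen.items()
                   if now - seen > self._timeout]
        for mid in expired:
            del self._first_seen[mid]
        return expired


def checksum_matches(payload: bytes, expected_hex: Optional[str]) -> bool:
    """Compare the reassembled payload against __cs, if one was sent."""
    if expected_hex is None:
        return True  # __cs is optional
    actual = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(actual.encode(), expected_hex.lower().encode())
```

The IDs returned by `expire()` tell the caller which reassembly buffers to discard and which errors to surface to application code.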
| 54 | + |
| 55 | +## Open Questions |
| 56 | +1. How do we determine the optimal chunk size? Should it be based on the broker's max size, network conditions, or configurable by the application? |
| 57 | +2. Do we create a new API method (`PublishLargeAsync()`) or use the existing `PublishAsync()` API with transparent chunking for oversized payloads? |