Commit c08a121

Add ADR for large message chunking in MQTT protocol
1 parent 9a44deb commit c08a121

1 file changed

+57
-0
lines changed

@@ -0,0 +1,57 @@
# ADR 20: Large Message Chunking in MQTT Protocol

## Status
Proposed

## Context
MQTT messages are subject to size limits imposed by brokers and network constraints. Azure IoT Operations scenarios often require transmitting payloads that exceed these limits (e.g., firmware updates, large telemetry batches, complex configurations). Without a standardized chunking mechanism, each application must implement its own fragmentation strategy, which leads to inconsistent implementations and interoperability issues.

## Decision
We will implement message chunking in the MQTT protocol layer, using MQTT 5 user properties to carry chunk metadata. This makes the chunking mechanism explicit in the protocol rather than hiding it in higher or lower layers.

The chunking mechanism will:
1. Be applied only to MQTT PUBLISH packets
2. Use standardized user properties for chunk metadata:
   - `__ci`: Chunk index in the sequence (0-based)
   - `__tc`: Total number of chunks in the complete message
   - `__mid`: Original message identifier for reassembly
   - `__cs`: Optional hash/checksum of complete payload
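
As a rough illustration of the metadata listed above, the following Python sketch builds the user properties for one chunk. MQTT user properties are UTF-8 string pairs, so the numeric fields have to be serialized somehow; encoding them as decimal strings is an assumption here, as is the helper name `chunk_user_properties`. The ADR does not say which chunks carry `__cs`, so the sketch simply adds it whenever a checksum is passed in.

```python
from typing import List, Optional, Tuple

# An MQTT 5 user property is a (name, value) pair of UTF-8 strings.
UserProperty = Tuple[str, str]


def chunk_user_properties(
    message_id: str,
    chunk_index: int,
    total_chunks: int,
    checksum: Optional[str] = None,
) -> List[UserProperty]:
    """Build the chunk metadata for one chunk (hypothetical helper).

    Assumption: integers are encoded as decimal strings.
    """
    if total_chunks < 1:
        raise ValueError("total_chunks must be at least 1")
    if not 0 <= chunk_index < total_chunks:
        raise ValueError("chunk_index must be in [0, total_chunks)")

    props: List[UserProperty] = [
        ("__ci", str(chunk_index)),   # 0-based position in the sequence
        ("__tc", str(total_chunks)),  # total number of chunks
        ("__mid", message_id),        # identifier shared by all chunks
    ]
    if checksum is not None:
        props.append(("__cs", checksum))  # optional hash of the full payload
    return props
```

For example, `chunk_user_properties("m1", 0, 3)` yields the three mandatory properties and omits `__cs`.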

### Protocol Flow
**Sending Process:**
- When a payload exceeds the maximum packet size, the client intercepts it before transmission.
- The message is split into fixed-size chunks; the last chunk may be smaller.
- Each chunk is sent as a separate MQTT message on the same topic, carrying the chunk metadata.
- User properties and other metadata set on the original PUBLISH packet that MQTT does not require in every message are included only in the first chunk.
- QoS settings are maintained across all chunks.
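
The sending steps above can be sketched in Python as follows. This is a minimal illustration, not the proposed implementation: `OutgoingPublish` and `split_into_chunks` are hypothetical names, `chunk_size` is treated as a payload size that the caller has already derived from the packet limit, the message ID is a random UUID, and `__cs` is a SHA-256 hex digest attached to every chunk. None of those choices are fixed by the ADR.

```python
import hashlib
import uuid
from dataclasses import dataclass
from typing import List, Sequence, Tuple

UserProperty = Tuple[str, str]


@dataclass
class OutgoingPublish:
    """Hypothetical stand-in for one PUBLISH packet handed to the client."""
    topic: str
    qos: int
    payload: bytes
    user_properties: List[UserProperty]


def split_into_chunks(
    topic: str,
    payload: bytes,
    qos: int,
    chunk_size: int,
    user_properties: Sequence[UserProperty] = (),
) -> List[OutgoingPublish]:
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(payload) <= chunk_size:
        # Small enough: send as a single, unchunked message.
        return [OutgoingPublish(topic, qos, payload, list(user_properties))]

    message_id = uuid.uuid4().hex                    # assumption: random ID
    checksum = hashlib.sha256(payload).hexdigest()   # assumption: SHA-256
    total = (len(payload) + chunk_size - 1) // chunk_size  # ceiling division

    chunks: List[OutgoingPublish] = []
    for index in range(total):
        # Fixed-size slices; the final slice may be shorter.
        part = payload[index * chunk_size:(index + 1) * chunk_size]
        props: List[UserProperty] = [
            ("__ci", str(index)),
            ("__tc", str(total)),
            ("__mid", message_id),
            ("__cs", checksum),
        ]
        if index == 0:
            # Application-level user properties travel only on the first chunk.
            props.extend(user_properties)
        # Same topic and QoS for every chunk.
        chunks.append(OutgoingPublish(topic, qos, part, props))
    return chunks
```

A 10-byte payload with `chunk_size=4` produces three chunks of 4, 4, and 2 bytes that share one `__mid`.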

**Receiving Process:**
- The MQTT client receives messages and identifies chunked messages by the presence of chunk metadata.
- Chunks are stored in a temporary buffer, indexed by message ID (`__mid`) and chunk index (`__ci`).
- When all chunks for a message ID are received, they are reassembled in order.
- The reconstructed message is then processed as a single message by the application callback.
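
A minimal Python sketch of the receiving steps is shown below. `ChunkReassembler` is a hypothetical class, and the sketch assumes the numeric metadata arrives as decimal strings. It omits timeouts, duplicate handling policy, and checksum verification, all of which a real implementation would need.

```python
from typing import Dict, Iterable, Optional, Tuple

UserProperty = Tuple[str, str]


class ChunkReassembler:
    """Buffers chunks by message ID and chunk index (illustrative only)."""

    def __init__(self) -> None:
        # message ID -> {chunk index -> chunk payload}
        self._buffers: Dict[str, Dict[int, bytes]] = {}

    def on_message(
        self, payload: bytes, user_properties: Iterable[UserProperty]
    ) -> Optional[bytes]:
        """Return a complete payload, or None while chunks are still missing."""
        props = dict(user_properties)
        if not {"__mid", "__ci", "__tc"} <= props.keys():
            # No chunk metadata: an ordinary message, pass it through.
            return payload

        message_id = props["__mid"]
        index = int(props["__ci"])
        total = int(props["__tc"])
        if not 0 <= index < total:
            raise ValueError("chunk index out of range")

        parts = self._buffers.setdefault(message_id, {})
        parts[index] = payload  # a repeated chunk overwrites the earlier copy
        if len(parts) < total:
            return None

        # All chunks present: join them in index order and free the buffer.
        del self._buffers[message_id]
        return b"".join(parts[i] for i in range(total))
```

Because chunks are keyed by `__ci`, the sketch tolerates out-of-order arrival: feeding chunks 2, 0, 1 of a three-chunk message returns `None` twice and then the full payload.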

## Consequences

### Benefits
- **Standards-Based:** Uses existing MQTT features rather than custom transport mechanisms
- **Protocol Transparent:** Makes chunking behavior explicit in the protocol
- **Property Preservation:** Maintains topic, QoS, and other message properties consistently
- **Network Optimized:** Allows efficient transmission of large payloads over constrained networks

### Implementation Considerations
- **Error Handling:**
  - Chunk timeout mechanisms
  - Missing chunk detection
  - Error propagation to application code
- **Performance Optimization:**
  - Dynamic chunk sizing based on broker limitations
  - Concurrent chunk transmission
  - Efficient memory usage during reassembly
- **Security:**
  - Validate message integrity across chunks
  - Prevent chunk injection attacks
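
Two of the considerations above, chunk timeouts and integrity validation, might look roughly like the following Python sketch. The names `ChunkTimeoutTracker` and `verify_checksum` are hypothetical, the timeout policy (measured from the first chunk seen) is one possible choice, and SHA-256 is an assumed algorithm for `__cs`. Note that a plain hash only detects corruption; it does not by itself prevent chunk injection, since an attacker who can inject chunks can also recompute the hash.

```python
import hashlib
import time
from typing import Callable, Dict, List


def verify_checksum(payload: bytes, expected_hex: str) -> bool:
    """Compare the reassembled payload against the `__cs` value.

    Assumption: `__cs` holds a SHA-256 hex digest.
    """
    return hashlib.sha256(payload).hexdigest() == expected_hex.lower()


class ChunkTimeoutTracker:
    """Tracks when each in-flight message ID was first seen."""

    def __init__(
        self, timeout_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._timeout = timeout_seconds
        self._clock = clock
        self._first_seen: Dict[str, float] = {}

    def note_chunk(self, message_id: str) -> None:
        # Record the arrival time of the first chunk only.
        self._first_seen.setdefault(message_id, self._clock())

    def complete(self, message_id: str) -> None:
        # Call once a message has been fully reassembled.
        self._first_seen.pop(message_id, None)

    def expire(self) -> List[str]:
        """Return and forget message IDs whose timeout has elapsed."""
        now = self._clock()
        expired = [
            mid for mid, seen in self._first_seen.items()
            if now - seen >= self._timeout
        ]
        for mid in expired:
            del self._first_seen[mid]
        return expired
```

A caller would drop the buffered chunks for every ID returned by `expire()` and report the failure to application code, which also covers missing chunk detection for messages that never complete.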

## Open Questions
1. How do we determine the optimal chunk size? Should it be based on the broker's max size, network conditions, or configurable by the application?
2. Do we create a new API method (`PublishLargeAsync()`) or use the existing `PublishAsync()` API with transparent chunking for oversized payloads?
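
Purely to make the first question concrete, here is one way the options could combine, sketched in Python: the broker's maximum packet size acts as a ceiling and an application setting can only lower it. The function name `choose_chunk_size`, the `header_reserve` parameter, and its default of 1024 bytes (room for the topic, chunk metadata, and other packet overhead) are all assumptions for illustration, not a proposed answer.

```python
from typing import Optional


def choose_chunk_size(
    broker_max_packet_size: int,
    header_reserve: int = 1024,
    app_override: Optional[int] = None,
) -> int:
    """Pick a chunk payload size (hypothetical policy).

    The broker limit, minus space reserved for packet overhead, is the
    ceiling; an application-supplied size is honored only below it.
    """
    ceiling = broker_max_packet_size - header_reserve
    if ceiling <= 0:
        raise ValueError("broker limit leaves no room for payload")
    if app_override is None:
        return ceiling
    if app_override <= 0:
        raise ValueError("app_override must be positive")
    return min(app_override, ceiling)
```

Network conditions, the third option in the question, are not modeled here.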
