Open
23 commits
8644e85
Add ADR for large message chunking in MQTT protocol
maximsemenov80 Mar 17, 2025
6de0c8e
accept suggestion
maximsemenov80 Mar 18, 2025
75a767c
accept suggestion
maximsemenov80 Mar 18, 2025
e284083
Address review comments
maximsemenov80 Mar 19, 2025
8c29c44
progress
maximsemenov80 Mar 24, 2025
2251ff3
progress
maximsemenov80 Mar 24, 2025
a8a9cc1
Address coments to large message chunking implementation details
maximsemenov80 May 13, 2025
4967150
progress
maximsemenov80 May 13, 2025
0af2742
Update doc/dev/adr/0020-large-message-chunking.md
maximsemenov80 May 14, 2025
cc724bb
Address review comments
maximsemenov80 Jun 5, 2025
11de62b
progress
maximsemenov80 Jun 6, 2025
0bb4de6
progress
maximsemenov80 Jun 6, 2025
18b36fe
Enhance large message chunking documentation with failure handling an…
maximsemenov80 Jun 9, 2025
833774f
Clarify handling of message chunks for non-chunking-aware clients in …
maximsemenov80 Jun 9, 2025
0199e10
Update 0020-large-message-chunking.md
maximsemenov80 Jun 9, 2025
448f8e2
Update 0020-large-message-chunking.md
maximsemenov80 Jun 10, 2025
aa8e946
Update 0020-large-message-chunking.md
maximsemenov80 Jun 10, 2025
8e9c760
Update 0020-large-message-chunking.md
maximsemenov80 Jun 11, 2025
e323780
rename file
maximsemenov80 Jul 9, 2025
f6894f8
Update 0023-large-message-chunking.md
maximsemenov80 Jul 9, 2025
98baf9b
Update 0023-large-message-chunking.md
maximsemenov80 Jul 9, 2025
f95fe5c
Update 0023-large-message-chunking.md
maximsemenov80 Jul 9, 2025
788dc3d
Update 0023-large-message-chunking.md
maximsemenov80 Aug 5, 2025
100 changes: 100 additions & 0 deletions doc/dev/adr/0020-large-message-chunking.md

I feel a key question we need to answer is - when can the client chunk, vs not.
This will depend on whether the receiver is able to understand our chunking protocol.
So we need to enable this only when both sides of the communication pipe use the same mechanism. One example is mRPC. Telemetry can also be applicable, but I suppose telemetry could also be asymmetric?

@@ -0,0 +1,100 @@
# ADR 20: Large Message Chunking in MQTT Protocol

## Status
Proposed

## Context
The MQTT protocol has inherent message size limitations imposed by brokers and network constraints. Azure IoT Operations scenarios often require transmitting payloads that exceed these limits (e.g., firmware updates, large telemetry batches, complex configurations). Without a standardized chunking mechanism, applications must implement their own fragmentation strategies, leading to inconsistent implementations and interoperability issues.

## Decision
We will implement SDK-level message chunking as part of the Protocol layer to transparently handle messages exceeding the MQTT broker's maximum packet size.

**The chunking mechanism will**:
- Be enabled/disabled by a configuration setting.
- Use standardized user properties for chunk metadata:
  - The `__chunk` user property will contain a JSON object with chunking metadata.
  - The JSON structure will include:
    ```json
    {
      "messageId": "unique-id-for-chunked-message",
      "chunkIndex": 0,
      "timeout": "00:00:10",
      "totalChunks": 5,
      "checksum": "message-hash"
    }
    ```
  - `messageId`, `chunkIndex`, and `timeout` are present in every chunk; `totalChunks` and `checksum` are present only in the first chunk.
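
As a rough illustration of the metadata rules above, the following Python sketch builds the value of the `__chunk` user property. The function name `build_chunk_metadata` is hypothetical, and treating `chunkIndex == 0` as "the first chunk" is an assumption taken from the JSON example rather than something the ADR states outright.

```python
import json
from typing import Optional


def build_chunk_metadata(
    message_id: str,
    chunk_index: int,
    timeout: str,
    total_chunks: Optional[int] = None,
    checksum: Optional[str] = None,
) -> str:
    """Return the JSON string for the `__chunk` user property (hypothetical helper)."""
    # Fields carried by every chunk.
    metadata = {
        "messageId": message_id,
        "chunkIndex": chunk_index,
        "timeout": timeout,
    }
    # Assumption: the first chunk is the one with index 0, and only it
    # carries totalChunks and checksum.
    if chunk_index == 0:
        if total_chunks is None or checksum is None:
            raise ValueError("first chunk must carry totalChunks and checksum")
        metadata["totalChunks"] = total_chunks
        metadata["checksum"] = checksum
    return json.dumps(metadata)
```

A receiver would parse the same property with `json.loads` and use the presence of `totalChunks` to recognize the first chunk.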

**Chunk size calculation**:
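- Maximum chunk size will be derived from the Maximum Packet Size negotiated during the MQTT connection handshake (in MQTT 5 the broker advertises the limit it accepts in the CONNACK packet).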
- A static overhead value will be subtracted from the Maximum Packet Size to account for MQTT packet headers, topic name, user properties, and other metadata.
- The overhead size will be configurable and should be large enough to keep the calculation simple while ensuring every chunk stays under the broker's limit.
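
The calculation described above can be sketched in a few lines of Python. The function names and the numbers in the usage note are illustrative assumptions, not values from the ADR.

```python
def max_chunk_size(max_packet_size: int, overhead: int) -> int:
    """Payload bytes available per chunk after reserving a static overhead."""
    chunk_size = max_packet_size - overhead
    if chunk_size <= 0:
        raise ValueError("overhead must be smaller than the maximum packet size")
    return chunk_size


def chunk_count(payload_length: int, chunk_size: int) -> int:
    """Number of chunks needed for a payload (ceiling division, at least one)."""
    return max(1, (payload_length + chunk_size - 1) // chunk_size)
```

For example, with a hypothetical 1,048,576-byte packet limit and a 4,096-byte overhead, each chunk can carry 1,044,480 bytes of payload, so a 5,000,000-byte message needs 5 chunks.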

**Implementation layer**:
- Chunking will be implemented as middleware in the Protocol layer, between serialization and the MQTT client.
```
Application → Protocol Layer (Serialization) → Chunking Middleware → MQTT Client → Broker
```
- This makes chunking transparent to application code and compatible with all serialization formats.

- Sending Process:
  - When a payload exceeds the maximum packet size, the middleware intercepts it before transmission.
  - The message is split into fixed-size chunks (the last chunk may be smaller).
  - Each chunk is sent as a separate MQTT message on the same topic, with chunk metadata attached.
  - Effort should be made to minimize the user properties copied to every chunk: the first chunk will carry the full set of original user properties, and the remaining chunks only those necessary to reassemble the original message (e.g., the `$partition` property to support shared subscriptions).
  - QoS settings are maintained across all chunks.
- Receiving Process:
  - The chunking-aware client receives messages and identifies chunked messages by the presence of chunk metadata.
  - Chunks are stored in a temporary buffer, indexed by message ID and chunk index.
  - When all chunks for a message ID are received, they are reassembled in order and the message checksum is verified.
  - The reconstructed message is then processed as a single message by the application.
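
The sending and receiving steps above could look roughly like the following Python sketch. It is not the SDK implementation: `split_message` and `Reassembler` are hypothetical names, SHA-256 is assumed as the checksum (the ADR leaves the algorithm open), only the `__chunk` property is modeled, and no MQTT client, timeout handling, or user-property trimming is included.

```python
import hashlib
import json
import uuid
from typing import Dict, List, Optional, Tuple

Chunk = Tuple[Dict[str, str], bytes]  # (user properties, chunk payload)


def split_message(payload: bytes, chunk_size: int, timeout: str = "00:00:10") -> List[Chunk]:
    """Split a payload into fixed-size chunks, each with `__chunk` metadata."""
    message_id = uuid.uuid4().hex
    pieces = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)] or [b""]
    chunks: List[Chunk] = []
    for index, piece in enumerate(pieces):
        meta = {"messageId": message_id, "chunkIndex": index, "timeout": timeout}
        if index == 0:
            # Only the first chunk carries the total count and the checksum.
            meta["totalChunks"] = len(pieces)
            meta["checksum"] = hashlib.sha256(payload).hexdigest()  # assumed algorithm
        chunks.append(({"__chunk": json.dumps(meta)}, piece))
    return chunks


class Reassembler:
    """Buffers chunks by message ID and chunk index until a message is complete."""

    def __init__(self) -> None:
        self._parts: Dict[str, Dict[int, bytes]] = {}
        self._first: Dict[str, dict] = {}

    def add(self, user_properties: Dict[str, str], payload: bytes) -> Optional[bytes]:
        """Store one chunk; return the full message once every chunk has arrived."""
        meta = json.loads(user_properties["__chunk"])
        message_id = meta["messageId"]
        parts = self._parts.setdefault(message_id, {})
        parts[meta["chunkIndex"]] = payload
        if "totalChunks" in meta:
            self._first[message_id] = meta
        first = self._first.get(message_id)
        if first is None or any(i not in parts for i in range(first["totalChunks"])):
            return None  # still waiting for the first chunk or for missing indexes
        message = b"".join(parts[i] for i in range(first["totalChunks"]))
        del self._parts[message_id]
        del self._first[message_id]
        if hashlib.sha256(message).hexdigest() != first["checksum"]:
            raise ValueError("checksum mismatch for message " + message_id)
        return message
```

Because chunks are indexed rather than appended, the reassembler in this sketch tolerates out-of-order delivery: `add` returns `None` until the last missing chunk arrives, then returns the verified payload.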

### Benefits
- **Property Preservation:** Maintains topic, QoS, and other message properties consistently
- **Network Optimized:** Allows efficient transmission of large payloads over constrained networks

### Implementation Considerations
- **Error Handling:**
  - Chunk timeout mechanisms (see Chunk Timeout Mechanism Options in the Appendix)
  - Error propagation to application code
- **Performance Optimization:**
  - Concurrent chunk transmission
  - Efficient memory usage during reassembly
- **Security:**
  - Validate message integrity across chunks and prevent chunk injection attacks (see Checksum Algorithm Options for MQTT Message Chunking in the Appendix)

# Appendix

## Chunk Timeout Mechanism Options

1. Fixed Timeout Window
   - Set a single timeout period after receiving the first chunk
   - If all chunks aren't received within this window, the message is considered failed
   - **Pros**: Simple implementation, predictable behavior
   - **Cons**: Not adaptive to message size or network conditions

2. Sliding Timeout Window
   - Reset the timeout each time a new chunk arrives
   - Only expire the chunked message if there's a long gap between chunks
   - **Pros**: Tolerates varying network conditions and delivery rates
   - **Cons**: Could keep resources allocated for extended periods
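
To make the difference between the two options concrete, here is a minimal Python sketch of a per-message deadline tracker. The class name `ChunkTimeout` and its interface are hypothetical; timestamps are passed in explicitly so the behavior is easy to reason about, and a real receiver would likely supply a monotonic clock reading.

```python
class ChunkTimeout:
    """Tracks the expiry of one chunked message (hypothetical helper)."""

    def __init__(self, timeout_seconds: float, sliding: bool, now: float) -> None:
        self._timeout = timeout_seconds
        self._sliding = sliding
        # The window starts when the first chunk is received.
        self._deadline = now + timeout_seconds

    def on_chunk(self, now: float) -> None:
        """Record a newly arrived chunk; only a sliding window moves the deadline."""
        if self._sliding:
            self._deadline = now + self._timeout

    def expired(self, now: float) -> bool:
        """True once the deadline has passed without the message completing."""
        return now >= self._deadline
```

With a 10-second timeout and a second chunk arriving at t=8, a fixed window expires at t=10, while a sliding window stays open until t=18.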

## Checksum Algorithm Options for MQTT Message Chunking

1. MD5
   - **Description**: 128-bit hash function
   - **Pros**: Good performance, reasonable size (16 bytes), widely implemented
   - **Cons**: No longer considered cryptographically secure
   - **Best for**: Basic integrity verification without security requirements

2. SHA-256
   - **Description**: Secure hash algorithm producing 256-bit output
   - **Pros**: Cryptographically secure, widely supported in all target languages
   - **Cons**: Larger output size (32 bytes), more computation required
   - **Best for**: Applications requiring message security and tamper protection

3. BLAKE2b
   - **Description**: Modern cryptographic hash function
   - **Pros**: Faster than MD5 but with security comparable to SHA-3
   - **Cons**: May not be as universally available in standard libraries
   - **Best for**: Performance-critical applications that still need security
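
For comparison, the three options can be computed with Python's standard `hashlib` module, as in the sketch below. The helper name `message_checksum` is hypothetical, and the choice of Python is purely illustrative; the ADR does not say which languages or libraries the SDK would use. Note that `hashlib.blake2b` produces a 64-byte digest by default and accepts a smaller `digest_size`.

```python
import hashlib

# Map each option discussed above to its standard-library constructor.
ALGORITHMS = {
    "md5": hashlib.md5,
    "sha256": hashlib.sha256,
    "blake2b": hashlib.blake2b,
}


def message_checksum(algorithm: str, payload: bytes) -> str:
    """Return the hex digest of the full, unchunked payload (hypothetical helper)."""
    try:
        constructor = ALGORITHMS[algorithm]
    except KeyError:
        raise ValueError("unsupported checksum algorithm: " + algorithm)
    return constructor(payload).hexdigest()
```

The sender would compute the digest over the complete payload before splitting, and the receiver would recompute it after reassembly and compare the two values.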