Commit e615c83

doc: update CHUNKS.md with extended layer info
Signed-off-by: Eduardo Silva <[email protected]>
1 parent ef6d58a commit e615c83

1 file changed: +107 -1 lines changed

CHUNKS.md

Lines changed: 107 additions & 1 deletion
@@ -40,7 +40,7 @@ The following is the layout of a Chunk in the file system:
 |    4 BYTES CRC32 + 16 BYTES   +--> CRC32(Content) + Padding
 +-------------------------------+
 |            Content            |
-|  +-------------------------+  |
+|  +------------------------+  |
 |  |         2 BYTES         +-----> Metadata Length
 |  +-------------------------+  |
 |  +-------------------------+  |
@@ -91,6 +91,112 @@ Content Data < | | records | |
Fluent Bit API provides backward compatibility with the previous metadata and content
format found on series v1.8.

Starting with the Fluent Bit release that introduces direct route persistence, the
fourth metadata byte carries feature flags. A zero value preserves the legacy
layout, while a non-zero value indicates that additional structures follow the tag.
When the ``FLB_CHUNK_FLAG_DIRECT_ROUTES`` bit is set, the tag is terminated with a
single ``\0`` byte and a routing payload is appended. Fluent Bit v4.2 and later
also set the ``FLB_CHUNK_FLAG_DIRECT_ROUTE_LABELS`` bit to store each destination's
alias (or generated name) alongside its numeric identifier so routes can survive
configuration changes that renumber outputs. If any stored identifier exceeds
65535, the ``FLB_CHUNK_FLAG_DIRECT_ROUTE_WIDE_IDS`` bit is enabled and each ID is
encoded using four bytes so large configurations remain routable after a restart.
When plugin names are stored, the ``FLB_CHUNK_FLAG_DIRECT_ROUTE_PLUGIN_IDS`` bit is
set to enable type-safe routing by matching plugin names:

```
                          +--------------------------+-------+
                          | 0xF1                     | 0x77  | <- Magic Bytes
                          +--------------------------+-------+
                          | Type                     | Flags | <- Chunk type and flag bits
                          +--------------------------+--------------------------+
                          | Tag string (no size prefix)                         | <- Tag associated to records
                          +-----------------------------------------------------+
                          | 0x00 (Tag terminator)                               | <- Present when flags != 0
Routing Payload Start ----+-----------------------------------------------------+
                          | Routing Length (uint16_t big endian)                | <- Total size of routing
                          |                                                     |    payload (excluding this
                          |                                                     |    2-byte field)
                          +-----------------------------------------------------+
                          | Route Count (uint16_t big endian)                   | <- Number of output
                          |                                                     |    destinations stored
                          +-----------------------------------------------------+
                          | Output IDs (route_count entries)                    | <- Each stored as uint16_t
                          |                                                     |    (big endian) or uint32_t
                          |                                                     |    when FLB_CHUNK_FLAG_
                          |                                                     |    DIRECT_ROUTE_WIDE_IDS
                          +-----------------------------------------------------+
                          | Label Lengths (route_count entries)                 | <- Present when FLB_CHUNK_
                          |                                                     |    FLAG_DIRECT_ROUTE_LABELS
                          |                                                     |    Each uint16_t big endian
                          |                                                     |    with bit 15 encoding alias
                          |                                                     |    flag (0x8000) and bits
                          |                                                     |    0-14 the length (0x7FFF)
                          +-----------------------------------------------------+
                          | Label Strings (concatenated, no null)               | <- Present when FLB_CHUNK_
                          |                                                     |    FLAG_DIRECT_ROUTE_LABELS
                          |                                                     |    Variable length
                          +-----------------------------------------------------+
                          | Plugin Name Lengths (route_count entries)           | <- Present when FLB_CHUNK_
                          |                                                     |    FLAG_DIRECT_ROUTE_PLUGIN_IDS
                          |                                                     |    Each uint16_t big endian
                          +-----------------------------------------------------+
                          | Plugin Name Strings (concatenated, no null)         | <- Present when FLB_CHUNK_
                          |                                                     |    FLAG_DIRECT_ROUTE_PLUGIN_IDS
                          |                                                     |    Variable length
                          |                                                     |
Routing Payload End ------+-----------------------------------------------------+
```
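
The header part of this layout can be read with a handful of bounds checks. The following is a minimal sketch, not Fluent Bit's actual parser: `ex_meta_header` and `ex_parse_meta` are hypothetical names, and the function assumes the caller already knows the metadata size (from the 2-byte Metadata Length field) and passes exactly that many bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view over a chunk metadata buffer (illustration only). */
struct ex_meta_header {
    uint8_t type;
    uint8_t flags;
    const char *tag;          /* not null-terminated in the legacy layout */
    size_t tag_len;
    const uint8_t *routing;   /* NULL for the legacy layout */
    size_t routing_len;       /* bytes available after the tag terminator */
};

/* Returns 0 on success, -1 when the buffer does not match the layout. */
static int ex_parse_meta(const uint8_t *buf, size_t len,
                         struct ex_meta_header *out)
{
    const uint8_t *tag_start;
    const uint8_t *nul;

    assert(buf != NULL && out != NULL);

    /* Two magic bytes, then the type byte and the flags byte. */
    if (len < 4 || buf[0] != 0xF1 || buf[1] != 0x77) {
        return -1;
    }
    out->type = buf[2];
    out->flags = buf[3];
    tag_start = buf + 4;
    out->tag = (const char *) tag_start;
    out->routing = NULL;
    out->routing_len = 0;

    if (out->flags == 0) {
        /* Legacy layout: the tag fills the rest of the metadata. */
        out->tag_len = len - 4;
        return 0;
    }

    /* Extended layout: the tag ends at the first 0x00 byte. */
    nul = memchr(tag_start, 0x00, len - 4);
    if (nul == NULL) {
        return -1;
    }
    out->tag_len = (size_t) (nul - tag_start);
    out->routing = nul + 1;
    out->routing_len = len - 4 - out->tag_len - 1;
    return 0;
}
```

A caller would branch on `routing != NULL` to decide whether a routing payload needs to be decoded.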

The routing payload captures the direct route mapping so that filesystem chunks
loaded by the storage backlog reuse the same outputs after a restart. Chunks
without direct routes keep the legacy layout (flags byte set to zero) and remain
fully backward compatible across Fluent Bit versions. When labels are stored, the
reader first reconstructs routes by matching aliases or generated names and only
falls back to numeric identifiers if the textual metadata cannot be matched. This
ensures that chunks continue to flow to the intended destinations even when the
output configuration is reordered.

**Routing Payload Structure**: The routing payload begins immediately after the tag
terminator and extends for the number of bytes specified by the Routing Length
field. The Routing Length field stores the total size of all routing data (excluding
the 2-byte Routing Length field itself): the Route Count field, all Output IDs, and,
if present, the Label Lengths, Label Strings, Plugin Name Lengths, and Plugin Name
Strings. The Route Count field indicates how many output destinations are encoded
in the routing payload. Logically, each route consists of one Output ID, optionally
one label, and optionally one plugin name. These fields are not interleaved per
route: they are stored as parallel arrays in route order, with each array of
lengths preceding its corresponding block of string data.
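
The parallel-array arrangement can be walked with a single cursor. The sketch below is illustrative only: the `EX_FLAG_*` values are hypothetical stand-ins (the text names the flags but not their bit values), and `ex_routing_view`, `ex_measure` and `ex_parse_routing` are invented names. It records where each array starts and checks every size against the Routing Length field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values standing in for FLB_CHUNK_FLAG_DIRECT_ROUTE_*. */
#define EX_FLAG_LABELS      0x02
#define EX_FLAG_WIDE_IDS    0x04
#define EX_FLAG_PLUGIN_IDS  0x08

struct ex_routing_view {
    uint16_t route_count;
    const uint8_t *ids;          /* route_count entries, 2 or 4 bytes each */
    const uint8_t *label_lens;   /* NULL when labels are absent */
    const uint8_t *label_data;
    const uint8_t *plugin_lens;  /* NULL when plugin names are absent */
    const uint8_t *plugin_data;
};

static uint16_t ex_be16(const uint8_t *p)
{
    return (uint16_t) ((p[0] << 8) | p[1]);
}

/* Measure one lengths array plus the string block that follows it. */
static int ex_measure(const uint8_t *p, size_t avail, uint16_t count,
                      uint16_t mask, size_t *used)
{
    size_t lens_size = (size_t) count * 2;
    size_t total = 0;
    size_t i;

    if (avail < lens_size) {
        return -1;
    }
    for (i = 0; i < count; i++) {
        total += (size_t) (ex_be16(p + i * 2) & mask);
    }
    if (avail - lens_size < total) {
        return -1;
    }
    *used = lens_size + total;
    return 0;
}

/* buf points at the Routing Length field; returns 0 on success, -1 on error. */
static int ex_parse_routing(const uint8_t *buf, size_t len, uint8_t flags,
                            struct ex_routing_view *out)
{
    size_t id_size = (flags & EX_FLAG_WIDE_IDS) ? 4 : 2;
    size_t pos;
    size_t end;
    size_t used;

    assert(buf != NULL && out != NULL);
    if (len < 4) {
        return -1;
    }
    end = 2 + (size_t) ex_be16(buf);  /* Routing Length excludes its own 2 bytes */
    if (end < 4 || end > len) {
        return -1;
    }
    out->route_count = ex_be16(buf + 2);
    out->label_lens = out->label_data = NULL;
    out->plugin_lens = out->plugin_data = NULL;

    pos = 4;
    if (end - pos < out->route_count * id_size) {
        return -1;
    }
    out->ids = buf + pos;
    pos += out->route_count * id_size;

    if (flags & EX_FLAG_LABELS) {
        /* Bit 15 of each label length is the alias flag, so mask it off. */
        if (ex_measure(buf + pos, end - pos, out->route_count, 0x7FFF, &used) != 0) {
            return -1;
        }
        out->label_lens = buf + pos;
        out->label_data = buf + pos + (size_t) out->route_count * 2;
        pos += used;
    }
    if (flags & EX_FLAG_PLUGIN_IDS) {
        if (ex_measure(buf + pos, end - pos, out->route_count, 0xFFFF, &used) != 0) {
            return -1;
        }
        out->plugin_lens = buf + pos;
        out->plugin_data = buf + pos + (size_t) out->route_count * 2;
    }
    return 0;
}
```

The view only stores pointers into the original buffer, so nothing is copied until a route is actually matched.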

**Labels**: Labels are textual identifiers used to match output instances when
restoring routes from chunk metadata. Unlike numeric IDs, which can be reassigned
when outputs are reordered, labels identify outputs in a way that survives
configuration changes. Labels come in two forms: aliases and generated names. An
alias is a user-provided identifier set via the ``Alias`` configuration property
to identify a specific output instance. A generated name is created automatically
when no alias is provided, following the pattern ``{plugin_name}.{sequence_number}``
(e.g., ``stdout.0``, ``stdout.1``, ``http.0``). The system stores the alias if one
exists and otherwise falls back to the generated name. When restoring routes, the
reader first attempts to match stored labels against current output aliases,
then against current generated names, and only falls back to numeric ID matching
if no label was stored. Label-based matching keeps chunks routed to the correct
outputs even when output IDs change because the configuration was reordered.
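
The matching order described in this paragraph can be sketched as a small lookup. This is an illustration under stated assumptions, not the actual reader: `ex_output` is a hypothetical stand-in for a configured output instance, and `ex_find_output` follows the order given here (aliases, then generated names, then the numeric ID only when no label was stored).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of a configured output; not a Fluent Bit struct. */
struct ex_output {
    uint32_t id;
    const char *alias;   /* NULL when no Alias is configured */
    const char *name;    /* generated name, e.g. "stdout.0" */
};

/* Stored labels are not null-terminated, so compare by explicit length. */
static int ex_label_eq(const char *s, const char *label, size_t len)
{
    return s != NULL && strlen(s) == len && memcmp(s, label, len) == 0;
}

/* Pass label == NULL when the chunk stored no label for this route. */
static const struct ex_output *ex_find_output(const struct ex_output *outs,
                                              size_t n,
                                              const char *label,
                                              size_t label_len,
                                              uint32_t stored_id)
{
    size_t i;

    assert(outs != NULL || n == 0);

    if (label != NULL) {
        /* First pass: user-provided aliases. */
        for (i = 0; i < n; i++) {
            if (ex_label_eq(outs[i].alias, label, label_len)) {
                return &outs[i];
            }
        }
        /* Second pass: generated names. */
        for (i = 0; i < n; i++) {
            if (ex_label_eq(outs[i].name, label, label_len)) {
                return &outs[i];
            }
        }
        return NULL;
    }

    /* No label stored: fall back to the numeric identifier. */
    for (i = 0; i < n; i++) {
        if (outs[i].id == stored_id) {
            return &outs[i];
        }
    }
    return NULL;
}
```

Because a stored label takes precedence, swapping two outputs in the configuration changes their IDs but not the result of the lookup.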

**Label Length Encoding**: When labels are present, each label length is stored as
a 16-bit big-endian value. The most significant bit (0x8000) encodes whether
the label is an alias (1) or a generated name (0), and the lower 15 bits (0x7FFF)
hold the actual length, so a label can be at most 32767 bytes long. This lets the
reader distinguish user-provided aliases from auto-generated names when matching
routes.
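
As a worked example of this encoding, the sketch below packs and unpacks one label length entry. The masks 0x8000 and 0x7FFF come from the text; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EX_LABEL_ALIAS_BIT 0x8000u  /* bit 15: 1 = alias, 0 = generated name */
#define EX_LABEL_LEN_MASK  0x7FFFu  /* bits 0-14: label length in bytes */

/* Write one label length entry as two big-endian bytes. */
static void ex_encode_label_len(uint16_t len, int is_alias, uint8_t out[2])
{
    uint16_t raw;

    assert(len <= EX_LABEL_LEN_MASK);
    raw = (uint16_t) (len | (is_alias ? EX_LABEL_ALIAS_BIT : 0u));
    out[0] = (uint8_t) (raw >> 8);
    out[1] = (uint8_t) (raw & 0xFF);
}

/* Read one entry back, returning the length and setting *is_alias. */
static uint16_t ex_decode_label_len(const uint8_t in[2], int *is_alias)
{
    uint16_t raw = (uint16_t) ((in[0] << 8) | in[1]);

    *is_alias = (raw & EX_LABEL_ALIAS_BIT) != 0;
    return (uint16_t) (raw & EX_LABEL_LEN_MASK);
}
```

For instance, a 4-byte alias is written as the bytes `0x80 0x04`, while an 8-byte generated name is written as `0x00 0x08`.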

**Plugin Names**: When plugin names are stored (``FLB_CHUNK_FLAG_DIRECT_ROUTE_PLUGIN_IDS``),
each route includes the plugin type name (e.g., "stdout", "http") to enable type-safe
matching. This prevents routing to an output of a different plugin type that happens to
share the same alias or name. Plugin name lengths are stored as 16-bit big-endian values,
followed by the concatenated plugin name strings without null terminators.
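
Since the names are concatenated without terminators, finding the name for route `index` means summing the lengths that precede it. A minimal sketch of that lookup and the type check follows; `ex_plugin_matches` is a hypothetical helper, and it assumes the lengths array and string block were already bounds-checked against the Routing Length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Returns 1 when the plugin name stored for route `index` equals `plugin`,
 * 0 otherwise. `lens` is the Plugin Name Lengths array and `data` is the
 * Plugin Name Strings block.
 */
static int ex_plugin_matches(const uint8_t *lens, const uint8_t *data,
                             uint16_t route_count, uint16_t index,
                             const char *plugin)
{
    size_t off = 0;
    size_t len;
    uint16_t i;

    assert(lens != NULL && data != NULL && plugin != NULL);
    if (index >= route_count) {
        return 0;
    }
    /* Skip the names of all earlier routes. */
    for (i = 0; i < index; i++) {
        off += (size_t) ((lens[i * 2] << 8) | lens[i * 2 + 1]);
    }
    len = (size_t) ((lens[index * 2] << 8) | lens[index * 2 + 1]);

    return strlen(plugin) == len && memcmp(data + off, plugin, len) == 0;
}
```

A candidate output found by label would be accepted only if this check also passes for its plugin type.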

### Fluent Bit <= v1.8

Up to Fluent Bit <= 1.8.x, the metadata and content data is simple, where metadata
