chunks: extend metadata to support routing information (direct routes) #11097

edsiper · 2025-11-02T18:51:49Z

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

New Features
- Chunks can persist and restore direct output routes (including optional labels/aliases and extended IDs), preserving destinations across restarts and falling back to tag-based routing with warnings if targets are missing.
Documentation
- Expanded metadata and routing format docs covering flags, route payload layout, legacy behavior, and lifecycle across restarts.
Improvements
- More robust file-mapping, metadata-write and resize handling with improved warnings and error propagation.
Tests
- Added unit/integration tests for direct-route metadata, metadata updates, tag handling, and platform file-mapping.

coderabbitai · 2025-11-02T18:52:07Z

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds persistent direct-route metadata to chunk headers, reader/writer APIs to store/parse routes (IDs, optional labels/plugins), task-time restoration that resolves outputs by id/label/alias/plugin, and multiple Chunk I/O robustness and test updates for mapping, metadata growth, and Windows behavior.

Changes

Cohort / File(s)	Summary
Docs `CHUNKS.md`	Expanded metadata spec: new feature-flags byte, null-terminated tag when flags != 0, routing payload layout (routing length, route count, per-route 16/32-bit IDs), optional label/plugin length+strings, alias encoding and legacy compatibility notes.
Public API / Headers `include/fluent-bit/flb_input_chunk.h`	Added forward declarations, new `FLB_CHUNK_FLAG_DIRECT_ROUTE_*` flags and label constants, `struct flb_chunk_direct_route`, and prototypes: `flb_input_chunk_write_header_v2`, `flb_chunk_route_plugin_matches`, `flb_input_chunk_has_direct_routes`, `flb_input_chunk_get_direct_routes`, `flb_input_chunk_destroy_direct_routes`.
Chunk I/O core `lib/chunkio/src/cio_file.c`, `lib/chunkio/include/chunkio/cio_file.h`, `lib/chunkio/src/cio_file_unix.c`, `lib/chunkio/src/cio_file_win32.c`	Added `ctx`, `auto_remap_warned`, `map_truncated_warned` to `cio_file`; tightened mmap/munmap/unmap/sync logic, improved mapping-size handling (Windows/Unix), error propagation, and truncation/remap warnings.
Input chunk core `src/flb_input_chunk.c`	Added metadata view parser (`flb_input_chunk_meta_view`), magic-byte detection, direct-route metadata parsing/getters, updated `flb_input_chunk_get_tag` to use metadata view, and `flb_input_chunk_write_header_v2` to write v2 headers.
Writer integration `src/flb_input_log.c`	Collects eligible direct routes (IDs, optional labels and plugin names), builds route array and calls `flb_input_chunk_write_header_v2` to persist routing metadata; handles allocations/errors and cleanup.
Task restoration `src/flb_task.c`	New resolver helper to match outputs by id/label/alias/plugin; `flb_task_create` extended to load/validate/restore direct routes (create INACTIVE per-route entries) or fall back to tag-based routing with proper cleanup and logging.
Tests `tests/internal/CMakeLists.txt`, `tests/internal/input_chunk_routes.c`, `lib/chunkio/tests/*`	Added unit tests `input_chunk_routes.c` for direct-route round-trip and alias/plugin matching; added Chunk I/O tests (`metadata_update.c`, unsigned-underflow test, `fs_windows.c`) and registered them in CMakeLists.
CI / Packaging `lib/chunkio/.github/workflows/ci.yaml`, `lib/chunkio/CMakeLists.txt`	Updated Windows runner matrix and MSVC setup, switched Windows build to CMake/NMake invocation, updated test invocation, and bumped CIO patch version.

Sequence Diagram(s)

sequenceDiagram
    participant Writer as Input Log Writer
    participant ChunkFS as Chunk (fs-backed)
    participant Task as flb_task_create
    participant Outputs as Output Registry

    Writer->>Writer: collect direct-route entries (id, label?, plugin?)
    Writer->>ChunkFS: flb_input_chunk_write_header_v2(tag, routes[])
    ChunkFS->>ChunkFS: persist flags + null-terminated tag + routing payload

    Note right of Task: restart / map time
    Task->>ChunkFS: flb_input_chunk_has_direct_routes?
    alt direct routes present
        ChunkFS-->>Task: yes
        Task->>ChunkFS: flb_input_chunk_get_direct_routes -> routes[]
        loop per route
            Task->>Outputs: resolve by id or label/alias, check plugin match
            Outputs-->>Task: output instance(s) or missing
        end
        Task->>Task: attach INACTIVE per-route entries for resolved outputs
    else none/invalid
        ChunkFS-->>Task: no/invalid
        Task->>Task: fall back to tag-based routing
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:
- Metadata layout: offsets, endianness, 16/32-bit ID handling, and null-termination boundaries.
- Ownership and lifecycle of allocated flb_chunk_direct_route entries and label/plugin strings.
- flb_task_create restoration logic and fallback paths (race/error/partial-restore handling).
- Chunk I/O mmap/unmap behavior and platform-specific mapping size adjustments (Windows vs Unix).
- New and expanded unit tests correctness and teardown robustness.

Possibly related PRs

router: add contexts to conditional support for direct route paths #11091 — Overlaps direct-route handling and task restoration; touches similar flb_task.c paths.
router: add conditional support for Logs #11055 — Related modifications to chunk routing and task creation code.
cmake: backport #10778 and #10786 to 4.0. #10795 — Related changes around flb_input_chunk declarations and chunk metadata handling.

Suggested reviewers

koleini
fujimotos

Poem

🐰 I hopped through flags and bytes with cheer,

I tucked route IDs and labels snug and tight,
A null-terminated tag to guide on restart,
When routes are missing tags will light the way,
A little hop — persistence saved the day.

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the main change: extending chunk metadata to support direct route persistence for routing information.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aced5ff and 4e101e9.

📒 Files selected for processing (1)

tests/internal/input_chunk_routes.c (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-11-02T18:55:29Z

src/flb_input_log.c

+            }
+
+            if (direct_index < direct_count) {
+                direct_ids[direct_index] = (uint8_t) route_path->ins->id;
+                direct_index++;


Avoid truncating output IDs in direct-route metadata

When persisting direct routes the code stores each destination’s id in a uint8_t buffer (direct_ids[direct_index] = (uint8_t) route_path->ins->id). Output instance IDs are plain ints and are assigned sequentially without an 8‑bit limit, so deployments with 256+ outputs will wrap IDs when writing metadata. After a restart these truncated values can either fail to resolve or resolve to the wrong output, causing chunks to be routed incorrectly or silently fall back to default routing. Use a wider type (e.g., uint16_t or int) or validate that IDs fit in 8 bits before serializing.

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

CHUNKS.md (1)
100-117: Add a language hint to the new fenced block.

markdownlint (MD040) flags this new fence; adding a language like text keeps the pipeline green without changing the rendering.
-```
+```text
 ...
-```
+```

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b7cfba and a8b7f5d.

📒 Files selected for processing (8)

CHUNKS.md (1 hunks)
include/fluent-bit/flb_input_chunk.h (3 hunks)
lib/chunkio/src/cio_file.c (1 hunks)
src/flb_input_chunk.c (8 hunks)
src/flb_input_log.c (2 hunks)
src/flb_task.c (3 hunks)
tests/internal/CMakeLists.txt (1 hunks)
tests/internal/input_chunk_routes.c (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (5)

src/flb_input_log.c (3)

src/flb_routes_mask.c (1)

flb_routes_mask_get_bit (125-140)

src/flb_input_chunk.c (2)

flb_input_chunk_write_header_v2 (1074-1151)

flb_input_chunk_get_name (2469-2475)

include/fluent-bit/flb_mem.h (1)

flb_free (126-128)

src/flb_task.c (4)

src/flb_input_chunk.c (3)

flb_input_chunk_has_direct_routes (1153-1176)

flb_input_chunk_get_direct_routes (1178-1235)

flb_input_chunk_get_name (2469-2475)

include/fluent-bit/flb_mem.h (2)

flb_calloc (84-96)

flb_free (126-128)

src/flb_router_condition.c (1)

flb_router_chunk_context_destroy (187-190)

src/flb_input.c (1)

flb_input_name (790-797)

include/fluent-bit/flb_input_chunk.h (1)

src/flb_input_chunk.c (3)

flb_input_chunk_write_header_v2 (1074-1151)

flb_input_chunk_has_direct_routes (1153-1176)

flb_input_chunk_get_direct_routes (1178-1235)

src/flb_input_chunk.c (2)

src/flb_routes_mask.c (2)

flb_routes_mask_set_bit (76-91)

flb_routes_mask_set_by_tag (31-67)

include/fluent-bit/flb_mem.h (2)

flb_free (126-128)

flb_calloc (84-96)

tests/internal/input_chunk_routes.c (2)

include/fluent-bit/flb_mem.h (1)

flb_free (126-128)

src/flb_input_chunk.c (4)

flb_input_chunk_write_header_v2 (1074-1151)

flb_input_chunk_has_direct_routes (1153-1176)

flb_input_chunk_get_direct_routes (1178-1235)

flb_input_chunk_get_tag (2519-2536)

🪛 markdownlint-cli2 (0.18.1)

CHUNKS.md

100-100: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: pr-compile-without-cxx (3.31.6)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: PR - fuzzing test
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-centos-7

🔇 Additional comments (1)

lib/chunkio/src/cio_file.c (1)

1109-1109: Good catch on preserving the full content region.

Switching the relocation to copy cf->data_size bytes fixes the data truncation that happened when shrinking metadata layouts. Thanks for tightening this up.

src/flb_input_chunk.c

coderabbitai · 2025-11-02T19:10:46Z

src/flb_task.c

+    if (flb_input_chunk_has_direct_routes(task_ic) == FLB_TRUE) {
+        stored_routes_result = flb_input_chunk_get_direct_routes(task_ic,
+                                                                 &direct_output_ids,
+                                                                 &direct_output_count);
+        if (stored_routes_result == 0 && direct_output_count > 0) {
+            stored_routes_valid = FLB_TRUE;
+            missing_output_id = 0;
+            for (direct_output_index = 0;
+                 direct_output_index < direct_output_count;
+                 direct_output_index++) {
+                stored_output = task_find_output_by_id(config,
+                                                       direct_output_ids[direct_output_index]);
+                if (!stored_output) {
+                    stored_routes_valid = FLB_FALSE;
+                    missing_output_id = direct_output_ids[direct_output_index];
+                    break;
+                }
+            }
+
+            if (stored_routes_valid == FLB_TRUE) {
+                direct_count = 0;
+                stored_routes_alloc_failed = FLB_FALSE;
+                for (direct_output_index = 0;
+                     direct_output_index < direct_output_count;
+                     direct_output_index++) {
+                    stored_output = task_find_output_by_id(config,
+                                                           direct_output_ids[direct_output_index]);
+                    if (!stored_output) {
+                        continue;
+                    }
+
+                    route = flb_calloc(1, sizeof(struct flb_task_route));
+                    if (!route) {
+                        flb_errno();
+                        stored_routes_alloc_failed = FLB_TRUE;
+                        break;
+                    }
+
+                    route->status = FLB_TASK_ROUTE_INACTIVE;
+                    route->out = stored_output;
+                    mk_list_add(&route->_head, &task->routes);
+                    direct_count++;
+                }
+
+                if (stored_routes_alloc_failed == FLB_TRUE) {
+                    if (router_context_initialized) {
+                        flb_router_chunk_context_destroy(&router_context);
+                        router_context_initialized = FLB_FALSE;
+                    }
+                    if (direct_output_ids) {
+                        flb_free(direct_output_ids);
+                    }
+                    task->event_chunk->data = NULL;
+                    flb_task_destroy(task, FLB_TRUE);
+                    return NULL;
+                }
+
+                if (direct_count > 0) {
+                    stored_routes_used = FLB_TRUE;
+                }
+            }
+            else {
+                flb_warn("[task] input=%s/%s stored direct route id=%u not found for chunk %s, falling back to configured routes",
+                         i_ins->p->name,
+                         flb_input_name(i_ins),
+                         missing_output_id,
+                         flb_input_chunk_get_name(task_ic));
+            }
+        }
+        else if (stored_routes_result == -2) {
+            flb_warn("[task] input=%s/%s invalid stored direct routing metadata for chunk %s, falling back to configured routes",
+                     i_ins->p->name,
+                     flb_input_name(i_ins),
+                     flb_input_chunk_get_name(task_ic));
+        }
+    }
+
+    if (stored_routes_used == FLB_TRUE) {


⚠️ Potential issue | 🔴 Critical

Stored direct route IDs need a stable key

Restoring the routes solely by matching the persisted uint8_t to o_ins->id assumes that output ids never change. In practice those ids are assigned sequentially when outputs are instantiated, so adding/removing/reordering outputs (a common config change between restarts) will shift the numbering. After such a change a backlog chunk will happily bind to whatever new output now owns the same numeric id, sending data to the wrong destination with no warning. We need to persist and validate a stable identifier (e.g., alias/name + plugin type hash) alongside the id, and refuse/directly fall back to normal routing when it no longer matches. Please extend the metadata format and the restore path accordingly (and cover the equivalent restoration in flb_input_chunk_map() too).

🤖 Prompt for AI Agents

In src/flb_task.c around lines 459-536, the current restore logic matches stored direct routes only by persisted uint8_t output id which can change; update the metadata to persist a stable key (e.g., output alias/name plus plugin-type hash) alongside the numeric id, and change the restore loop to read both id and stable key, locate the output by verifying the stable key matches the candidate output (not just id), and if the stable key does not match treat it as a missing route (log a warning and fall back to configured routes). Ensure backward compatibility by continuing to accept legacy entries that only include numeric id (but prefer failing safe if validation cannot be performed), update the code paths that write route metadata to include the new stable key, mirror the same validation logic in flb_input_chunk_map(), free any allocated memory on errors, destroy router_context as before, and add a concise warning message when a stored route is rejected due to stable-id mismatch so operators can detect config drift.

edsiper · 2025-11-02T21:15:48Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-11-02T21:20:22Z

src/flb_task.c

+static struct flb_output_instance *task_find_output_by_id(struct flb_config *config,
+                                                          uint8_t id)
+{
+    struct mk_list *head;
+    struct flb_output_instance *o_ins;
+
+    if (!config) {
+        return NULL;
+    }
+
+    mk_list_foreach(head, &config->outputs) {
+        o_ins = mk_list_entry(head, struct flb_output_instance, _head);
+        if (o_ins->id == id) {
+            return o_ins;
+        }


Restore outputs using 8-bit IDs only

Direct-route metadata stores output IDs as 16-bit values (uint16_t), but the helper used during task recovery accepts a uint8_t. Any output whose ID exceeds 255 will be truncated when task_find_output_by_id searches for the persisted route, so chunks with valid stored routes will fail to bind to their original outputs and will fall back to tag routing (or be dropped) after restart. The lookup function should take at least a 16-bit or native int to match the stored metadata.

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

src/flb_input_chunk.c (1)

919-994: ** Critical: Direct route restoration lacks stable output validation.**

Lines 927-950 validate stored routes only by numeric output ID. Since IDs are reassigned sequentially on restart, config changes (reordering/replacing outputs) will cause backlog chunks to route to different destinations. The code must persist and validate a stable identifier (output alias/name + plugin type) alongside the numeric ID, rejecting routes when the stable identifier no longer matches.

Extend the routing metadata to include output stable identifiers, update the parse/validation logic to check them, and fall back to tag routing with a warning when validation fails. This mirrors the same issue in src/flb_task.c (Lines 459-536).
src/flb_task.c (2)
74-92: ** Critical: Output ID type mismatch causes truncation.**

The helper accepts uint8_t id (Line 75) but direct-route metadata stores 16-bit output IDs. Any output with ID ≥ 256 will be truncated during lookup, causing chunks to fail restoration or bind to the wrong output. Change the parameter to at least uint16_t or int.

Apply this diff:
-static struct flb_output_instance *task_find_output_by_id(struct flb_config *config,
-                                                          uint8_t id)
+static struct flb_output_instance *task_find_output_by_id(struct flb_config *config,
+                                                          uint16_t id)
 {
     struct mk_list *head;
     struct flb_output_instance *o_ins;
459-536: ** Critical: Stored routes lack stable output identifiers.**

The restoration logic (Lines 469-476) validates outputs solely by numeric ID, which are reassigned sequentially each restart. Config changes that reorder/swap outputs will cause stored chunks to route to different destinations silently. The metadata must include a stable key (e.g., output alias/name + plugin type) and this code must verify it matches before accepting the stored route, falling back to tag routing on mismatch.

Extend the metadata format to persist output alias/name alongside IDs, update flb_input_chunk_get_direct_routes to return stable identifiers, and add validation here that compares them. Log a warning when stored routes are rejected due to identifier mismatch.

🧹 Nitpick comments (1)

CHUNKS.md (1)
100-117: Specify language for fenced code block.

The fenced code block should specify a language identifier for proper syntax highlighting.

Based on static analysis.

Apply this diff:
-```
+```text
 Metadata
                 --   +---------+-------+

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8b7f5d and c9b7465.

📒 Files selected for processing (8)

CHUNKS.md (1 hunks)
include/fluent-bit/flb_input_chunk.h (3 hunks)
lib/chunkio/src/cio_file.c (1 hunks)
src/flb_input_chunk.c (8 hunks)
src/flb_input_log.c (2 hunks)
src/flb_task.c (3 hunks)
tests/internal/CMakeLists.txt (1 hunks)
tests/internal/input_chunk_routes.c (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

include/fluent-bit/flb_input_chunk.h
tests/internal/CMakeLists.txt

🧰 Additional context used

🧬 Code graph analysis (4)

src/flb_input_log.c (3)

src/flb_routes_mask.c (1)

flb_routes_mask_get_bit (125-140)

src/flb_input_chunk.c (2)

flb_input_chunk_write_header_v2 (1075-1162)

flb_input_chunk_get_name (2485-2491)

include/fluent-bit/flb_mem.h (1)

flb_free (126-128)

src/flb_task.c (4)

src/flb_input_chunk.c (3)

flb_input_chunk_has_direct_routes (1164-1187)

flb_input_chunk_get_direct_routes (1189-1251)

flb_input_chunk_get_name (2485-2491)

include/fluent-bit/flb_mem.h (2)

flb_calloc (84-96)

flb_free (126-128)

src/flb_router_condition.c (1)

flb_router_chunk_context_destroy (187-190)

src/flb_input.c (1)

flb_input_name (790-797)

tests/internal/input_chunk_routes.c (2)

include/fluent-bit/flb_mem.h (1)

flb_free (126-128)

src/flb_input_chunk.c (4)

flb_input_chunk_write_header_v2 (1075-1162)

flb_input_chunk_has_direct_routes (1164-1187)

flb_input_chunk_get_direct_routes (1189-1251)

flb_input_chunk_get_tag (2535-2552)

src/flb_input_chunk.c (2)

src/flb_routes_mask.c (2)

flb_routes_mask_set_bit (76-91)

flb_routes_mask_set_by_tag (31-67)

include/fluent-bit/flb_mem.h (2)

flb_free (126-128)

flb_calloc (84-96)

🪛 GitHub Actions: Run unit tests

lib/chunkio/src/cio_file.c

[error] 1-1: Unit test suite reported failures: 1 of 169 tests failed (flb-it-input_chunk_routes).

tests/internal/input_chunk_routes.c

[error] 109-109: Test 'flb-it-input_chunk_routes' failed. Test chunk_metadata_direct_routes... Check ret == CIO_OK... failed.

🪛 markdownlint-cli2 (0.18.1)

CHUNKS.md

100-100: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: run-windows-unit-tests / call-build-windows-unit-test (Windows 64bit, x64, x64-windows-static, 3.31.6, windows-latest)
GitHub Check: run-windows-unit-tests / call-build-windows-unit-test (Windows 32bit, x86, x86-windows-static, 3.31.6, windows-latest)
GitHub Check: run-windows-unit-tests / call-build-windows-unit-test (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DC...

🔇 Additional comments (5)

lib/chunkio/src/cio_file.c (1)

1109-1109: LGTM: Correct data size used for content relocation.

The change to use cf->data_size instead of size when moving content is correct. When metadata space is sufficient (meta_av >= size), we need to relocate the existing content data (whose size is cf->data_size), not the new metadata size.

src/flb_input_log.c (1)

205-270: Direct routing persistence logic is well-structured.

The implementation correctly:

Pre-counts eligible routes before allocation

Allocates appropriately-sized buffer for output IDs

Casts output IDs to uint16_t matching the stored format (Line 248)

Calls flb_input_chunk_write_header_v2 to persist metadata

Frees allocated memory on all paths

Logs warnings on write failure without blocking the operation

src/flb_input_chunk.c (2)

65-171: Metadata view implementation is well-structured.

The new flb_input_chunk_meta_view structure and input_chunk_metadata_view function provide a clean abstraction for parsing chunk metadata. The code correctly:

Validates magic bytes

Extracts tag with proper null-terminator handling

Parses routing payload when flags indicate direct routes

Handles legacy format gracefully (flags = 0)

1075-1251: Direct routing API functions are correctly implemented.

The new API functions:

flb_input_chunk_write_header_v2: Properly encodes routing metadata with big-endian 16-bit IDs

flb_input_chunk_has_direct_routes: Validates flags and payload presence

flb_input_chunk_get_direct_routes: Decodes routing payload with proper bounds checking

Memory management, error handling, and boundary validation are all correct.

tests/internal/input_chunk_routes.c (1)

108-109: I've completed a thorough analysis of the code path for this test. Here's what I found:

Code Review Summary:
The test calls cio_chunk_up_force() on a freshly created chunk with proper initialization. The function chain is:

cio_chunk_up_force() → cio_file_up_force() → _cio_file_up(ch, CIO_FALSE)

_cio_file_up() opens the file with O_RDWR | O_CREAT, gets its size, and memory-maps it

For an empty file (size 0) with CIO_OPEN_RW flag (which is set), it calls cio_file_resize() to allocate space, then cio_file_native_map()

Findings:
The code logic appears sound for the normal case. The test setup correctly:

Deletes and recreates the test path

Creates a fresh context, stream, and chunk

Opens the chunk with proper flags

Potential Failure Points:
If cio_chunk_up_force() returns non-OK, it would be from:

cio_file_native_open() failure (file permissions or /tmp unavailable)

cio_file_resize() failure (disk space or permissions)

cio_file_native_map() failure (mmap syscall failure)

Limitation:
I cannot compile or execute the test in the sandbox environment to reproduce the failure. The issue appears to be CI environment-specific or caused by a recent code change affecting file operations.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

CHUNKS.md (1)
103-124: Add language identifier to fenced code block.

The ASCII diagram fenced code block should specify a language identifier (even if it's just text) to satisfy markdown linting rules and improve rendering consistency.

Apply this diff:
-```
+```text
 Metadata
                 --   +---------+-------+
As per coding guidelines (static analysis hint MD040).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9b7465 and d42742b.

📒 Files selected for processing (8)

CHUNKS.md (1 hunks)
include/fluent-bit/flb_input_chunk.h (3 hunks)
lib/chunkio/src/cio_file.c (1 hunks)
src/flb_input_chunk.c (8 hunks)
src/flb_input_log.c (2 hunks)
src/flb_task.c (4 hunks)
tests/internal/CMakeLists.txt (1 hunks)
tests/internal/input_chunk_routes.c (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

tests/internal/input_chunk_routes.c
tests/internal/CMakeLists.txt
lib/chunkio/src/cio_file.c

🧰 Additional context used

🧬 Code graph analysis (4)

src/flb_task.c (4)

src/flb_input_chunk.c (4)

flb_input_chunk_has_direct_routes (1298-1321)

flb_input_chunk_get_direct_routes (1323-1471)

flb_input_chunk_destroy_direct_routes (1473-1491)

flb_input_chunk_get_name (2725-2731)

include/fluent-bit/flb_mem.h (1)

flb_calloc (84-96)

src/flb_router_condition.c (1)

flb_router_chunk_context_destroy (187-190)

src/flb_input.c (1)

flb_input_name (790-797)

include/fluent-bit/flb_input_chunk.h (1)

src/flb_input_chunk.c (4)

flb_input_chunk_write_header_v2 (1119-1296)

flb_input_chunk_has_direct_routes (1298-1321)

flb_input_chunk_get_direct_routes (1323-1471)

flb_input_chunk_destroy_direct_routes (1473-1491)

src/flb_input_log.c (3)

src/flb_routes_mask.c (1)

flb_routes_mask_get_bit (125-140)

include/fluent-bit/flb_mem.h (2)

flb_calloc (84-96)

flb_free (126-128)

src/flb_input_chunk.c (2)

flb_input_chunk_write_header_v2 (1119-1296)

flb_input_chunk_get_name (2725-2731)

src/flb_input_chunk.c (2)

src/flb_routes_mask.c (2)

flb_routes_mask_set_bit (76-91)

flb_routes_mask_set_by_tag (31-67)

include/fluent-bit/flb_mem.h (2)

flb_calloc (84-96)

flb_free (126-128)

🪛 markdownlint-cli2 (0.18.1)

CHUNKS.md

103-103: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: PR - fuzzing test
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-without-cxx (3.31.6)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-centos-7

🔇 Additional comments (8)

src/flb_input_log.c (1)

207-289: Direct route persistence implementation looks solid.

The two-pass collection approach properly:

Counts routes before allocation to size the array correctly

Prefers output alias over name for stable labels

Bounds-checks label lengths against UINT16_MAX

Uses uint16_t for route IDs, avoiding the truncation issue from past reviews

Includes appropriate warning when persistence fails

Cleans up allocated memory in all paths

include/fluent-bit/flb_input_chunk.h (1)

28-135: LGTM! Clean API surface for direct route handling.

The new declarations properly:

Use uint16_t for route IDs (avoiding truncation)

Include stable textual labels alongside numeric IDs

Provide forward declaration for struct cio_chunk

Define consistent flag bits for metadata

Expose complete lifecycle management (write, check, get, destroy)

src/flb_task.c (2)

75-132: Excellent stable output resolution strategy.

The helper properly resolves outputs using a three-tier strategy:

Match by alias (if label present)

Match by name (if label present)

Fallback to numeric ID

This approach ensures chunks continue routing to the intended destination even when output ordering changes between restarts, addressing the concerns from previous reviews about output ID instability.

503-596: Robust direct route restoration with proper validation and fallback.

The restoration logic appropriately:

Pre-validates all stored routes before creating any task routes (lines 510-522)

Uses atomic semantics: either all routes resolve successfully or none are used

Falls back gracefully to configured routes when any stored route is missing (lines 567-582)

Provides actionable warning messages including both numeric ID and textual label (lines 568-574)

Cleans up allocated resources in all code paths (lines 554-560, 586-600)

Returns early after successful restoration to avoid duplicate route creation (lines 585-596)

This addresses the previous review concerns about validating stable identifiers and handling config drift.

src/flb_input_chunk.c (4)

65-171: Well-structured metadata view abstraction.

The new flb_input_chunk_meta_view structure and parsing function centralize metadata extraction logic, properly:

Distinguishing legacy format (no magic bytes) from extended format

Extracting tag with null-terminator detection (lines 136-139)

Parsing routing payload when direct-routes flag is set (lines 141-168)

Returning distinct codes for missing magic vs. incomplete routing data (0 vs. 1)

Avoiding buffer overruns with careful offset arithmetic (lines 154-164)

This provides a clean abstraction for callers like flb_input_chunk_get_tag.

1119-1296: Solid metadata writer with proper encoding and bounds checking.

The v2 header writer correctly:

Pre-computes label lengths and detects overflow (lines 1171-1195)

Validates total routing payload fits in uint16_t (lines 1204-1207)

Truncates tag if necessary to fit in available metadata space (lines 1210-1216)

Encodes all multi-byte fields in big-endian (lines 1252-1253, 1256-1257, 1261-1262, 1270-1271)

Sets flags byte appropriately based on label presence (lines 1242-1246)

Includes null terminator after tag (line 1249)

Properly cleans up on all error paths (lines 1222-1223, 1287-1289)

1323-1471: Thorough bounds checking in direct route reader.

The reader includes comprehensive validation:

Checks minimum routing payload size before proceeding (lines 1364-1370, 1380-1382)

Validates label section size when labels flag set (lines 1386-1392)

Checks each label length against remaining bytes (lines 1441-1445)

Allocates label buffers with space for null terminator (line 1447)

Null-terminates all labels (line 1458)

Returns distinct error codes: -1 for allocation failure, -2 for corrupt data

Cleans up partial results on any error (lines 1413, 1429, 1443, 1451)

This defensive parsing prevents buffer overruns when reading potentially corrupt or malicious metadata.

956-1038: Proper direct route integration with fallback to tag routing.

The loading logic in flb_input_chunk_map appropriately:

Attempts to load direct routes from metadata (line 961)

Pre-validates all routes resolve to existing outputs (lines 963-975)

Uses the same stable lookup strategy as task creation (alias → name → id)

Clears routes_mask and populates it atomically only when all routes valid (lines 977-992)

Warns with actionable details when any route missing (lines 995-1000, 1004-1006)

Falls back to tag-based routing when direct routes invalid or produce no routes (lines 1022-1038)

Cleans up allocated route structures in all paths (lines 1009-1012)

This ensures backlog chunks always route somewhere, even when config has changed.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

CHUNKS.md (1)
107-149: Add language identifier to the fenced diagram.

Please mark the code fence with a language (e.g., text) for proper rendering.

Apply:
-```
+```text
 ...
-```
+```

🧹 Nitpick comments (5)

CHUNKS.md (1)
68-71: Docs contradict new flags; update v1.9 paragraph.

This still says “the fourth byte is unused,” but below you introduce feature flags in that byte. Update this paragraph to reflect the flags usage and legacy behavior when the byte is zero.
- specifies the type of records the Chunk is storing, for Logs this value is 0x0 and for Metrics is 0x1. The four byte is unused for now.
+ specifies the type of records the Chunk is storing (e.g., Logs=0x0, Metrics=0x1). The fourth byte carries feature flags (0 when legacy layout is used).
src/flb_task.c (2)
76-209: Deduplicate output-resolution helper.

task_collect_output_references duplicates input_chunk_collect_output_references with near-identical logic. Extract a shared utility (e.g., in flb_router or flb_output) and call it from both sites to avoid drift.

684-691: Include plugin in warning for missing stored route (operator visibility).

Add stored plugin name (when present) to the warning to make config drift obvious.
-                flb_warn("[task] input=%s/%s stored direct route id=%u label=%.*s not found for chunk %s, falling back to configured routes",
+                flb_warn("[task] input=%s/%s stored direct route id=%u label=%.*s%s%s not found for chunk %s, falling back to configured routes",
                          i_ins->p->name,
                          flb_input_name(i_ins),
                          (unsigned int) missing_output_id,
                          (int) missing_output_label_length,
-                         missing_output_label ? missing_output_label : "",
+                         missing_output_label ? missing_output_label : "",
+                         direct_routes[direct_output_index].plugin_name ? " plugin=" : "",
+                         direct_routes[direct_output_index].plugin_name ? direct_routes[direct_output_index].plugin_name : "",
                          flb_input_chunk_get_name(task_ic));
tests/internal/input_chunk_routes.c (1)

368-597: Add a negative test for plugin-type mismatch.

Create a chunk with label “shared”, plugin “stdout”; then configure an output “shared” with plugin “elasticsearch” and assert it’s not matched (routes_mask bit remains 0). This guards the plugin-type validation path.

src/flb_input_chunk.c (1)

756-889: Unify output-resolution logic with task side.

input_chunk_collect_output_references and task_collect_output_references have duplicated code. Factor into a single shared helper to ensure identical behavior and easier maintenance.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1142c8 and dd92c0f.

📒 Files selected for processing (7)

CHUNKS.md (2 hunks)
include/fluent-bit/flb_input_chunk.h (3 hunks)
src/flb_input_chunk.c (7 hunks)
src/flb_input_log.c (2 hunks)
src/flb_task.c (6 hunks)
tests/internal/CMakeLists.txt (1 hunks)
tests/internal/input_chunk_routes.c (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/internal/CMakeLists.txt

🧰 Additional context used

🧠 Learnings (8)

📚 Learning: 2025-08-29T06:25:02.561Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.

Applied to files:

src/flb_task.c

📚 Learning: 2025-08-31T12:46:11.940Z

Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

src/flb_task.c

📚 Learning: 2025-08-29T06:25:27.250Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

src/flb_task.c

📚 Learning: 2025-08-29T06:24:26.170Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.

Applied to files:

src/flb_task.c

📚 Learning: 2025-08-29T06:24:44.797Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

src/flb_task.c

📚 Learning: 2025-08-29T06:24:55.855Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

src/flb_task.c

📚 Learning: 2025-08-29T06:25:27.250Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

src/flb_task.c

📚 Learning: 2025-10-06T17:39:21.606Z

Learnt from: edsiper
Repo: fluent/fluent-bit PR: 10967
File: plugins/in_elasticsearch/in_elasticsearch_bulk_prot.c:266-272
Timestamp: 2025-10-06T17:39:21.606Z
Learning: In the in_elasticsearch plugin, payloads are JSON-based, so tag_key values will always be strings. There's no need to handle binary (FLB_RA_BINARY) values when extracting tags from records in this plugin.

Applied to files:

CHUNKS.md

🧬 Code graph analysis (5)

tests/internal/input_chunk_routes.c (8)

src/flb_env.c (2)

flb_env_create (70-93)

flb_env_destroy (95-99)

src/flb_routes_mask.c (3)

flb_routes_mask_set_size (173-186)

flb_routes_mask_get_bit (125-140)

flb_routes_empty_mask_destroy (164-171)

src/flb_hash_table.c (2)

flb_hash_table_create (99-137)

flb_hash_table_destroy (197-215)

src/flb_input_chunk.c (7)

flb_input_chunk_write_header_v2 (1278-1573)

flb_input_chunk_has_direct_routes (1575-1598)

flb_input_chunk_get_direct_routes (1600-1905)

flb_input_chunk_destroy_direct_routes (1907-1928)

flb_input_chunk_get_tag (3212-3229)

flb_input_chunk_map (892-1204)

flb_input_chunk_destroy (2122-2224)

src/flb_input.c (2)

flb_input_instance_init (1091-1429)

flb_input_instance_exit (1520-1536)

src/flb_kv.c (1)

flb_kv_release (123-133)

src/flb_metrics.c (1)

flb_metrics_destroy (193-211)

src/flb_config_map.c (1)

flb_config_map_destroy (377-409)

src/flb_input_log.c (3)

src/flb_routes_mask.c (1)

flb_routes_mask_get_bit (125-140)

include/fluent-bit/flb_mem.h (2)

flb_calloc (84-96)

flb_free (126-128)

src/flb_input_chunk.c (2)

flb_input_chunk_write_header_v2 (1278-1573)

flb_input_chunk_get_name (3162-3168)

src/flb_task.c (4)

src/flb_input_chunk.c (5)

flb_chunk_route_plugin_matches (177-210)

flb_input_chunk_has_direct_routes (1575-1598)

flb_input_chunk_get_direct_routes (1600-1905)

flb_input_chunk_get_name (3162-3168)

flb_input_chunk_destroy_direct_routes (1907-1928)

include/fluent-bit/flb_mem.h (2)

flb_calloc (84-96)

flb_free (126-128)

src/flb_router_condition.c (1)

flb_router_chunk_context_destroy (187-190)

src/flb_input.c (1)

flb_input_name (790-797)

include/fluent-bit/flb_input_chunk.h (1)

src/flb_input_chunk.c (5)

flb_input_chunk_write_header_v2 (1278-1573)

flb_chunk_route_plugin_matches (177-210)

flb_input_chunk_has_direct_routes (1575-1598)

flb_input_chunk_get_direct_routes (1600-1905)

flb_input_chunk_destroy_direct_routes (1907-1928)

src/flb_input_chunk.c (2)

include/fluent-bit/flb_mem.h (2)

flb_calloc (84-96)

flb_free (126-128)

src/flb_routes_mask.c (2)

flb_routes_mask_set_bit (76-91)

flb_routes_mask_set_by_tag (31-67)

🪛 markdownlint-cli2 (0.18.1)

CHUNKS.md

107-107: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-centos-7
GitHub Check: pr-compile-without-cxx (3.31.6)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: PR - fuzzing test

🔇 Additional comments (5)

src/flb_input_log.c (1)

210-315: Direct-route persistence looks correct.

Counting, allocation, label/plugin capture, wide IDs, and header write are all handled with bounds checks and proper cleanup. LGTM.

src/flb_task.c (1)

579-713: Restore logic is safe and precise.

You validate labels first, plugin type via flb_chunk_route_plugin_matches, and only use numeric IDs when no label is stored. Early exit and cleanup paths look correct. LGTM.

src/flb_input_chunk.c (2)

92-171: Metadata view parsing is robust.

Magic check, tag terminator handling, bounds checks, and routing slice setup are correct, including partial/legacy cases. LGTM.

1278-1573: Header v2 writer looks solid.

Wide ID handling, payload sizing, tag truncation against routing, BE encoding, alias-flag bit packing, and error cleanup are correct. LGTM.

include/fluent-bit/flb_input_chunk.h (1)

48-66: Public API and flags look consistent.

Flags, structures, and prototypes align with the implementation and tests.

Also applies to: 136-148

Signed-off-by: Eduardo Silva <[email protected]>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

tests/internal/input_chunk_routes.c (3)
21-23: Consider using temporary directory APIs for better portability.

The hardcoded /tmp paths work for most Unix systems but could cause issues on systems where /tmp is not available or when running tests in parallel. Consider using platform-specific temporary directory functions or environment variables.

220-367: Consider adopting the goto cleanup pattern for consistency.

This test uses inline cleanup and early returns, while test_chunk_restore_alias_plugin_match_multiple and test_chunk_restore_alias_plugin_null_matches_all use a goto cleanup pattern. Using a consistent approach across all tests improves maintainability and reduces the risk of resource leaks if the test framework behavior changes.

68-68: Consider defining named constants for test parameters.

The magic numbers (64 for route mask size, 512 for hash table size, 1024 for chunk size) used throughout the tests would benefit from named constants that explain their purpose and make future adjustments easier.

For example:
#define TEST_ROUTE_MASK_SIZE 64
#define TEST_HASH_TABLE_SIZE 512
#define TEST_CHUNK_SIZE 1024
Also applies to: 98-98

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd92c0f and aced5ff.

📒 Files selected for processing (1)

tests/internal/input_chunk_routes.c (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/internal/input_chunk_routes.c (8)

src/flb_env.c (2)

flb_env_create (70-93)

flb_env_destroy (95-99)

src/flb_routes_mask.c (3)

flb_routes_mask_set_size (173-186)

flb_routes_mask_get_bit (125-140)

flb_routes_empty_mask_destroy (164-171)

src/flb_hash_table.c (2)

flb_hash_table_create (99-137)

flb_hash_table_destroy (197-215)

src/flb_input_chunk.c (7)

flb_input_chunk_write_header_v2 (1278-1573)

flb_input_chunk_has_direct_routes (1575-1598)

flb_input_chunk_get_direct_routes (1600-1905)

flb_input_chunk_destroy_direct_routes (1907-1928)

flb_input_chunk_get_tag (3212-3229)

flb_input_chunk_map (892-1204)

flb_input_chunk_destroy (2122-2224)

src/flb_input.c (2)

flb_input_instance_init (1091-1429)

flb_input_instance_exit (1520-1536)

src/flb_kv.c (1)

flb_kv_release (123-133)

src/flb_metrics.c (1)

flb_metrics_destroy (193-211)

src/flb_config_map.c (1)

flb_config_map_destroy (377-409)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
GitHub Check: pr-compile-without-cxx (3.31.6)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: PR - fuzzing test
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-centos-7

🔇 Additional comments (6)

tests/internal/input_chunk_routes.c (6)

25-50: LGTM!

The msgpack payload construction is clean and properly manages resources with msgpack_sbuffer_destroy.

52-131: LGTM!

The initialization logic properly handles the complex setup with cascading error cleanup for the four hash tables. The error path for flb_routes_mask_set_size correctly cleans up the environment.

133-172: LGTM!

The add/cleanup pair properly manages output instance allocation and cleanup, with correct error handling for the alias allocation.

174-218: LGTM!

The legacy metadata writing function correctly handles all event types and properly manages the temporary buffer allocation and cleanup.

839-844: LGTM!

The test registration correctly includes all three test functions with proper null termination.

479-480: ****

The code is correct as written. cmt_initialize() initializes platform-level atomic operation primitives (pthread mutex or critical sections) as one-time global state and requires no corresponding cleanup function. The function returns early if already initialized on generic implementations, and on GCC/Clang/MSVC 64-bit it's a no-op. Both test functions properly clean up their metric instances with cmt_destroy(in.cmt) in the cleanup section, which is the correct and only required cleanup for the cmetrics API.

Likely an incorrect or invalid review comment.

tests/internal/input_chunk_routes.c

Signed-off-by: Eduardo Silva <[email protected]>

edsiper requested review from fujimotos and koleini as code owners November 2, 2025 18:51

github-actions bot added the docs-required label Nov 2, 2025

edsiper temporarily deployed to pr November 2, 2025 18:52 — with GitHub Actions Inactive

chatgpt-codex-connector bot reviewed Nov 2, 2025

View reviewed changes

edsiper temporarily deployed to pr November 2, 2025 19:10 — with GitHub Actions Inactive

edsiper had a problem deploying to pr November 2, 2025 19:10 — with GitHub Actions Failure

coderabbitai bot reviewed Nov 2, 2025

View reviewed changes

edsiper force-pushed the storage-metadata-for-routing branch from a8b7f5d to c9b7465 Compare November 2, 2025 20:51

edsiper temporarily deployed to pr November 2, 2025 20:51 — with GitHub Actions Inactive

edsiper temporarily deployed to pr November 2, 2025 21:10 — with GitHub Actions Inactive

edsiper had a problem deploying to pr November 2, 2025 21:10 — with GitHub Actions Failure

chatgpt-codex-connector bot reviewed Nov 2, 2025

View reviewed changes

coderabbitai bot reviewed Nov 2, 2025

View reviewed changes

edsiper force-pushed the storage-metadata-for-routing branch from c9b7465 to d42742b Compare November 2, 2025 22:21

edsiper temporarily deployed to pr November 2, 2025 22:21 — with GitHub Actions Inactive

coderabbitai bot reviewed Nov 2, 2025

View reviewed changes

edsiper temporarily deployed to pr November 2, 2025 22:41 — with GitHub Actions Inactive

edsiper force-pushed the storage-metadata-for-routing branch from a1142c8 to dd92c0f Compare November 4, 2025 00:24

edsiper temporarily deployed to pr November 4, 2025 00:24 — with GitHub Actions Inactive

coderabbitai bot reviewed Nov 4, 2025

View reviewed changes

edsiper temporarily deployed to pr November 4, 2025 00:43 — with GitHub Actions Inactive

edsiper had a problem deploying to pr November 4, 2025 00:43 — with GitHub Actions Failure

edsiper temporarily deployed to pr November 4, 2025 00:43 — with GitHub Actions Inactive

edsiper had a problem deploying to pr November 4, 2025 01:16 — with GitHub Actions Failure

tests: internal: input_chunk_route: add missing cmt init

aced5ff

Signed-off-by: Eduardo Silva <[email protected]>

edsiper temporarily deployed to pr November 4, 2025 01:59 — with GitHub Actions Inactive

coderabbitai bot reviewed Nov 4, 2025

View reviewed changes

tests/internal/input_chunk_routes.c Show resolved Hide resolved

edsiper temporarily deployed to pr November 4, 2025 02:18 — with GitHub Actions Inactive

tests: internal: input_chunk_route: simplify cleanup logic

4e101e9

Signed-off-by: Eduardo Silva <[email protected]>

edsiper temporarily deployed to pr November 4, 2025 03:17 — with GitHub Actions Inactive

edsiper merged commit fb24892 into master Nov 4, 2025
43 of 44 checks passed

edsiper deleted the storage-metadata-for-routing branch November 4, 2025 03:18

edsiper temporarily deployed to pr November 4, 2025 03:36 — with GitHub Actions Inactive

coderabbitai bot mentioned this pull request Nov 4, 2025

router: extend router interface to support metrics (v3) #11108

Open

chunks: extend metadata to support routing information (direct routes) #11097

chunks: extend metadata to support routing information (direct routes) #11097

Uh oh!

Conversation

edsiper commented Nov 2, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Nov 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Nov 2, 2025

Choose a reason for hiding this comment

Uh oh!

edsiper commented Nov 2, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Nov 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

edsiper commented Nov 2, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 2, 2025 •

edited

Loading