-
Notifications
You must be signed in to change notification settings - Fork 1.8k
chunks: extend metadata to support routing information (direct routes) #11097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Caution Review failedThe pull request is closed. Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds persistent direct-route metadata to chunk headers, reader/writer APIs to store/parse routes (IDs, optional labels/plugins), task-time restoration that resolves outputs by id/label/alias/plugin, and multiple Chunk I/O robustness and test updates for mapping, metadata growth, and Windows behavior. Changes
Sequence Diagram(s)sequenceDiagram
participant Writer as Input Log Writer
participant ChunkFS as Chunk (fs-backed)
participant Task as flb_task_create
participant Outputs as Output Registry
Writer->>Writer: collect direct-route entries (id, label?, plugin?)
Writer->>ChunkFS: flb_input_chunk_write_header_v2(tag, routes[])
ChunkFS->>ChunkFS: persist flags + null-terminated tag + routing payload
Note right of Task: restart / map time
Task->>ChunkFS: flb_input_chunk_has_direct_routes?
alt direct routes present
ChunkFS-->>Task: yes
Task->>ChunkFS: flb_input_chunk_get_direct_routes -> routes[]
loop per route
Task->>Outputs: resolve by id or label/alias, check plugin match
Outputs-->>Task: output instance(s) or missing
end
Task->>Task: attach INACTIVE per-route entries for resolved outputs
else none/invalid
ChunkFS-->>Task: no/invalid
Task->>Task: fall back to tag-based routing
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
|
|
||
| if (direct_index < direct_count) { | ||
| direct_ids[direct_index] = (uint8_t) route_path->ins->id; | ||
| direct_index++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid truncating output IDs in direct-route metadata
When persisting direct routes the code stores each destination’s id in a uint8_t buffer (direct_ids[direct_index] = (uint8_t) route_path->ins->id). Output instance IDs are plain ints and are assigned sequentially without an 8‑bit limit, so deployments with 256+ outputs will wrap IDs when writing metadata. After a restart these truncated values can either fail to resolve or resolve to the wrong output, causing chunks to be routed incorrectly or silently fall back to default routing. Use a wider type (e.g., uint16_t or int) or validate that IDs fit in 8 bits before serializing.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
CHUNKS.md (1)
100-117: Add a language hint to the new fenced block.markdownlint (MD040) flags this new fence; adding a language like
textkeeps the pipeline green without changing the rendering.-``` +```text ... -``` +```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHUNKS.md(1 hunks)include/fluent-bit/flb_input_chunk.h(3 hunks)lib/chunkio/src/cio_file.c(1 hunks)src/flb_input_chunk.c(8 hunks)src/flb_input_log.c(2 hunks)src/flb_task.c(3 hunks)tests/internal/CMakeLists.txt(1 hunks)tests/internal/input_chunk_routes.c(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
src/flb_input_log.c (3)
src/flb_routes_mask.c (1)
flb_routes_mask_get_bit(125-140)src/flb_input_chunk.c (2)
flb_input_chunk_write_header_v2(1074-1151)flb_input_chunk_get_name(2469-2475)include/fluent-bit/flb_mem.h (1)
flb_free(126-128)
src/flb_task.c (4)
src/flb_input_chunk.c (3)
flb_input_chunk_has_direct_routes(1153-1176)flb_input_chunk_get_direct_routes(1178-1235)flb_input_chunk_get_name(2469-2475)include/fluent-bit/flb_mem.h (2)
flb_calloc(84-96)flb_free(126-128)src/flb_router_condition.c (1)
flb_router_chunk_context_destroy(187-190)src/flb_input.c (1)
flb_input_name(790-797)
include/fluent-bit/flb_input_chunk.h (1)
src/flb_input_chunk.c (3)
flb_input_chunk_write_header_v2(1074-1151)flb_input_chunk_has_direct_routes(1153-1176)flb_input_chunk_get_direct_routes(1178-1235)
src/flb_input_chunk.c (2)
src/flb_routes_mask.c (2)
flb_routes_mask_set_bit(76-91)flb_routes_mask_set_by_tag(31-67)include/fluent-bit/flb_mem.h (2)
flb_free(126-128)flb_calloc(84-96)
tests/internal/input_chunk_routes.c (2)
include/fluent-bit/flb_mem.h (1)
flb_free(126-128)src/flb_input_chunk.c (4)
flb_input_chunk_write_header_v2(1074-1151)flb_input_chunk_has_direct_routes(1153-1176)flb_input_chunk_get_direct_routes(1178-1235)flb_input_chunk_get_tag(2519-2536)
🪛 markdownlint-cli2 (0.18.1)
CHUNKS.md
100-100: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-without-cxx (3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: PR - fuzzing test
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-centos-7
🔇 Additional comments (1)
lib/chunkio/src/cio_file.c (1)
1109-1109: Good catch on preserving the full content region.Switching the relocation to copy
cf->data_sizebytes fixes the data truncation that happened when shrinking metadata layouts. Thanks for tightening this up.
| if (flb_input_chunk_has_direct_routes(task_ic) == FLB_TRUE) { | ||
| stored_routes_result = flb_input_chunk_get_direct_routes(task_ic, | ||
| &direct_output_ids, | ||
| &direct_output_count); | ||
| if (stored_routes_result == 0 && direct_output_count > 0) { | ||
| stored_routes_valid = FLB_TRUE; | ||
| missing_output_id = 0; | ||
| for (direct_output_index = 0; | ||
| direct_output_index < direct_output_count; | ||
| direct_output_index++) { | ||
| stored_output = task_find_output_by_id(config, | ||
| direct_output_ids[direct_output_index]); | ||
| if (!stored_output) { | ||
| stored_routes_valid = FLB_FALSE; | ||
| missing_output_id = direct_output_ids[direct_output_index]; | ||
| break; | ||
| } | ||
| } | ||
|
|
||
| if (stored_routes_valid == FLB_TRUE) { | ||
| direct_count = 0; | ||
| stored_routes_alloc_failed = FLB_FALSE; | ||
| for (direct_output_index = 0; | ||
| direct_output_index < direct_output_count; | ||
| direct_output_index++) { | ||
| stored_output = task_find_output_by_id(config, | ||
| direct_output_ids[direct_output_index]); | ||
| if (!stored_output) { | ||
| continue; | ||
| } | ||
|
|
||
| route = flb_calloc(1, sizeof(struct flb_task_route)); | ||
| if (!route) { | ||
| flb_errno(); | ||
| stored_routes_alloc_failed = FLB_TRUE; | ||
| break; | ||
| } | ||
|
|
||
| route->status = FLB_TASK_ROUTE_INACTIVE; | ||
| route->out = stored_output; | ||
| mk_list_add(&route->_head, &task->routes); | ||
| direct_count++; | ||
| } | ||
|
|
||
| if (stored_routes_alloc_failed == FLB_TRUE) { | ||
| if (router_context_initialized) { | ||
| flb_router_chunk_context_destroy(&router_context); | ||
| router_context_initialized = FLB_FALSE; | ||
| } | ||
| if (direct_output_ids) { | ||
| flb_free(direct_output_ids); | ||
| } | ||
| task->event_chunk->data = NULL; | ||
| flb_task_destroy(task, FLB_TRUE); | ||
| return NULL; | ||
| } | ||
|
|
||
| if (direct_count > 0) { | ||
| stored_routes_used = FLB_TRUE; | ||
| } | ||
| } | ||
| else { | ||
| flb_warn("[task] input=%s/%s stored direct route id=%u not found for chunk %s, falling back to configured routes", | ||
| i_ins->p->name, | ||
| flb_input_name(i_ins), | ||
| missing_output_id, | ||
| flb_input_chunk_get_name(task_ic)); | ||
| } | ||
| } | ||
| else if (stored_routes_result == -2) { | ||
| flb_warn("[task] input=%s/%s invalid stored direct routing metadata for chunk %s, falling back to configured routes", | ||
| i_ins->p->name, | ||
| flb_input_name(i_ins), | ||
| flb_input_chunk_get_name(task_ic)); | ||
| } | ||
| } | ||
|
|
||
| if (stored_routes_used == FLB_TRUE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stored direct route IDs need a stable key
Restoring the routes solely by matching the persisted uint8_t to o_ins->id assumes that output ids never change. In practice those ids are assigned sequentially when outputs are instantiated, so adding/removing/reordering outputs (a common config change between restarts) will shift the numbering. After such a change a backlog chunk will happily bind to whatever new output now owns the same numeric id, sending data to the wrong destination with no warning. We need to persist and validate a stable identifier (e.g., alias/name + plugin type hash) alongside the id, and refuse/directly fall back to normal routing when it no longer matches. Please extend the metadata format and the restore path accordingly (and cover the equivalent restoration in flb_input_chunk_map() too).
🤖 Prompt for AI Agents
In src/flb_task.c around lines 459-536, the current restore logic matches stored
direct routes only by persisted uint8_t output id which can change; update the
metadata to persist a stable key (e.g., output alias/name plus plugin-type hash)
alongside the numeric id, and change the restore loop to read both id and stable
key, locate the output by verifying the stable key matches the candidate output
(not just id), and if the stable key does not match treat it as a missing route
(log a warning and fall back to configured routes). Ensure backward
compatibility by continuing to accept legacy entries that only include numeric
id (but prefer failing safe if validation cannot be performed), update the code
paths that write route metadata to include the new stable key, mirror the same
validation logic in flb_input_chunk_map(), free any allocated memory on errors,
destroy router_context as before, and add a concise warning message when a
stored route is rejected due to stable-id mismatch so operators can detect
config drift.
a8b7f5d to
c9b7465
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/flb_task.c
Outdated
| static struct flb_output_instance *task_find_output_by_id(struct flb_config *config, | ||
| uint8_t id) | ||
| { | ||
| struct mk_list *head; | ||
| struct flb_output_instance *o_ins; | ||
|
|
||
| if (!config) { | ||
| return NULL; | ||
| } | ||
|
|
||
| mk_list_foreach(head, &config->outputs) { | ||
| o_ins = mk_list_entry(head, struct flb_output_instance, _head); | ||
| if (o_ins->id == id) { | ||
| return o_ins; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restore outputs using 8-bit IDs only
Direct-route metadata stores output IDs as 16-bit values (uint16_t), but the helper used during task recovery accepts a uint8_t. Any output whose ID exceeds 255 will be truncated when task_find_output_by_id searches for the persisted route, so chunks with valid stored routes will fail to bind to their original outputs and will fall back to tag routing (or be dropped) after restart. The lookup function should take at least a 16-bit or native int to match the stored metadata.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
src/flb_input_chunk.c (1)
919-994: ** Critical: Direct route restoration lacks stable output validation.**Lines 927-950 validate stored routes only by numeric output ID. Since IDs are reassigned sequentially on restart, config changes (reordering/replacing outputs) will cause backlog chunks to route to different destinations. The code must persist and validate a stable identifier (output alias/name + plugin type) alongside the numeric ID, rejecting routes when the stable identifier no longer matches.
Extend the routing metadata to include output stable identifiers, update the parse/validation logic to check them, and fall back to tag routing with a warning when validation fails. This mirrors the same issue in
src/flb_task.c(Lines 459-536).src/flb_task.c (2)
74-92: ** Critical: Output ID type mismatch causes truncation.**The helper accepts
uint8_t id(Line 75) but direct-route metadata stores 16-bit output IDs. Any output with ID ≥ 256 will be truncated during lookup, causing chunks to fail restoration or bind to the wrong output. Change the parameter to at leastuint16_torint.Apply this diff:
-static struct flb_output_instance *task_find_output_by_id(struct flb_config *config, - uint8_t id) +static struct flb_output_instance *task_find_output_by_id(struct flb_config *config, + uint16_t id) { struct mk_list *head; struct flb_output_instance *o_ins;
459-536: ** Critical: Stored routes lack stable output identifiers.**The restoration logic (Lines 469-476) validates outputs solely by numeric ID, which are reassigned sequentially each restart. Config changes that reorder/swap outputs will cause stored chunks to route to different destinations silently. The metadata must include a stable key (e.g., output alias/name + plugin type) and this code must verify it matches before accepting the stored route, falling back to tag routing on mismatch.
Extend the metadata format to persist output alias/name alongside IDs, update
flb_input_chunk_get_direct_routesto return stable identifiers, and add validation here that compares them. Log a warning when stored routes are rejected due to identifier mismatch.
🧹 Nitpick comments (1)
CHUNKS.md (1)
100-117: Specify language for fenced code block.The fenced code block should specify a language identifier for proper syntax highlighting.
Based on static analysis.
Apply this diff:
-``` +```text Metadata -- +---------+-------+
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHUNKS.md(1 hunks)include/fluent-bit/flb_input_chunk.h(3 hunks)lib/chunkio/src/cio_file.c(1 hunks)src/flb_input_chunk.c(8 hunks)src/flb_input_log.c(2 hunks)src/flb_task.c(3 hunks)tests/internal/CMakeLists.txt(1 hunks)tests/internal/input_chunk_routes.c(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- include/fluent-bit/flb_input_chunk.h
- tests/internal/CMakeLists.txt
🧰 Additional context used
🧬 Code graph analysis (4)
src/flb_input_log.c (3)
src/flb_routes_mask.c (1)
flb_routes_mask_get_bit(125-140)src/flb_input_chunk.c (2)
flb_input_chunk_write_header_v2(1075-1162)flb_input_chunk_get_name(2485-2491)include/fluent-bit/flb_mem.h (1)
flb_free(126-128)
src/flb_task.c (4)
src/flb_input_chunk.c (3)
flb_input_chunk_has_direct_routes(1164-1187)flb_input_chunk_get_direct_routes(1189-1251)flb_input_chunk_get_name(2485-2491)include/fluent-bit/flb_mem.h (2)
flb_calloc(84-96)flb_free(126-128)src/flb_router_condition.c (1)
flb_router_chunk_context_destroy(187-190)src/flb_input.c (1)
flb_input_name(790-797)
tests/internal/input_chunk_routes.c (2)
include/fluent-bit/flb_mem.h (1)
flb_free(126-128)src/flb_input_chunk.c (4)
flb_input_chunk_write_header_v2(1075-1162)flb_input_chunk_has_direct_routes(1164-1187)flb_input_chunk_get_direct_routes(1189-1251)flb_input_chunk_get_tag(2535-2552)
src/flb_input_chunk.c (2)
src/flb_routes_mask.c (2)
flb_routes_mask_set_bit(76-91)flb_routes_mask_set_by_tag(31-67)include/fluent-bit/flb_mem.h (2)
flb_free(126-128)flb_calloc(84-96)
🪛 GitHub Actions: Run unit tests
lib/chunkio/src/cio_file.c
[error] 1-1: Unit test suite reported failures: 1 of 169 tests failed (flb-it-input_chunk_routes).
tests/internal/input_chunk_routes.c
[error] 109-109: Test 'flb-it-input_chunk_routes' failed. Test chunk_metadata_direct_routes... Check ret == CIO_OK... failed.
🪛 markdownlint-cli2 (0.18.1)
CHUNKS.md
100-100: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run-windows-unit-tests / call-build-windows-unit-test (Windows 64bit, x64, x64-windows-static, 3.31.6, windows-latest)
- GitHub Check: run-windows-unit-tests / call-build-windows-unit-test (Windows 32bit, x86, x86-windows-static, 3.31.6, windows-latest)
- GitHub Check: run-windows-unit-tests / call-build-windows-unit-test (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DC...
🔇 Additional comments (5)
lib/chunkio/src/cio_file.c (1)
1109-1109: LGTM: Correct data size used for content relocation.The change to use
cf->data_sizeinstead ofsizewhen moving content is correct. When metadata space is sufficient (meta_av >= size), we need to relocate the existing content data (whose size iscf->data_size), not the new metadata size.src/flb_input_log.c (1)
205-270: Direct routing persistence logic is well-structured.The implementation correctly:
- Pre-counts eligible routes before allocation
- Allocates appropriately-sized buffer for output IDs
- Casts output IDs to
uint16_tmatching the stored format (Line 248)- Calls
flb_input_chunk_write_header_v2to persist metadata- Frees allocated memory on all paths
- Logs warnings on write failure without blocking the operation
src/flb_input_chunk.c (2)
65-171: Metadata view implementation is well-structured.The new
flb_input_chunk_meta_viewstructure andinput_chunk_metadata_viewfunction provide a clean abstraction for parsing chunk metadata. The code correctly:
- Validates magic bytes
- Extracts tag with proper null-terminator handling
- Parses routing payload when flags indicate direct routes
- Handles legacy format gracefully (flags = 0)
1075-1251: Direct routing API functions are correctly implemented.The new API functions:
flb_input_chunk_write_header_v2: Properly encodes routing metadata with big-endian 16-bit IDsflb_input_chunk_has_direct_routes: Validates flags and payload presenceflb_input_chunk_get_direct_routes: Decodes routing payload with proper bounds checkingMemory management, error handling, and boundary validation are all correct.
tests/internal/input_chunk_routes.c (1)
108-109: I've completed a thorough analysis of the code path for this test. Here's what I found:Code Review Summary:
The test callscio_chunk_up_force()on a freshly created chunk with proper initialization. The function chain is:
cio_chunk_up_force()→cio_file_up_force()→_cio_file_up(ch, CIO_FALSE)_cio_file_up()opens the file withO_RDWR | O_CREAT, gets its size, and memory-maps it- For an empty file (size 0) with
CIO_OPEN_RWflag (which is set), it callscio_file_resize()to allocate space, thencio_file_native_map()Findings:
The code logic appears sound for the normal case. The test setup correctly:
- Deletes and recreates the test path
- Creates a fresh context, stream, and chunk
- Opens the chunk with proper flags
Potential Failure Points:
Ifcio_chunk_up_force()returns non-OK, it would be from:
cio_file_native_open()failure (file permissions or /tmp unavailable)cio_file_resize()failure (disk space or permissions)cio_file_native_map()failure (mmap syscall failure)Limitation:
I cannot compile or execute the test in the sandbox environment to reproduce the failure. The issue appears to be CI environment-specific or caused by a recent code change affecting file operations.
c9b7465 to
d42742b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
CHUNKS.md (1)
103-124: Add language identifier to fenced code block.The ASCII diagram fenced code block should specify a language identifier (even if it's just
text) to satisfy markdown linting rules and improve rendering consistency.Apply this diff:
-``` +```text Metadata -- +---------+-------+As per coding guidelines (static analysis hint MD040).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHUNKS.md(1 hunks)include/fluent-bit/flb_input_chunk.h(3 hunks)lib/chunkio/src/cio_file.c(1 hunks)src/flb_input_chunk.c(8 hunks)src/flb_input_log.c(2 hunks)src/flb_task.c(4 hunks)tests/internal/CMakeLists.txt(1 hunks)tests/internal/input_chunk_routes.c(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/internal/input_chunk_routes.c
- tests/internal/CMakeLists.txt
- lib/chunkio/src/cio_file.c
🧰 Additional context used
🧬 Code graph analysis (4)
src/flb_task.c (4)
src/flb_input_chunk.c (4)
flb_input_chunk_has_direct_routes(1298-1321)flb_input_chunk_get_direct_routes(1323-1471)flb_input_chunk_destroy_direct_routes(1473-1491)flb_input_chunk_get_name(2725-2731)include/fluent-bit/flb_mem.h (1)
flb_calloc(84-96)src/flb_router_condition.c (1)
flb_router_chunk_context_destroy(187-190)src/flb_input.c (1)
flb_input_name(790-797)
include/fluent-bit/flb_input_chunk.h (1)
src/flb_input_chunk.c (4)
flb_input_chunk_write_header_v2(1119-1296)flb_input_chunk_has_direct_routes(1298-1321)flb_input_chunk_get_direct_routes(1323-1471)flb_input_chunk_destroy_direct_routes(1473-1491)
src/flb_input_log.c (3)
src/flb_routes_mask.c (1)
flb_routes_mask_get_bit(125-140)include/fluent-bit/flb_mem.h (2)
flb_calloc(84-96)flb_free(126-128)src/flb_input_chunk.c (2)
flb_input_chunk_write_header_v2(1119-1296)flb_input_chunk_get_name(2725-2731)
src/flb_input_chunk.c (2)
src/flb_routes_mask.c (2)
flb_routes_mask_set_bit(76-91)flb_routes_mask_set_by_tag(31-67)include/fluent-bit/flb_mem.h (2)
flb_calloc(84-96)flb_free(126-128)
🪛 markdownlint-cli2 (0.18.1)
CHUNKS.md
103-103: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: PR - fuzzing test
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-without-cxx (3.31.6)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-centos-7
🔇 Additional comments (8)
src/flb_input_log.c (1)
207-289: Direct route persistence implementation looks solid.The two-pass collection approach properly:
- Counts routes before allocation to size the array correctly
- Prefers output alias over name for stable labels
- Bounds-checks label lengths against
UINT16_MAX- Uses
uint16_tfor route IDs, avoiding the truncation issue from past reviews- Includes appropriate warning when persistence fails
- Cleans up allocated memory in all paths
include/fluent-bit/flb_input_chunk.h (1)
28-135: LGTM! Clean API surface for direct route handling.The new declarations properly:
- Use
uint16_tfor route IDs (avoiding truncation)- Include stable textual labels alongside numeric IDs
- Provide forward declaration for
struct cio_chunk- Define consistent flag bits for metadata
- Expose complete lifecycle management (write, check, get, destroy)
src/flb_task.c (2)
75-132: Excellent stable output resolution strategy.The helper properly resolves outputs using a three-tier strategy:
- Match by alias (if label present)
- Match by name (if label present)
- Fallback to numeric ID
This approach ensures chunks continue routing to the intended destination even when output ordering changes between restarts, addressing the concerns from previous reviews about output ID instability.
503-596: Robust direct route restoration with proper validation and fallback.The restoration logic appropriately:
- Pre-validates all stored routes before creating any task routes (lines 510-522)
- Uses atomic semantics: either all routes resolve successfully or none are used
- Falls back gracefully to configured routes when any stored route is missing (lines 567-582)
- Provides actionable warning messages including both numeric ID and textual label (lines 568-574)
- Cleans up allocated resources in all code paths (lines 554-560, 586-600)
- Returns early after successful restoration to avoid duplicate route creation (lines 585-596)
This addresses the previous review concerns about validating stable identifiers and handling config drift.
src/flb_input_chunk.c (4)
65-171: Well-structured metadata view abstraction.The new
flb_input_chunk_meta_viewstructure and parsing function centralize metadata extraction logic, properly:
- Distinguishing legacy format (no magic bytes) from extended format
- Extracting tag with null-terminator detection (lines 136-139)
- Parsing routing payload when direct-routes flag is set (lines 141-168)
- Returning distinct codes for missing magic vs. incomplete routing data (0 vs. 1)
- Avoiding buffer overruns with careful offset arithmetic (lines 154-164)
This provides a clean abstraction for callers like
flb_input_chunk_get_tag.
1119-1296: Solid metadata writer with proper encoding and bounds checking.The v2 header writer correctly:
- Pre-computes label lengths and detects overflow (lines 1171-1195)
- Validates total routing payload fits in
uint16_t(lines 1204-1207)- Truncates tag if necessary to fit in available metadata space (lines 1210-1216)
- Encodes all multi-byte fields in big-endian (lines 1252-1253, 1256-1257, 1261-1262, 1270-1271)
- Sets flags byte appropriately based on label presence (lines 1242-1246)
- Includes null terminator after tag (line 1249)
- Properly cleans up on all error paths (lines 1222-1223, 1287-1289)
1323-1471: Thorough bounds checking in direct route reader.The reader includes comprehensive validation:
- Checks minimum routing payload size before proceeding (lines 1364-1370, 1380-1382)
- Validates label section size when labels flag set (lines 1386-1392)
- Checks each label length against remaining bytes (lines 1441-1445)
- Allocates label buffers with space for null terminator (line 1447)
- Null-terminates all labels (line 1458)
- Returns distinct error codes: -1 for allocation failure, -2 for corrupt data
- Cleans up partial results on any error (lines 1413, 1429, 1443, 1451)
This defensive parsing prevents buffer overruns when reading potentially corrupt or malicious metadata.
956-1038: Proper direct route integration with fallback to tag routing.The loading logic in
flb_input_chunk_mapappropriately:
- Attempts to load direct routes from metadata (line 961)
- Pre-validates all routes resolve to existing outputs (lines 963-975)
- Uses the same stable lookup strategy as task creation (alias → name → id)
- Clears routes_mask and populates it atomically only when all routes valid (lines 977-992)
- Warns with actionable details when any route missing (lines 995-1000, 1004-1006)
- Falls back to tag-based routing when direct routes invalid or produce no routes (lines 1022-1038)
- Cleans up allocated route structures in all paths (lines 1009-1012)
This ensures backlog chunks always route somewhere, even when config has changed.
a1142c8 to
dd92c0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
CHUNKS.md (1)
107-149: Add language identifier to the fenced diagram.Please mark the code fence with a language (e.g., text) for proper rendering.
Apply:
-``` +```text ... -``` +```
🧹 Nitpick comments (5)
CHUNKS.md (1)
68-71: Docs contradict new flags; update v1.9 paragraph.This still says “the fourth byte is unused,” but below you introduce feature flags in that byte. Update this paragraph to reflect the flags usage and legacy behavior when the byte is zero.
- specifies the type of records the Chunk is storing, for Logs this value is 0x0 and for Metrics is 0x1. The four byte is unused for now. + specifies the type of records the Chunk is storing (e.g., Logs=0x0, Metrics=0x1). The fourth byte carries feature flags (0 when legacy layout is used).src/flb_task.c (2)
76-209: Deduplicate output-resolution helper.task_collect_output_references duplicates input_chunk_collect_output_references with near-identical logic. Extract a shared utility (e.g., in flb_router or flb_output) and call it from both sites to avoid drift.
684-691: Include plugin in warning for missing stored route (operator visibility).Add stored plugin name (when present) to the warning to make config drift obvious.
- flb_warn("[task] input=%s/%s stored direct route id=%u label=%.*s not found for chunk %s, falling back to configured routes", + flb_warn("[task] input=%s/%s stored direct route id=%u label=%.*s%s%s not found for chunk %s, falling back to configured routes", i_ins->p->name, flb_input_name(i_ins), (unsigned int) missing_output_id, (int) missing_output_label_length, - missing_output_label ? missing_output_label : "", + missing_output_label ? missing_output_label : "", + direct_routes[direct_output_index].plugin_name ? " plugin=" : "", + direct_routes[direct_output_index].plugin_name ? direct_routes[direct_output_index].plugin_name : "", flb_input_chunk_get_name(task_ic));tests/internal/input_chunk_routes.c (1)
368-597: Add a negative test for plugin-type mismatch.Create a chunk with label “shared”, plugin “stdout”; then configure an output “shared” with plugin “elasticsearch” and assert it’s not matched (routes_mask bit remains 0). This guards the plugin-type validation path.
src/flb_input_chunk.c (1)
756-889: Unify output-resolution logic with task side.input_chunk_collect_output_references and task_collect_output_references have duplicated code. Factor into a single shared helper to ensure identical behavior and easier maintenance.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
CHUNKS.md(2 hunks)include/fluent-bit/flb_input_chunk.h(3 hunks)src/flb_input_chunk.c(7 hunks)src/flb_input_log.c(2 hunks)src/flb_task.c(6 hunks)tests/internal/CMakeLists.txt(1 hunks)tests/internal/input_chunk_routes.c(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/internal/CMakeLists.txt
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-08-29T06:25:02.561Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.
Applied to files:
src/flb_task.c
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
Applied to files:
src/flb_task.c
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.
Applied to files:
src/flb_task.c
📚 Learning: 2025-08-29T06:24:26.170Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.
Applied to files:
src/flb_task.c
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.
Applied to files:
src/flb_task.c
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.
Applied to files:
src/flb_task.c
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.
Applied to files:
src/flb_task.c
📚 Learning: 2025-10-06T17:39:21.606Z
Learnt from: edsiper
Repo: fluent/fluent-bit PR: 10967
File: plugins/in_elasticsearch/in_elasticsearch_bulk_prot.c:266-272
Timestamp: 2025-10-06T17:39:21.606Z
Learning: In the in_elasticsearch plugin, payloads are JSON-based, so tag_key values will always be strings. There's no need to handle binary (FLB_RA_BINARY) values when extracting tags from records in this plugin.
Applied to files:
CHUNKS.md
🧬 Code graph analysis (5)
tests/internal/input_chunk_routes.c (8)
src/flb_env.c (2)
flb_env_create(70-93)flb_env_destroy(95-99)src/flb_routes_mask.c (3)
flb_routes_mask_set_size(173-186)flb_routes_mask_get_bit(125-140)flb_routes_empty_mask_destroy(164-171)src/flb_hash_table.c (2)
flb_hash_table_create(99-137)flb_hash_table_destroy(197-215)src/flb_input_chunk.c (7)
flb_input_chunk_write_header_v2(1278-1573)flb_input_chunk_has_direct_routes(1575-1598)flb_input_chunk_get_direct_routes(1600-1905)flb_input_chunk_destroy_direct_routes(1907-1928)flb_input_chunk_get_tag(3212-3229)flb_input_chunk_map(892-1204)flb_input_chunk_destroy(2122-2224)src/flb_input.c (2)
flb_input_instance_init(1091-1429)flb_input_instance_exit(1520-1536)src/flb_kv.c (1)
flb_kv_release(123-133)src/flb_metrics.c (1)
flb_metrics_destroy(193-211)src/flb_config_map.c (1)
flb_config_map_destroy(377-409)
src/flb_input_log.c (3)
src/flb_routes_mask.c (1)
flb_routes_mask_get_bit(125-140)include/fluent-bit/flb_mem.h (2)
flb_calloc(84-96)flb_free(126-128)src/flb_input_chunk.c (2)
flb_input_chunk_write_header_v2(1278-1573)flb_input_chunk_get_name(3162-3168)
src/flb_task.c (4)
src/flb_input_chunk.c (5)
flb_chunk_route_plugin_matches(177-210)flb_input_chunk_has_direct_routes(1575-1598)flb_input_chunk_get_direct_routes(1600-1905)flb_input_chunk_get_name(3162-3168)flb_input_chunk_destroy_direct_routes(1907-1928)include/fluent-bit/flb_mem.h (2)
flb_calloc(84-96)flb_free(126-128)src/flb_router_condition.c (1)
flb_router_chunk_context_destroy(187-190)src/flb_input.c (1)
flb_input_name(790-797)
include/fluent-bit/flb_input_chunk.h (1)
src/flb_input_chunk.c (5)
flb_input_chunk_write_header_v2(1278-1573)flb_chunk_route_plugin_matches(177-210)flb_input_chunk_has_direct_routes(1575-1598)flb_input_chunk_get_direct_routes(1600-1905)flb_input_chunk_destroy_direct_routes(1907-1928)
src/flb_input_chunk.c (2)
include/fluent-bit/flb_mem.h (2)
flb_calloc(84-96)flb_free(126-128)src/flb_routes_mask.c (2)
flb_routes_mask_set_bit(76-91)flb_routes_mask_set_by_tag(31-67)
🪛 markdownlint-cli2 (0.18.1)
CHUNKS.md
107-107: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-centos-7
- GitHub Check: pr-compile-without-cxx (3.31.6)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: PR - fuzzing test
🔇 Additional comments (5)
src/flb_input_log.c (1)
210-315: Direct-route persistence looks correct.Counting, allocation, label/plugin capture, wide IDs, and header write are all handled with bounds checks and proper cleanup. LGTM.
src/flb_task.c (1)
579-713: Restore logic is safe and precise.You validate labels first, plugin type via flb_chunk_route_plugin_matches, and only use numeric IDs when no label is stored. Early exit and cleanup paths look correct. LGTM.
src/flb_input_chunk.c (2)
92-171: Metadata view parsing is robust.Magic check, tag terminator handling, bounds checks, and routing slice setup are correct, including partial/legacy cases. LGTM.
1278-1573: Header v2 writer looks solid.Wide ID handling, payload sizing, tag truncation against routing, BE encoding, alias-flag bit packing, and error cleanup are correct. LGTM.
include/fluent-bit/flb_input_chunk.h (1)
48-66: Public API and flags look consistent.Flags, structures, and prototypes align with the implementation and tests.
Also applies to: 136-148
Signed-off-by: Eduardo Silva <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
tests/internal/input_chunk_routes.c (3)
21-23: Consider using temporary directory APIs for better portability.The hardcoded
/tmppaths work for most Unix systems but could cause issues on systems where/tmpis not available or when running tests in parallel. Consider using platform-specific temporary directory functions or environment variables.
220-367: Consider adopting the goto cleanup pattern for consistency.This test uses inline cleanup and early returns, while
test_chunk_restore_alias_plugin_match_multipleandtest_chunk_restore_alias_plugin_null_matches_alluse a goto cleanup pattern. Using a consistent approach across all tests improves maintainability and reduces the risk of resource leaks if the test framework behavior changes.
68-68: Consider defining named constants for test parameters.The magic numbers (64 for route mask size, 512 for hash table size, 1024 for chunk size) used throughout the tests would benefit from named constants that explain their purpose and make future adjustments easier.
For example:
#define TEST_ROUTE_MASK_SIZE 64 #define TEST_HASH_TABLE_SIZE 512 #define TEST_CHUNK_SIZE 1024Also applies to: 98-98
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/internal/input_chunk_routes.c(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/internal/input_chunk_routes.c (8)
src/flb_env.c (2)
flb_env_create(70-93)flb_env_destroy(95-99)src/flb_routes_mask.c (3)
flb_routes_mask_set_size(173-186)flb_routes_mask_get_bit(125-140)flb_routes_empty_mask_destroy(164-171)src/flb_hash_table.c (2)
flb_hash_table_create(99-137)flb_hash_table_destroy(197-215)src/flb_input_chunk.c (7)
flb_input_chunk_write_header_v2(1278-1573)flb_input_chunk_has_direct_routes(1575-1598)flb_input_chunk_get_direct_routes(1600-1905)flb_input_chunk_destroy_direct_routes(1907-1928)flb_input_chunk_get_tag(3212-3229)flb_input_chunk_map(892-1204)flb_input_chunk_destroy(2122-2224)src/flb_input.c (2)
flb_input_instance_init(1091-1429)flb_input_instance_exit(1520-1536)src/flb_kv.c (1)
flb_kv_release(123-133)src/flb_metrics.c (1)
flb_metrics_destroy(193-211)src/flb_config_map.c (1)
flb_config_map_destroy(377-409)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-without-cxx (3.31.6)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: PR - fuzzing test
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-centos-7
🔇 Additional comments (6)
tests/internal/input_chunk_routes.c (6)
25-50: LGTM!The msgpack payload construction is clean and properly manages resources with msgpack_sbuffer_destroy.
52-131: LGTM!The initialization logic properly handles the complex setup with cascading error cleanup for the four hash tables. The error path for
flb_routes_mask_set_sizecorrectly cleans up the environment.
133-172: LGTM!The add/cleanup pair properly manages output instance allocation and cleanup, with correct error handling for the alias allocation.
174-218: LGTM!The legacy metadata writing function correctly handles all event types and properly manages the temporary buffer allocation and cleanup.
839-844: LGTM!The test registration correctly includes all three test functions with proper null termination.
479-480: ****The code is correct as written.
cmt_initialize()initializes platform-level atomic operation primitives (pthread mutex or critical sections) as one-time global state and requires no corresponding cleanup function. The function returns early if already initialized on generic implementations, and on GCC/Clang/MSVC 64-bit it's a no-op. Both test functions properly clean up their metric instances withcmt_destroy(in.cmt)in the cleanup section, which is the correct and only required cleanup for the cmetrics API.Likely an incorrect or invalid review comment.
Signed-off-by: Eduardo Silva <[email protected]>
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Documentation
Improvements
Tests