Conversation
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/flb_mp.c (1)
1629-1631: Minor: signed/unsigned comparison in loop.The loop variable
iis declared asint(line 1535) butremove_countissize_t. While unlikely to cause issues in practice, consider usingsize_tforifor consistency.♻️ Suggested fix
- for (i = 0; i < remove_count; i++) { + for (index = 0; index < remove_count; index++) { + flb_mp_chunk_cobj_record_destroy(chunk_cobj, remove_list[index]); - flb_mp_chunk_cobj_record_destroy(chunk_cobj, remove_list[i]); }Note: You could reuse the existing
indexvariable (line 1538) instead ofifor this loop.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/flb_mp.c` around lines 1629 - 1631, The loop uses an int `i` to compare against a size_t `remove_count`, causing a signed/unsigned mismatch; update the loop to use a size_t for the iterator (or reuse the existing size_t `index` variable) when calling flb_mp_chunk_cobj_record_destroy(chunk_cobj, remove_list[i]) so the types are consistent and the signed/unsigned comparison is removed.tests/internal/processor.c (1)
504-507: Dead code in error handling block.Lines 505-507 contain unreachable code. The outer condition checks
owner_input.cmt == NULL, but then line 505 checksowner_input.cmt != NULLbefore callingcmt_destroy. The inner condition can never be true within this block.🧹 Proposed fix
TEST_CHECK(owner_input.cmt != NULL); if (owner_input.cmt == NULL) { - if (owner_input.cmt != NULL) { - cmt_destroy(owner_input.cmt); - } flb_processor_destroy(proc); flb_config_exit(config); flb_sds_destroy(regex_prop_key); return; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/internal/processor.c` around lines 504 - 507, The error-handling block contains dead code: when checking if owner_input.cmt == NULL you then test owner_input.cmt != NULL before calling cmt_destroy, which can never be true; fix by removing the unreachable inner condition and ensure proper semantics—either change the outer condition to if (owner_input.cmt != NULL) and call cmt_destroy(owner_input.cmt), or keep the outer NULL-check and omit the cmt_destroy call—update the branch around owner_input.cmt and the cmt_destroy call accordingly to reflect the intended cleanup path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/runtime/counter_parity_e2e.c`:
- Line 391: The flb_input_set/flb_filter_set/flb_output_set calls (e.g., the
call to flb_input_set(ctx, in_ffd, "tag", "test", NULL)) currently ignore return
values; update each such call (including the ones around lines referenced:
flb_input_set, flb_filter_set, flb_output_set) to check the returned status and
assert or fail the test when it indicates error. Specifically, capture the
return value of flb_input_set/flb_filter_set/flb_output_set, and if it is
non-zero (or indicates failure per the API), call the test failure/assertion
path (e.g., fail the test or use the test framework's assert) with a descriptive
message so a rejected property causes the test to abort rather than proceed with
defaults.
- Around line 92-105: The helper get_counter_value_1_or_zero currently treats
any non-zero return from get_counter_value_1 as a missing metric and coerces
*value to 0.0; change the error check so only the specific "metric not found"
code is accepted by replacing "if (ret != 0)" with "if (ret == -1)" so other
errors still propagate (i.e., leave other non -1 returns as failures). Make the
same change in the other equivalent helper functions in this file (the two other
occurrences mentioned) so only the -1 "metric not found" code is coerced to 0.0
and all other errors cause the test to fail.
---
Nitpick comments:
In `@src/flb_mp.c`:
- Around line 1629-1631: The loop uses an int `i` to compare against a size_t
`remove_count`, causing a signed/unsigned mismatch; update the loop to use a
size_t for the iterator (or reuse the existing size_t `index` variable) when
calling flb_mp_chunk_cobj_record_destroy(chunk_cobj, remove_list[i]) so the
types are consistent and the signed/unsigned comparison is removed.
In `@tests/internal/processor.c`:
- Around line 504-507: The error-handling block contains dead code: when
checking if owner_input.cmt == NULL you then test owner_input.cmt != NULL before
calling cmt_destroy, which can never be true; fix by removing the unreachable
inner condition and ensure proper semantics—either change the outer condition to
if (owner_input.cmt != NULL) and call cmt_destroy(owner_input.cmt), or keep the
outer NULL-check and omit the cmt_destroy call—update the branch around
owner_input.cmt and the cmt_destroy call accordingly to reflect the intended
cleanup path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 25be595b-105e-473c-9c51-78049f2f175a
📒 Files selected for processing (12)
include/fluent-bit/flb_mp.hinclude/fluent-bit/flb_mp_chunk.hinclude/fluent-bit/flb_output.hinclude/fluent-bit/flb_processor.hplugins/out_counter/counter.csrc/flb_lib.csrc/flb_mp.csrc/flb_processor.ctests/internal/processor.ctests/runtime/counter_parity_e2e.ctests/runtime/group_counter_semantics.ctests/runtime/out_counter.c
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
tests/runtime/counter_parity_e2e.c (2)
187-190: Redundant null-termination after snprintf.
snprintfalready guarantees null-termination whensize > 0, so the explicit null-termination on line 190 is unnecessary. However, this is harmless defensive coding.🔧 Optional simplification
- snprintf(stage_label, sizeof(stage_label) - 1, "%zu", pu->stage); - stage_label[sizeof(stage_label) - 1] = '\0'; + snprintf(stage_label, sizeof(stage_label), "%zu", pu->stage);The same applies to lines 210-211.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/runtime/counter_parity_e2e.c` around lines 187 - 190, snprintf already null-terminates when size > 0, so remove the redundant explicit assignment stage_label[sizeof(stage_label) - 1] = '\0'; after the snprintf call that formats pu->stage into stage_label; do the same for the other identical occurrences (the analogous buffer null-termination lines around the later snprintf calls) to simplify the code while keeping the snprintf size argument unchanged.
742-833: Timing-sensitive assertions may be flaky on slow CI systems.The test waits 1200ms after injection before checking counters. With a 200ms flush interval and retry backoff, slower systems might not complete the expected retry cycles in time.
Consider either:
- Increasing the sleep duration with some margin
- Adding a polling loop that waits for the expected counter values with a timeout
This is a minor concern since the current timing should work in most environments.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/runtime/counter_parity_e2e.c` around lines 742 - 833, The timing in flb_test_retry_scheduled_route_parity_grouped is brittle: the fixed flb_time_msleep(1200) may be too short on slow CI; replace it with either a longer sleep (e.g., increase 1200ms to a safer value such as 3000ms) or, preferably, implement a polling loop after inject_grouped_log_chunk that repeatedly calls get_counter_value_1/get_counter_value_1_or_zero/get_counter_value_1_or_zero/get_counter_value_2_or_zero (using the same o_ins->cmt_proc_records, o_ins->cmt_retries, o_ins->cmt_retried_records, ctx->config->router->logs_records_total and flb_input_name/flb_output_name) until the expected conditions (output_proc_records == 0.0, output_retries >= 1.0, output_retried_records >= 1.0, router_records == 0.0) are met or a timeout (e.g., 5s) elapses, then assert or fail; this makes the test resilient to slow machines while keeping logic around inject_grouped_log_chunk, flb_time_msleep, and the get_counter_value_* calls intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/flb_mp.c`:
- Around line 231-306: flb_mp_chunk_cobj_normalize_groups(chunk_cobj) return
value is ignored; update flb_mp_normalize_log_buffer_groups to capture its
return, and if it indicates failure (non-zero), perform the same cleanup
sequence used elsewhere (flb_mp_chunk_cobj_destroy,
flb_log_event_encoder_destroy, flb_log_event_decoder_destroy) and return -1 so
we don't proceed to flb_mp_chunk_cobj_encode with an unnormalized chunk; locate
the call to flb_mp_chunk_cobj_normalize_groups in
flb_mp_normalize_log_buffer_groups and add the conditional check and
cleanup/return.
In `@tests/internal/processor.c`:
- Around line 552-556: The cleanup code should avoid double-free when out_buf
aliases mp_buf: change the free logic so that out_buf is only flb_free'd if it
is non-NULL and not equal to mp_buf (i.e., if (out_buf != NULL && out_buf !=
mp_buf) flb_free(out_buf); out_buf = NULL;), then always flb_free(mp_buf); this
mirrors the guard used in processor_grouped_filter_counters and processor and
prevents freeing the same buffer twice after flb_processor_run returns the
original buffer.
---
Nitpick comments:
In `@tests/runtime/counter_parity_e2e.c`:
- Around line 187-190: snprintf already null-terminates when size > 0, so remove
the redundant explicit assignment stage_label[sizeof(stage_label) - 1] = '\0';
after the snprintf call that formats pu->stage into stage_label; do the same for
the other identical occurrences (the analogous buffer null-termination lines
around the later snprintf calls) to simplify the code while keeping the snprintf
size argument unchanged.
- Around line 742-833: The timing in
flb_test_retry_scheduled_route_parity_grouped is brittle: the fixed
flb_time_msleep(1200) may be too short on slow CI; replace it with either a
longer sleep (e.g., increase 1200ms to a safer value such as 3000ms) or,
preferably, implement a polling loop after inject_grouped_log_chunk that
repeatedly calls
get_counter_value_1/get_counter_value_1_or_zero/get_counter_value_1_or_zero/get_counter_value_2_or_zero
(using the same o_ins->cmt_proc_records, o_ins->cmt_retries,
o_ins->cmt_retried_records, ctx->config->router->logs_records_total and
flb_input_name/flb_output_name) until the expected conditions
(output_proc_records == 0.0, output_retries >= 1.0, output_retried_records >=
1.0, router_records == 0.0) are met or a timeout (e.g., 5s) elapses, then assert
or fail; this makes the test resilient to slow machines while keeping logic
around inject_grouped_log_chunk, flb_time_msleep, and the get_counter_value_*
calls intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8609c484-e4fe-47a4-8ea6-6b836a2699c9
📒 Files selected for processing (3)
src/flb_mp.ctests/internal/processor.ctests/runtime/counter_parity_e2e.c
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
6e77abb to
270cd24
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/internal/processor.c (1)
498-501: Redundant null-termination aftersnprintf.
snprintfalready guarantees null-termination within the specified buffer size. Line 501 is unnecessary.♻️ Minor simplification
memset(&owner_input, 0, sizeof(owner_input)); snprintf(owner_input.name, sizeof(owner_input.name) - 1, "%s", "unit_input.0"); - owner_input.name[sizeof(owner_input.name) - 1] = '\0';🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/internal/processor.c` around lines 498 - 501, Remove the redundant explicit null-termination after the snprintf call: the snprintf call that writes into owner_input.name already guarantees NUL-termination when given sizeof(owner_input.name) - 1 as the limit, so delete the line owner_input.name[sizeof(owner_input.name) - 1] = '\0'; and keep the memset and snprintf as-is to simplify the code around owner_input and owner_input.name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/runtime/group_counter_semantics.c`:
- Around line 795-806: The test leaks the processor if
flb_processor_unit_create(proc, ...) fails: when pu is NULL you must destroy the
previously created processor before returning. Update the error path after
flb_processor_unit_create to call flb_processor_destroy(proc) (referencing proc
and pu) prior to flb_destroy(ctx) and the return so the processor is properly
freed on the early exit.
---
Nitpick comments:
In `@tests/internal/processor.c`:
- Around line 498-501: Remove the redundant explicit null-termination after the
snprintf call: the snprintf call that writes into owner_input.name already
guarantees NUL-termination when given sizeof(owner_input.name) - 1 as the limit,
so delete the line owner_input.name[sizeof(owner_input.name) - 1] = '\0'; and
keep the memset and snprintf as-is to simplify the code around owner_input and
owner_input.name.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 89230ab5-786b-48bf-b455-d4a03b738ea1
📒 Files selected for processing (4)
src/flb_mp.ctests/internal/processor.ctests/runtime/counter_parity_e2e.ctests/runtime/group_counter_semantics.c
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/runtime/counter_parity_e2e.c
| proc = flb_processor_create(ctx->config, "unit_test", NULL, 0); | ||
| TEST_CHECK(proc != NULL); | ||
| if (proc == NULL) { | ||
| flb_destroy(ctx); | ||
| return; | ||
| } | ||
|
|
||
| pu = flb_processor_unit_create(proc, FLB_PROCESSOR_LOGS, "grep"); | ||
| TEST_CHECK(pu != NULL); | ||
| if (pu == NULL) { | ||
| flb_destroy(ctx); | ||
| return; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check if flb_destroy handles processors attached to outputs
# Search for processor cleanup in flb_destroy or flb_output cleanup paths
rg -n 'flb_processor_destroy|processor.*destroy' src/flb_output.c src/flb_lib.c src/flb_config.c --context=3Repository: fluent/fluent-bit
Length of output: 815
🏁 Script executed:
cd tests/runtime && cat -n group_counter_semantics.c | sed -n '795,820p'Repository: fluent/fluent-bit
Length of output: 978
🏁 Script executed:
# Check how processor is used in the test - does it get attached to output?
rg -A 10 'flb_processor_unit_create' tests/runtime/group_counter_semantics.cRepository: fluent/fluent-bit
Length of output: 392
🏁 Script executed:
# Verify what flb_destroy does with processors
rg -B 5 -A 15 'flb_destroy.*{' src/flb_lib.c | head -80Repository: fluent/fluent-bit
Length of output: 152
🏁 Script executed:
# Check if processor is registered with config/ctx on creation
rg -A 20 'flb_processor_create' src/flb_processor.c | head -40Repository: fluent/fluent-bit
Length of output: 693
🏁 Script executed:
# Check if flb_destroy handles all processors in config
rg -B 5 -A 20 'void flb_destroy' src/flb_lib.cRepository: fluent/fluent-bit
Length of output: 709
🏁 Script executed:
# Check what flb_config_exit does - does it clean up all processors?
rg -B 5 -A 30 'void flb_config_exit' src/flb_config.cRepository: fluent/fluent-bit
Length of output: 714
🏁 Script executed:
# Check if processors are registered in config->processors list or similar
rg -n 'config->processors' src/ | head -20Repository: fluent/fluent-bit
Length of output: 43
Processor leaked on early return at line 806.
If flb_processor_unit_create() fails at line 802, the processor created at line 795 is not destroyed before the early return at line 806. Processor cleanup only occurs when attached to an output (line 812), so returning before attachment leaves the processor unfreed. Call flb_processor_destroy(proc) before the early return at line 806.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/runtime/group_counter_semantics.c` around lines 795 - 806, The test
leaks the processor if flb_processor_unit_create(proc, ...) fails: when pu is
NULL you must destroy the previously created processor before returning. Update
the error path after flb_processor_unit_create to call
flb_processor_destroy(proc) (referencing proc and pu) prior to flb_destroy(ctx)
and the return so the processor is properly freed on the early exit.
This PR fixes metrics/counter correctness when grouped logs are present.
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Bug Fixes
Tests