-
Notifications
You must be signed in to change notification settings - Fork 1.8k
config: multiline: in_tail: filter_multiline: Add configurable buffer limit for multiline interface #10653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
config: multiline: in_tail: filter_multiline: Add configurable buffer limit for multiline interface #10653
Conversation
578f46a
to
3a8abd9
Compare
WalkthroughAdds a configurable multiline buffer limit and binary-size parser; threads a FLB_MULTILINE_TRUNCATED status through multiline processing, enforces per-group buffer limits on append, records truncation via metrics/logs (filter and tail), updates parser creation API, and adds tests for truncation and size parsing. Changes
Sequence Diagram(s)sequenceDiagram
participant Config
participant Tail as TailInput
participant ML as MultilineCore
participant Group as StreamGroup
participant Filter
participant Metrics
Config->>ML: init buffer_limit (string -> bytes)
Tail->>ML: flb_ml_append_text/append_object(data)
ML->>Group: flb_ml_group_cat(data,len)
alt appended fully
Group-->>ML: FLB_MULTILINE_OK
ML->>Filter: emit / process
Filter->>Metrics: inc emitted metric
else truncated or partially appended
Group-->>ML: FLB_MULTILINE_TRUNCATED
ML->>Tail: log warning
ML->>Metrics: inc truncated metric
ML->>Group: mark truncated flag
end
ML->>Group: flush -> include "multiline_truncated": true if set
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
3a8abd9
to
dec20f1
Compare
/* Return codes */ | ||
#define FLB_MULTILINE_OK 0 | ||
#define FLB_MULTILINE_PROCESSED 1 /* Reserved */ | ||
#define FLB_MULTILINE_TRUNCATED 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing to status code 2 is needed because status code 1 will be collided for FLB_TRUE status.
Signed-off-by: Hiroshi Hatake <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
2b4bdf1
to
3575cca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
plugins/filter_multiline/ml.c (2)
387-391
: Typo in metric help string (“occurence” → “occurrences”).Correct the help text.
ctx->cmt_truncated = cmt_counter_create(ins->cmt, "fluentbit", "filter", "emit_truncated_total", - "Total number of truncated occurence of multiline", + "Total number of truncated multiline occurrences", 1, (char *[]) {"name"});
914-925
: Avoid potential double-counting of TRUNCATED (buffered path).If truncation is also surfaced during later flush or other paths, this may count twice. Please confirm invariants that FLB_MULTILINE_TRUNCATED is signaled exactly once per truncated message, per stream.
You can grep other increments to validate uniqueness:
#!/bin/bash rg -nC2 -e 'FLB_MULTILINE_METRIC_TRUNCATED|emit_truncated_total|cmt_truncated|FLB_MULTILINE_TRUNCATED' --type=c
🧹 Nitpick comments (2)
plugins/filter_multiline/ml.c (2)
807-810
: Keep label name const; cast only at call sites.Avoid casting away const on the pointer variable; declare as const char* and cast per API call.
#ifdef FLB_HAVE_METRICS - uint64_t ts; - char *name; + uint64_t ts; + const char *name; #endif- name = (char *) flb_filter_name(ctx->ins); + name = flb_filter_name(ctx->ins); ts = cfl_time_now(); - cmt_counter_inc(ctx->cmt_truncated, ts, 1, (char *[]) {name}); + cmt_counter_inc(ctx->cmt_truncated, ts, 1, (char *[]) {(char *) name});- name = (char *) flb_filter_name(ctx->ins); + name = flb_filter_name(ctx->ins); ts = cfl_time_now(); - cmt_counter_inc(ctx->cmt_truncated, ts, 1, (char *[]) {name}); + cmt_counter_inc(ctx->cmt_truncated, ts, 1, (char *[]) {(char *) name});- name = (char *) flb_filter_name(ctx->ins); + name = flb_filter_name(ctx->ins); ts = cfl_time_now(); - cmt_counter_inc(ctx->cmt_emitted, ts, 1, (char *[]) {name}); + cmt_counter_inc(ctx->cmt_emitted, ts, 1, (char *[]) {(char *) name});Also applies to: 855-858, 918-921, 932-935
926-939
: Streamline conditional ordering for readability (buffered path).Check OK first, then fall back to debug; same behavior, clearer flow.
- else if (ret != FLB_MULTILINE_OK) { - flb_plg_debug(ctx->ins, - "could not append object from tag: %s", tag); - } - else if (ret == FLB_MULTILINE_OK) { + else if (ret == FLB_MULTILINE_OK) { #ifdef FLB_HAVE_METRICS name = (char *) flb_filter_name(ctx->ins); ts = cfl_time_now(); cmt_counter_inc(ctx->cmt_emitted, ts, 1, (char *[]) {name}); /* old api */ flb_metrics_sum(FLB_MULTILINE_METRIC_EMITTED, 1, ctx->ins->metrics); #endif - } + } + else { + flb_plg_debug(ctx->ins, + "could not append object from tag: %s", tag); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
plugins/filter_multiline/ml.c
(4 hunks)src/multiline/flb_ml_rule.c
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/multiline/flb_ml_rule.c
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/filter_multiline/ml.c (4)
include/fluent-bit/flb_mem.h (1)
flb_free
(126-128)src/flb_metrics.c (2)
flb_metrics_add
(129-178)flb_metrics_sum
(180-191)lib/cmetrics/src/cmt_counter.c (2)
cmt_counter_create
(26-81)cmt_counter_inc
(100-117)src/flb_filter.c (1)
flb_filter_name
(490-497)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: PR - fuzzing test
- GitHub Check: pr-compile-centos-7
🔇 Additional comments (2)
plugins/filter_multiline/ml.c (2)
851-862
: Good: warn + metrics on truncation (unbuffered path).The warning plus TRUNCATED metrics update looks correct and matches the new status propagation.
930-939
: Confirm EMITTED metric scope (buffered-only).EMITTED is only created/used when buffered; unbuffered path doesn’t increment it. If parity across modes is desired, mirror it for the unbuffered flow; otherwise, document that emit_records_total reflects buffered emissions only.
if (!ctx->cmt_emitted) { | ||
flb_errno(); | ||
flb_free(ctx); | ||
|
||
return -1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix early-exit leaks when cmt_emitted creation fails.
If cmt_counter_create() fails, the emitter instance created earlier is not torn down. Clean up ins_emitter before freeing ctx to avoid leaking an input instance.
if (!ctx->cmt_emitted) {
flb_errno();
- flb_free(ctx);
-
- return -1;
+ /* unwind emitter created earlier */
+ if (ctx->ins_emitter) {
+ flb_input_instance_exit(ctx->ins_emitter, ctx->config);
+ flb_input_instance_destroy(ctx->ins_emitter);
+ ctx->ins_emitter = NULL;
+ }
+ flb_free(ctx);
+ return -1;
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (!ctx->cmt_emitted) { | |
flb_errno(); | |
flb_free(ctx); | |
return -1; | |
} | |
if (!ctx->cmt_emitted) { | |
flb_errno(); | |
/* unwind emitter created earlier */ | |
if (ctx->ins_emitter) { | |
flb_input_instance_exit(ctx->ins_emitter, ctx->config); | |
flb_input_instance_destroy(ctx->ins_emitter); | |
ctx->ins_emitter = NULL; | |
} | |
flb_free(ctx); | |
return -1; | |
} |
🤖 Prompt for AI Agents
In plugins/filter_multiline/ml.c around lines 372 to 377, the code returns early
when cmt_emitted creation fails but never tears down the previously created
emitter instance, leaking an input instance; before flb_free(ctx) call, destroy
the emitter instance (e.g. call flb_input_instance_destroy(ctx->ins_emitter) or
the repo's matching teardown function), set ctx->ins_emitter = NULL, then free
ctx and return -1.
if (!ctx->cmt_truncated) { | ||
flb_errno(); | ||
flb_free(ctx); | ||
|
||
return -1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix early-exit leaks when cmt_truncated creation fails (both modes).
Failure here can leak either the emitter (buffered mode) or the msgpack sbuffer (unbuffered mode). Unwind before returning.
if (!ctx->cmt_truncated) {
flb_errno();
- flb_free(ctx);
-
- return -1;
+ /* unwind resources allocated before metrics registration */
+ if (ctx->use_buffer == FLB_TRUE && ctx->ins_emitter) {
+ flb_input_instance_exit(ctx->ins_emitter, ctx->config);
+ flb_input_instance_destroy(ctx->ins_emitter);
+ ctx->ins_emitter = NULL;
+ }
+ else if (ctx->use_buffer == FLB_FALSE) {
+ /* destroy msgpack buffer initialized for unbuffered mode */
+ msgpack_sbuffer_destroy(&ctx->mp_sbuf);
+ }
+ flb_free(ctx);
+ return -1;
}
🤖 Prompt for AI Agents
In plugins/filter_multiline/ml.c around lines 391 to 396, the early return when
ctx->cmt_truncated creation fails leaks resources; before returning you must
unwind and free mode-specific resources: call flb_errno() as shown, then if
ctx->out_mode is buffered (emitter created), destroy/free the emitter, otherwise
if unbuffered free/destroy the msgpack sbuffer used; finally free ctx and return
-1. Ensure both branches clean up their respective allocated objects before
freeing ctx and returning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/multiline/flb_ml_parser.c (2)
29-41
: Consider using designated initializers for better clarity and safety.The current implementation uses positional field assignments after memset. Using designated initializers would be clearer and more maintainable.
struct flb_ml_parser_params flb_ml_parser_params_default(const char *name) { - struct flb_ml_parser_params p; - memset(&p, 0, sizeof(p)); - - p.size = sizeof(p); - p.name = (char *) name; - p.type = FLB_ML_REGEX; /* sane default */ - p.negate = 0; - p.flush_ms = FLB_ML_FLUSH_TIMEOUT; /* header constant */ - /* other pointers remain NULL by default */ + struct flb_ml_parser_params p = { + .size = sizeof(struct flb_ml_parser_params), + .name = (char *) name, + .type = FLB_ML_REGEX, /* sane default */ + .negate = 0, + .flush_ms = FLB_ML_FLUSH_TIMEOUT, /* header constant */ + /* other fields implicitly zero-initialized */ + }; return p; }
44-51
: Size validation is good, but consider additional parameter validation.The size check
p->size < sizeof(*p)
provides version safety. However, consider validating thetype
field is within expected bounds to prevent potential issues downstream.struct flb_ml_parser *flb_ml_parser_create_params(struct flb_config *ctx, const struct flb_ml_parser_params *p) { struct flb_ml_parser *ml_parser; if (!ctx || !p || p->size < sizeof(*p) || !p->name) { return NULL; } + + /* Validate type is within expected range */ + if (p->type < FLB_ML_REGEX || p->type > FLB_ML_EQ) { + flb_error("[ml_parser] invalid type: %d", p->type); + return NULL; + }include/fluent-bit/multiline/flb_ml_parser.h (1)
39-40
: Document the valid values for the type field.While the comment mentions the valid values, consider adding an explicit note about the valid range for defensive programming.
/* creation parameters (mirror of old positional args) */ char *name; - int type; /* FLB_ML_REGEX / FLB_ML_ENDSWITH / FLB_ML_EQ */ + int type; /* FLB_ML_REGEX / FLB_ML_ENDSWITH / FLB_ML_EQ (valid: 0-2) */ char *match_str; /* used for ENDSWITH/EQ; NULL for REGEX */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
include/fluent-bit/multiline/flb_ml_parser.h
(1 hunks)src/multiline/flb_ml_parser.c
(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
include/fluent-bit/multiline/flb_ml_parser.h (1)
src/multiline/flb_ml_parser.c (2)
flb_ml_parser_params_default
(29-41)flb_ml_parser_create_params
(44-122)
src/multiline/flb_ml_parser.c (2)
include/fluent-bit/flb_mem.h (2)
flb_calloc
(84-96)flb_free
(126-128)src/flb_sds.c (2)
flb_sds_create
(78-90)flb_sds_destroy
(389-399)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-centos-7
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: PR - fuzzing test
🔇 Additional comments (4)
src/multiline/flb_ml_parser.c (1)
203-216
: LGTM! Clean wrapper implementation.The legacy API wrapper correctly maps all parameters to the new parameter bag structure and delegates to the new implementation. This maintains backward compatibility while leveraging the new code path.
include/fluent-bit/multiline/flb_ml_parser.h (3)
26-27
: Good use of forward declaration.The forward declaration avoids circular dependencies and reduces compilation dependencies.
34-51
: Well-designed parameter structure with versioning support.The structure design with size field for versioning and comprehensive documentation is excellent. This allows for future extensibility while maintaining backward compatibility.
49-50
: Reserved flags field for future use.Good practice to reserve a flags field for future feature toggles without breaking ABI compatibility.
ml_parser->name = flb_sds_create(p->name); | ||
ml_parser->type = p->type; | ||
if (!ml_parser->name) { | ||
flb_free(ml_parser); | ||
return NULL; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Inconsistent error cleanup pattern.
When flb_sds_create(p->name)
fails, the function directly frees ml_parser
. However, in subsequent error paths (lines 71-74, 82-84), it calls flb_ml_parser_destroy()
for cleanup. This inconsistency could lead to maintenance issues.
Either use flb_ml_parser_destroy()
consistently for all cleanup paths after the name is set, or directly free resources consistently. The destroy function already handles NULL pointers safely.
ml_parser->name = flb_sds_create(p->name);
- ml_parser->type = p->type;
if (!ml_parser->name) {
flb_free(ml_parser);
return NULL;
}
+ ml_parser->type = p->type;
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/multiline/flb_ml_parser.c around lines 60-65, the error cleanup is
inconsistent: when flb_sds_create(p->name) fails the code calls
flb_free(ml_parser) but later error paths call flb_ml_parser_destroy(), which is
the agreed safe/complete cleanup. Replace the direct flb_free(ml_parser) with a
call to flb_ml_parser_destroy(ml_parser) (then return NULL) so all post-name-set
failure paths use the same destroy routine that safely handles NULLs and frees
all allocated resources.
mk_list_init(&ml_parser->regex_rules); | ||
mk_list_add(&ml_parser->_head, &ctx->multiline_parsers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Potential use-after-free if list operations fail.
The parser is added to the global list before all initialization is complete. If subsequent initialization fails (lines 96-116), flb_ml_parser_destroy()
will be called which removes it from the list (line 366). However, if the list operations themselves fail or if there's concurrent access, this could lead to issues.
Consider moving the list addition to the end of the function after all allocations succeed:
- /* prepare rules list & link into registry */
+ /* prepare rules list */
mk_list_init(&ml_parser->regex_rules);
- mk_list_add(&ml_parser->_head, &ctx->multiline_parsers);
/* optional keys */
if (p->key_content) {
ml_parser->key_content = flb_sds_create(p->key_content);
if (!ml_parser->key_content) {
flb_ml_parser_destroy(ml_parser);
return NULL;
}
}
// ... other key initializations ...
/* keep back-pointer to config for later rule init */
ml_parser->config = ctx;
+
+ /* link into registry after all initialization succeeds */
+ mk_list_add(&ml_parser->_head, &ctx->multiline_parsers);
return ml_parser;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
mk_list_init(&ml_parser->regex_rules); | |
mk_list_add(&ml_parser->_head, &ctx->multiline_parsers); | |
/* prepare rules list */ | |
mk_list_init(&ml_parser->regex_rules); | |
/* optional keys */ | |
if (p->key_content) { | |
ml_parser->key_content = flb_sds_create(p->key_content); | |
if (!ml_parser->key_content) { | |
flb_ml_parser_destroy(ml_parser); | |
return NULL; | |
} | |
} | |
// ... other key initializations ... | |
/* keep back-pointer to config for later rule init */ | |
ml_parser->config = ctx; | |
/* link into registry after all initialization succeeds */ | |
mk_list_add(&ml_parser->_head, &ctx->multiline_parsers); | |
return ml_parser; |
🤖 Prompt for AI Agents
In src/multiline/flb_ml_parser.c around lines 92-93, the parser is being added
to the global ctx->multiline_parsers list before the rest of initialization
completes, which risks use-after-free if later init fails or list ops fail; move
the mk_list_add(&ml_parser->_head, &ctx->multiline_parsers) call to the end of
flb_ml_parser_create after all allocations and initializations (lines ~96-116)
have succeeded, and keep mk_list_init(&ml_parser->regex_rules) where it is;
additionally, if the codebase uses concurrent access, acquire the appropriate
mutex/lock around the list insertion or verify the list functions' return status
and on any subsequent failure perform a safe removal and cleanup path that
checks list membership before calling flb_ml_parser_destroy().
18dc299
to
ee3ff12
Compare
Signed-off-by: Hiroshi Hatake <[email protected]>
ee3ff12
to
3384321
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/multiline/flb_ml_parser.c (1)
63-69
: Fix unsafe destroy before list insertion; defer list linking; unify cleanup.Calling flb_ml_parser_destroy() before mk_list_add() invokes mk_list_del() on an uninitialized node, risking memory corruption. Also, error paths mix manual free and destroy. Defer registry insertion until all allocations succeed and use a single fail path.
Apply these diffs:
- Use a single fail path instead of early frees/destroy:
@@ - ml_parser->name = flb_sds_create(p->name); - ml_parser->type = p->type; - if (!ml_parser->name) { - flb_free(ml_parser); - return NULL; - } + ml_parser->name = flb_sds_create(p->name); + ml_parser->type = p->type; + if (!ml_parser->name) { + goto fail; + } @@ - ml_parser->match_str = flb_sds_create(p->match_str); - if (!ml_parser->match_str) { - flb_sds_destroy(ml_parser->name); - flb_free(ml_parser); - return NULL; - } + ml_parser->match_str = flb_sds_create(p->match_str); + if (!ml_parser->match_str) { + goto fail; + } @@ - ml_parser->parser_name = flb_sds_create(p->parser_name); - if (!ml_parser->parser_name) { - flb_ml_parser_destroy(ml_parser); - return NULL; - } + ml_parser->parser_name = flb_sds_create(p->parser_name); + if (!ml_parser->parser_name) { + goto fail; + }
- Don’t link into the registry until success:
@@ - /* prepare rules list & link into registry */ - mk_list_init(&ml_parser->regex_rules); - mk_list_add(&ml_parser->_head, &ctx->multiline_parsers); + /* prepare rules list (linking deferred until success) */ + mk_list_init(&ml_parser->regex_rules);
- Route optional-key failures to the fail path:
@@ - ml_parser->key_content = flb_sds_create(p->key_content); - if (!ml_parser->key_content) { - flb_ml_parser_destroy(ml_parser); - return NULL; - } + ml_parser->key_content = flb_sds_create(p->key_content); + if (!ml_parser->key_content) { + goto fail; + } @@ - ml_parser->key_group = flb_sds_create(p->key_group); - if (!ml_parser->key_group) { - flb_ml_parser_destroy(ml_parser); - return NULL; - } + ml_parser->key_group = flb_sds_create(p->key_group); + if (!ml_parser->key_group) { + goto fail; + } @@ - ml_parser->key_pattern = flb_sds_create(p->key_pattern); - if (!ml_parser->key_pattern) { - flb_ml_parser_destroy(ml_parser); - return NULL; - } + ml_parser->key_pattern = flb_sds_create(p->key_pattern); + if (!ml_parser->key_pattern) { + goto fail; + }
- Link into registry at the end, then return. Add a fail label for cleanup:
@@ - /* keep back-pointer to config for later rule init */ - ml_parser->config = ctx; - - return ml_parser; + /* keep back-pointer to config for later rule init */ + ml_parser->config = ctx; + + /* link into registry after all initialization succeeds */ + mk_list_add(&ml_parser->_head, &ctx->multiline_parsers); + return ml_parser; + +fail: + if (ml_parser) { + if (ml_parser->key_pattern) { flb_sds_destroy(ml_parser->key_pattern); } + if (ml_parser->key_group) { flb_sds_destroy(ml_parser->key_group); } + if (ml_parser->key_content) { flb_sds_destroy(ml_parser->key_content); } + if (ml_parser->parser_name) { flb_sds_destroy(ml_parser->parser_name); } + if (ml_parser->match_str) { flb_sds_destroy(ml_parser->match_str); } + if (ml_parser->name) { flb_sds_destroy(ml_parser->name); } + flb_free(ml_parser); + } + return NULL;Also applies to: 83-88, 95-98, 100-120, 122-126
🧹 Nitpick comments (2)
src/multiline/flb_ml_parser.c (2)
31-45
: Avoid const-cast; prefer const-correct param fields.p.name is assigned by casting away const. Consider making all string fields in flb_ml_parser_params const char* to reflect read-only intent and drop the cast here. If changing the public struct is too wide for this PR, at least document that inputs must outlive creation and won’t be modified.
51-55
: Size-gate uses hard-coded width; use the actual field’s sizeof.Replace sizeof(uint32_t) with sizeof(p->flags) to avoid mismatch if the flags type changes.
Apply this diff:
- size_t min = offsetof(struct flb_ml_parser_params, flags) + sizeof(uint32_t); + size_t min = offsetof(struct flb_ml_parser_params, flags) + sizeof(p->flags);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
include/fluent-bit/multiline/flb_ml_parser.h
(1 hunks)src/multiline/flb_ml_parser.c
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- include/fluent-bit/multiline/flb_ml_parser.h
🧰 Additional context used
🧬 Code graph analysis (1)
src/multiline/flb_ml_parser.c (2)
include/fluent-bit/flb_mem.h (2)
flb_calloc
(84-96)flb_free
(126-128)src/flb_sds.c (2)
flb_sds_create
(78-90)flb_sds_destroy
(389-399)
🔇 Additional comments (3)
src/multiline/flb_ml_parser.c (3)
23-23
: LGTM: required header include.Including flb_ml_parser.h is appropriate for the new params-based API.
28-30
: LGTM: standard headers for offsetof/memset.stddef.h and string.h are correctly added for offsetof() and memset().
196-221
: Wrapper mapping looks correct.Legacy flb_ml_parser_create() correctly maps arguments into params and delegates. No functional issues spotted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/multiline/flb_ml_parser.c (2)
84-88
: Bug: destroy called before list linkage can corrupt the list.flb_ml_parser_destroy() unconditionally mk_list_del(&ml_parser->_head). At this point _head hasn’t been linked/initialized, so del may dereference garbage.
Minimal fix options:
- Initialize _head right after allocation so del is harmless pre-link, and/or
- Move mk_list_add to after successful initialization and avoid calling destroy before linking (use local cleanup).
See consolidated patch below.
57-61
: Make cleanup safe + consistent; link after full init.
- Initialize _head immediately to make destructor safe pre-link.
- Replace early frees with flb_ml_parser_destroy for consistency.
- Defer registry linkage until the end to avoid partially-initialized entries.
@@ ml_parser = flb_calloc(1, sizeof(struct flb_ml_parser)); if (!ml_parser) { flb_errno(); return NULL; } + /* Make destructor safe before linking */ + mk_list_init(&ml_parser->_head); @@ - if (!ml_parser->name) { - flb_free(ml_parser); - return NULL; - } + if (!ml_parser->name) { + flb_ml_parser_destroy(ml_parser); + return NULL; + } @@ - if (!ml_parser->match_str) { - flb_sds_destroy(ml_parser->name); - flb_free(ml_parser); - return NULL; - } + if (!ml_parser->match_str) { + flb_ml_parser_destroy(ml_parser); + return NULL; + } @@ - /* prepare rules list & link into registry */ - mk_list_init(&ml_parser->regex_rules); - mk_list_add(&ml_parser->_head, &ctx->multiline_parsers); + /* prepare rules list */ + mk_list_init(&ml_parser->regex_rules); @@ /* keep back-pointer to config for later rule init */ ml_parser->config = ctx; + /* link into registry after all allocations succeed */ + mk_list_add(&ml_parser->_head, &ctx->multiline_parsers); + return ml_parser;Also applies to: 66-69, 75-78, 95-98, 122-125
🧹 Nitpick comments (2)
include/fluent-bit/multiline/flb_ml_parser.h (1)
34-51
: Future-proof the parameter bag: widen size field and add const-correctness.
- Using uint16_t for size risks truncation as the struct evolves; prefer uint32_t.
- Input pointers should be const char * to reflect read-only semantics and avoid casts.
Apply:
-struct flb_ml_parser_params { - uint16_t size; /* sizeof(struct flb_ml_parser_params) */ +struct flb_ml_parser_params { + uint32_t size; /* sizeof(struct flb_ml_parser_params) */ @@ - char *name; + const char *name; @@ - char *match_str; /* used for ENDSWITH/EQ; NULL for REGEX */ + const char *match_str; /* used for ENDSWITH/EQ; NULL for REGEX */ @@ - char *key_content; - char *key_group; - char *key_pattern; + const char *key_content; + const char *key_group; + const char *key_pattern; @@ - char *parser_name; /* delayed init */ + const char *parser_name; /* delayed init */Also ensure <stdint.h> is included in this header (either directly here or via a prerequisite header).
src/multiline/flb_ml_parser.c (1)
31-45
: Defaults helper looks sane.Tiny nit if you adopt const-correctness: drop the cast on name.
- p.name = (char *) name; + p.name = name;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
include/fluent-bit/multiline/flb_ml_parser.h
(1 hunks)src/multiline/flb_ml_parser.c
(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
include/fluent-bit/multiline/flb_ml_parser.h (1)
src/multiline/flb_ml_parser.c (2)
flb_ml_parser_params_default
(32-44)flb_ml_parser_create_params
(47-126)
src/multiline/flb_ml_parser.c (2)
include/fluent-bit/flb_mem.h (2)
flb_calloc
(84-96)flb_free
(126-128)src/flb_sds.c (2)
flb_sds_create
(78-90)flb_sds_destroy
(389-399)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: PR - fuzzing test
- GitHub Check: pr-compile-centos-7
🔇 Additional comments (5)
include/fluent-bit/multiline/flb_ml_parser.h (2)
26-28
: Forward decl looks good.Prevents header cycles and keeps the public surface minimal.
53-59
: New APIs are clear and composable.Default helper + params-based creator provide a clean v2 surface.
src/multiline/flb_ml_parser.c (3)
23-23
: Correct header include.Brings the new public APIs into scope.
28-30
: Needed C headers included.Required for offsetof and memset.
196-221
: Legacy wrapper delegation LGTM.Thin mapping to params keeps old API behavior intact.
We added an interface for configurable buffer limit for multiline.
Also, we implemented robust processing for multiline concatenations.
Enter
[N/A]
in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test
label to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Bug Fixes
Tests