-
Notifications
You must be signed in to change notification settings - Fork 1.8k
in_tail: fix last_processed_bytes calculation #10677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
WalkthroughAdjusts tail plugin byte-tracking (per-line assignment and per-chunk reset), adds a multiline regex parser config, and introduces a test verifying offset_key behavior with multiline parsing in the tail input. Changes
Sequence Diagram(s)sequenceDiagram
participant TestRunner
participant FluentBit
participant Tail
participant Parser
participant Output
Note over TestRunner,FluentBit: Prepare file with two multiline entries
TestRunner->>FluentBit: start engine (tail input, parsers_multiline.conf, offset_key)
FluentBit->>Tail: initialize file state
Tail->>Parser: read chunk / lines
Parser-->>Tail: return parsed multiline record + processed_bytes
Tail->>Tail: set file.last_processed_bytes = processed_bytes
Tail->>Output: emit record (includes computed offset_key)
Note over Tail: after chunk processed => file.last_processed_bytes = 0 and stream_offset updated
TestRunner->>Tail: append another log line
Tail->>Parser: parse new line
Parser-->>Tail: emit subsequent record
Output->>TestRunner: records available for assertion
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
tests/runtime/in_tail.c (2)
1417-1421
: Use size_t-compatible format specifier for strlen to avoid UB on some platformsstrlen returns size_t; printing with %ld is not portable. Prefer %zu and drop the unnecessary address-of on expected_msg.
- ret = snprintf(&expected_msg[0], sizeof(expected_msg), "\"%s\":%ld", offset_key, strlen(msg_before_tail)+strlen(NEW_LINE)+strlen(msg_before_tail2)+strlen(NEW_LINE)); + ret = snprintf(expected_msg, sizeof(expected_msg), "\"%s\":%zu", + offset_key, + strlen(msg_before_tail) + strlen(NEW_LINE) + + strlen(msg_before_tail2) + strlen(NEW_LINE));
2432-2432
: Optionally gate this test under FLB_HAVE_REGEX like other regex-dependent testsThis test relies on a type=regex multiline parser. To mirror how “parser” and “tag_regex” are guarded, consider gating registration to avoid failures when regex is disabled.
If preferred, wrap this entry:
- {"multiline_offset_key", flb_test_multiline_offset_key}, + #ifdef FLB_HAVE_REGEX + {"multiline_offset_key", flb_test_multiline_offset_key}, + #endifConfirm whether CI includes builds without regex support. If so, the guard is advisable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
plugins/in_tail/tail_file.c
(1 hunks)tests/runtime/data/tail/parsers_multiline.conf
(1 hunks)tests/runtime/in_tail.c
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/in_tail/tail_file.c
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/runtime/in_tail.c (1)
src/flb_lib.c (4)
flb_service_set
(647-673)flb_input_set
(300-330)flb_output_set
(515-546)flb_start
(914-925)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-centos-7
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: PR - fuzzing test
Fix last_processed_bytes calculation in multiple line scenario. Signed-off-by: zshuang0316 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/runtime/data/tail/parsers_multiline.conf (1)
5-6
: Optional: make the parser more reusable by accepting both space/T and comma/dot for sub-second separatorIf you want this parser to work with similar logs that use ISO 8601 “T” and/or a dot for milliseconds, you can broaden the regex while keeping current behavior intact.
- rule "start_state" "/^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]/" "cont" - rule "cont" "/^(?!\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]).*/" "cont" + rule "start_state" "/^\[\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[,.]\d{3}\]/" "cont" + rule "cont" "/^(?!\[\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[,.]\d{3}\]).*/" "cont"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
plugins/in_tail/tail_file.c
(2 hunks)tests/runtime/data/tail/parsers_multiline.conf
(1 hunks)tests/runtime/in_tail.c
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/runtime/in_tail.c
- plugins/in_tail/tail_file.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-centos-7
- GitHub Check: PR - fuzzing test
🔇 Additional comments (3)
tests/runtime/data/tail/parsers_multiline.conf (3)
1-3
: Parser block and naming look correctSection header, name, and type are valid for a Fluent Bit multiline regex parser.
5-6
: Regex now matches the bracketed timestamp format used in testsThis addresses the earlier mismatch (T vs space, unescaped dot). Anchoring to the opening bracket and using a negative lookahead for continuation lines is correct and should make the parser robust and non-flaky for the provided log format.
4-4
: flush_timeout(2000 ms) is safe under the existing 5 s test waitThe multiline parser defined in
• tests/runtime/data/tail/parsers_multiline.conf (flush_timeout 2000)
is exercised by
• tests/runtime/in_tail.c (lines 1466–1470) via
wait_num_with_timeout(5000, &num)Since the test harness waits up to 5000 ms for output, a 2000 ms flush timeout will always fire well before the timeout window. No change is required.
Updated, please help review, thanks |
This value represent the number of bytes processed by process_content() in the last iteration so we can set it the current processed_bytes directly.
Enter
[N/A]
in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test
label to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Bug Fixes
Tests