Add helper methods to support kafka output in beatreceivers #49768
khushijain21 wants to merge 4 commits into elastic:main
Conversation
This pull request does not have a backport label. To fix this pull request, you need to add the backport labels for the needed branches.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
📝 Walkthrough
The libbeat format string package exports three previously internal utilities.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@libbeat/common/fmtstr/formatstring.go`:
- Around line 340-365: ParseRawTokens can return early and leave the lexer
producer goroutine blocked; ensure we always drain lex.Tokens() on any early
exit by adding a deferred drain. At the top of ParseRawTokens, add a defer that
iterates over lex.Tokens() and discards remaining tokens (for range lex.Tokens()
{ }) unless the function completes normally; mark normal completion (e.g., a
local done flag set to true just before the final return) so the deferred drain
only runs on error/early returns. This ensures any early returns from cases like
tokErr or parseVariableToken failures do not leak the goroutine started by
MakeLexer while keeping normal behavior for the existing token handling
(tokString, tokOpen, tokClose, tokOperator) and existing use of
parseVariableToken and VariableToken.
📒 Files selected for processing (3)
- libbeat/common/fmtstr/formatevents.go
- libbeat/common/fmtstr/formatstring.go
- libbeat/common/fmtstr/formatstring_test.go
```go
func ParseRawTokens(lex lexer) ([]any, error) {
	var elems []any

	for token := range lex.Tokens() {
		switch token.typ {
		case tokErr:
			return nil, errors.New(token.val)

		case tokString:
			elems = append(elems, token.val)

		case tokOpen:
			elem, err := parseVariableToken(lex)
			if err != nil {
				return nil, err
			}
			elems = append(elems, VariableToken(elem))

		case tokClose, tokOperator:
			// should not happen, but let's return error just in case
			return nil, fmt.Errorf("Token '%v'(%v) not allowed", token.val, token.typ)
		}
	}

	return elems, nil
}
```
Drain the lexer on all ParseRawTokens exits to prevent goroutine leaks.
At line 340, ParseRawTokens can return early on malformed input (e.g., the nested-variable error from line 382) without draining lex. Since MakeLexer starts a producer goroutine, this can leave it blocked if callers forget Finish().
Proposed fix:

```diff
 func ParseRawTokens(lex lexer) ([]any, error) {
+	defer lex.Finish()
 	var elems []any
 	for token := range lex.Tokens() {
 		switch token.typ {
 		case tokErr:
 			return nil, errors.New(token.val)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func ParseRawTokens(lex lexer) ([]any, error) {
	defer lex.Finish()
	var elems []any
	for token := range lex.Tokens() {
		switch token.typ {
		case tokErr:
			return nil, errors.New(token.val)
		case tokString:
			elems = append(elems, token.val)
		case tokOpen:
			elem, err := parseVariableToken(lex)
			if err != nil {
				return nil, err
			}
			elems = append(elems, VariableToken(elem))
		case tokClose, tokOperator:
			// should not happen, but let's return error just in case
			return nil, fmt.Errorf("Token '%v'(%v) not allowed", token.val, token.typ)
		}
	}
	return elems, nil
}
```
```go
{
	name:         "with escaped % symbol",
	input:        `\%{abc}`,
	expectedList: []any{`%{abc}`},
},
```
I think we are missing some test cases here. For example, what should happen with these cases: `""`, `100% literal`, `%%`, `%%{}`, `%\{}`, `%{}`, `%{a:b:c}`, `%{a:b:?c}`, `%{a`?
Also, maybe a fuzz test is a good idea.
belimawr left a comment:
Overall LGTM, but I agree with Mauri, a more exhaustive test for ParseRawTokens would be great.
```go
	for token := range lex.Tokens() {
		switch token.typ {
		case tokErr:
			return nil, errors.New(token.val)

		case tokString:
			elems = append(elems, StringElement{token.val})

		case tokOpen:
			elem, err := parseVariable(lex)
			if err != nil {
				return nil, err
			}
			elems = append(elems, elem)

		case tokClose, tokOperator:
			// should not happen, but let's return error just in case
			return nil, fmt.Errorf("Token '%v'(%v) not allowed", token.val, token.typ)
		}
	}

	return elems, nil
}

func parseVariable(lex lexer) (formatElement, error) {
	var strings []string
	var ops []string

	for token := range lex.Tokens() {
		switch token.typ {
		case tokErr:
			return nil, errors.New(token.val)

		case tokOpen:
			return nil, errNestedVar

		case tokClose:
			if len(strings) == 0 {
				return nil, errEmptyFormat
			}
			return makeVariableElement(strings[0], ops, strings[1:])

		case tokString:
			if len(strings) != len(ops) {
				return nil, fmt.Errorf("Unexpected string token %v, expected operator", token.val)
			}
			strings = append(strings, token.val)

		case tokOperator:
			if len(strings) == 0 {
				return nil, errUnexpectedOperator
			}
			ops = append(ops, token.val)
			if len(ops) > len(strings) {
				return nil, fmt.Errorf("Consecutive operator tokens '%v'", token.val)
			}

		default:
			return nil, fmt.Errorf("Unexpected token '%v' (%v)", token.val, token.typ)
		}
	}

	return nil, errMissingClose
}

func makeLexer(in string) lexer {

type VariableToken string

// ParseRawTokens walks the lexer output and returns a slice in order: plain
// strings for literal segments and VariableToken for each %{...} block.
func ParseRawTokens(lex lexer) ([]any, error) {
```
This is exactly the same as the private parse except with the return type changed? Is there no way to re-use the existing code, e.g. by making the function generic on the element type?
Proposed commit message

This PR adds helper methods on the fmtstr package to help parse dynamic fields. This is required to support kafka output on beatreceivers.

Checklist
- Tests were run with the stresstest.sh script under stress conditions and the race detector to verify their stability.
- A changelog entry was added in ./changelog/fragments using the changelog tool.

Disruptive User Impact
None
Related issues