-
Notifications
You must be signed in to change notification settings - Fork 1.9k
in_splunk: Implement handling remote addr feature #11398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
📝 WalkthroughWalkthroughThis change adds X-Forwarded-For (XFF) header extraction and remote address injection to the Splunk input plugin. New configuration options enable capturing remote addresses from XFF headers or connection peer addresses, with configurable field names for storing the data in log records. Changes
Sequence DiagramsequenceDiagram
actor Client
participant Handler as Protocol Handler
participant Parser as Header Parser
participant Logger as Log Processor
participant Output as Output
Client->>Handler: HTTP Request + X-Forwarded-For Header
Handler->>Parser: Extract XFF Header / Connection Address
Parser->>Parser: Parse X-Forwarded-For Value
Parser->>Handler: Return Remote Address
Handler->>Handler: Store current_remote_addr
Handler->>Logger: Process Payload (with remote_addr context)
Logger->>Logger: Append Remote Address Field to Record
Logger->>Output: Emit Log with Remote Address
Handler->>Handler: Clear current_remote_addr (keep-alive cleanup)
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~35 minutes Suggested Labels
Suggested Reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ba50be0c16
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| /* Always build body by appending map entries so we can extend it */ | ||
| if (ret == FLB_EVENT_ENCODER_SUCCESS) { | ||
| kv = record->via.map.ptr; | ||
| for (i = 0; i < record->via.map.size && | ||
| ret == FLB_EVENT_ENCODER_SUCCESS; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle non-map records before iterating keys
The new body-building path always treats record as a map and iterates record->via.map without checking record->type. This is a regression from the prior set_body_from_msgpack_object() path, which safely handled non-map records. If a Splunk payload is a JSON array containing scalar entries (or any non-object element), process_json_payload_pack() still forwards each element here, and this dereference will read the wrong union field, leading to undefined behavior or crashes. Consider keeping the old fallback for non-map objects or validating the record type before iterating.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: In splunk_prot_handle_ng() the per-request fields
context->current_remote_addr and context->current_remote_addr_len are set but
never reset, allowing stale addresses to persist across requests; update the
function to ensure these fields are cleared before every return (or funnel
returns through a single cleanup label), e.g., after using
extract_remote_address(), when falling back to peer
(flb_connection_get_remote_address(parent_session->connection)), and prior to
any early exits: set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0 (or free/reset as appropriate) so the same
cleanup performed in splunk_prot_handle() is applied here.
🧹 Nitpick comments (5)
plugins/in_splunk/splunk.h (1)
76-79: Consider usingconst char *instead offlb_sds_tfor borrowed pointers.
current_remote_addris assigned non-owned pointers from either the XFF header value orflb_connection_get_remote_address()insplunk_prot.c. Usingflb_sds_tis misleading since it implies an owned/allocated string that should be managed withflb_sds_*functions.For clarity and to prevent accidental misuse:
♻️ Suggested change
/* Remote address */ - flb_sds_t current_remote_addr; + const char *current_remote_addr; size_t current_remote_addr_len;plugins/in_splunk/splunk_prot.c (4)
265-290: Const-correctness issue in output parameter.The function assigns
const char *values (fromextract_xff_valueandflb_connection_get_remote_address) to*out, butoutis declared aschar **. This discards theconstqualifier and may cause compiler warnings.♻️ Suggested fix
static int extract_remote_address(const char *xff_value, size_t xff_value_len, struct flb_connection *connection, - char **out, + const char **out, size_t *out_len) {Also update the corresponding field type in
splunk.hand call sites insplunk_prot_handle()andsplunk_prot_handle_ng().
424-428: Unused parameters in function signature.The
remote_addrandremote_addr_lenparameters are added to the signature but never used. The function usesctx->current_remote_addrandctx->current_remote_addr_lendirectly at lines 478-480.Either use the passed parameters or remove them from the signature to avoid confusion:
♻️ Option 1: Remove unused parameters
static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record, flb_sds_t tag, flb_sds_t tag_from_record, - struct flb_time tm, - const char *remote_addr, - size_t remote_addr_len) + struct flb_time tm)♻️ Option 2: Use the passed parameters
if (ret == FLB_EVENT_ENCODER_SUCCESS) { ret = append_remote_addr(ctx, - ctx->current_remote_addr, - ctx->current_remote_addr_len); + remote_addr, + remote_addr_len); }
775-780: Unused parameters in function signature.Similar to
process_flb_log_append(), theremote_addrandremote_addr_lenparameters are not used within this function. The downstreamprocess_raw_payload_pack()accessesctx->current_remote_addrdirectly.Consider removing these unused parameters for consistency:
♻️ Suggested change
static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn, flb_sds_t tag, struct mk_http_session *session, - struct mk_http_request *request, - const char *remote_addr, - size_t remote_addr_len) + struct mk_http_request *request)
1115-1118: Missing cleanup on early return paths.The per-request remote address is cleared at the end of successful processing, but multiple early return paths (lines 861, 928, 974, 1040, 1066, 1088, 1104) skip this cleanup. While the state is re-initialized at the start of each request (lines 1003-1004), for defensive coding it would be cleaner to use a
goto cleanuppattern to ensure consistent cleanup.Alternatively, since the state is always re-initialized at the start of
splunk_prot_handle(), this might be acceptable as-is. Just ensure this initialization always happens before any potential use.
| /* Resolve per-request remote address */ | ||
| context->current_remote_addr = NULL; | ||
| context->current_remote_addr_len = 0; | ||
|
|
||
| parent_session = (struct flb_http_server_session *) request->stream->parent; | ||
| if (http_header_lookup(HTTP_PROTOCOL_VERSION_20, request, | ||
| SPLUNK_XFF_HEADER, &hval, &hlen) == 0) { | ||
| extract_remote_address(hval, hlen, parent_session->connection, | ||
| &context->current_remote_addr, | ||
| &context->current_remote_addr_len); | ||
| } | ||
| else { | ||
| /* fallback to peer addr */ | ||
| peer = flb_connection_get_remote_address(parent_session->connection); | ||
| if (peer) { | ||
| context->current_remote_addr = peer; | ||
| context->current_remote_addr_len = strlen(peer); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing cleanup of per-request remote address in splunk_prot_handle_ng().
Unlike splunk_prot_handle() which clears current_remote_addr at lines 1115-1118, splunk_prot_handle_ng() never clears the per-request state after processing. This could cause stale remote address values to leak to subsequent requests on the same HTTP/2 stream or connection.
♻️ Suggested fix - add cleanup before return
Add cleanup before each return statement, or consolidate returns through a cleanup label:
flb_sds_destroy(tag);
+
+ /* Clear per-request remote address */
+ context->current_remote_addr = NULL;
+ context->current_remote_addr_len = 0;
+
return ret;
}🤖 Prompt for AI Agents
In `@plugins/in_splunk/splunk_prot.c` around lines 1381 - 1399, In
splunk_prot_handle_ng() the per-request fields context->current_remote_addr and
context->current_remote_addr_len are set but never reset, allowing stale
addresses to persist across requests; update the function to ensure these fields
are cleared before every return (or funnel returns through a single cleanup
label), e.g., after using extract_remote_address(), when falling back to peer
(flb_connection_get_remote_address(parent_session->connection)), and prior to
any early exits: set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0 (or free/reset as appropriate) so the same
cleanup performed in splunk_prot_handle() is applied here.
Currently, in_splunk does not handle remote address.
This could be inconvenient to track remote address for traceability.
Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.