Skip to content

perf: skip unnecessary event.Clone() in processors for single-field ops#49777

Open
strawgate wants to merge 1 commit intoelastic:mainfrom
strawgate:perf/skip-unnecessary-clone
Open

perf: skip unnecessary event.Clone() in processors for single-field ops#49777
strawgate wants to merge 1 commit intoelastic:mainfrom
strawgate:perf/skip-unnecessary-clone

Conversation

@strawgate
Copy link
Copy Markdown
Contributor

@strawgate strawgate commented Mar 30, 2026

Summary

Skip unnecessary event.Clone() in 11 processors. The clone deep-copies the entire event for rollback on error, but in most cases there is nothing to roll back — the operation either fails before writing or performs a single atomic write.

This brings all processors in line with convert, which has had this optimization since 2019 (#11686).

Changes

dissect — Restructure mapper() to validate all target keys before writing any values. If a conflict is found, the event is returned unmodified. This eliminates event.Clone() entirely regardless of configuration. Dissect clones the full accumulated event (including all add_fields/host/cloud metadata) on every call, making this the highest-impact change.

rename, copy_fields, replace, truncate_fields, urldecode, extract_array, decode_csv_fields — Skip clone when len(fields) == 1. A single-field operation can't partially fail. For rename, path overlap detection (a → a.b) preserves the clone when needed.

decode_base64_field, decompress_gzip_field, append — Remove clone entirely (always single operation).

Behavioral safety

  • dissect: Check-then-write. All key conflicts detected before any mutation. No partial writes possible.
  • Single-field processors: One operation = no partial state to roll back. Either it fails cleanly or succeeds atomically.
  • rename path overlap: renameNeedsClone() detects when from/to share a path prefix and preserves the clone.
  • Multi-field configs: Clone is preserved when len(fields) > 1.
  • The only processor with event.Clone() NOT changed is alterFieldProcessor (in-place key mutation).

Per-processor benchmarks

Processor Δ ns/op Δ allocs/op
rename (1 field) -60% -48%
copy_fields (1 field) -61% -48%

End-to-end filebeat benchmarks

GCE e2-standard-2, Docker cgroups, benchmark input → mock ES, 30s, 3 alternating runs:

Scenario Baseline EPS PR EPS Δ
0.5 CPU 3,168 3,598 +13%
1.0 CPU 5,912 7,005 +18%

Test plan

  • All existing tests pass for all 11 affected processor packages
  • New tests for rename clone-skip: single field, overlapping paths, multi-field rollback
  • New tests for copy_fields clone-skip
  • E2E filebeat benchmarks across CPU and memory configurations

🤖 Generated with Claude Code

@strawgate strawgate requested a review from a team as a code owner March 30, 2026 13:22
@strawgate strawgate requested review from faec and khushijain21 March 30, 2026 13:22
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 30, 2026
@github-actions
Copy link
Copy Markdown
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@strawgate strawgate changed the title perf: skip unnecessary event.Clone() in processors for single-field o… perf: skip unnecessary event.Clone() in processors for single-field ops Mar 30, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

This pull request reduces unnecessary event cloning across libbeat processors. Processors that previously cloned whenever fail_on_error is enabled now generally clone only when fail_on_error is true and multiple fields/mappings exist. The rename processor adds an unexported renameNeedsClone helper to detect overlapping paths and conditionally clone. The dissect processor removes rollback cloning and switches to a preflight key-existence scan for conflict detection. Tests for clone-skip behavior were added for rename and copy processors, and a changelog fragment documents the change.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • 🛠️ Update Documentation: Commit on current branch
  • 🛠️ Update Documentation: Create PR

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libbeat/processors/actions/rename.go`:
- Around line 79-80: The code deletes the source field before ensuring the
destination write succeeds, and because renameNeedsClone can return false for
single non-overlapping fields no backup is kept when f.config.FailOnError is
true; change the logic so that when f.config.FailOnError is true you always
preserve a backup (create backup = event.Clone()) before mutating, or alter
renameField to perform PutValue(to, value) first and only remove the original
(from) after PutValue succeeds, and if PutValue fails restore the backup; update
the same pattern in the other affected blocks (the similar logic around lines
90-92 and 140-151) to use the same safe sequence (backup on FailOnError or
write-then-delete) and reference f.config.FailOnError, renameNeedsClone, backup,
renameField, PutValue, from and to when making the changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 128dd85b-820c-4316-8e09-dc37698937ae

📥 Commits

Reviewing files that changed from the base of the PR and between e92763f and d883a3d.

📒 Files selected for processing (12)
  • libbeat/processors/actions/append.go
  • libbeat/processors/actions/clone_skip_test.go
  • libbeat/processors/actions/copy_fields.go
  • libbeat/processors/actions/decode_base64_field.go
  • libbeat/processors/actions/decompress_gzip_field.go
  • libbeat/processors/actions/rename.go
  • libbeat/processors/actions/replace.go
  • libbeat/processors/actions/truncate_fields.go
  • libbeat/processors/decode_csv_fields/decode_csv_fields.go
  • libbeat/processors/dissect/processor.go
  • libbeat/processors/extract_array/extract_array.go
  • libbeat/processors/urldecode/urldecode.go
💤 Files with no reviewable changes (3)
  • libbeat/processors/actions/decompress_gzip_field.go
  • libbeat/processors/actions/decode_base64_field.go
  • libbeat/processors/actions/append.go

Comment on lines +79 to 80
if f.config.FailOnError && renameNeedsClone(f.config) {
backup = event.Clone()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Single-field rename can drop source data on failure with fail_on_error: true

For non-overlapping single-field configs, renameNeedsClone returns false, so no backup is kept. But renameField deletes from before PutValue(to, value). If PutValue fails, the processor returns a partially mutated event (source removed), which is a rollback/data-loss regression under fail-on-error.

Proposed safe fix
func renameNeedsClone(config renameFieldsConfig) bool {
-	if len(config.Fields) > 1 {
+	if len(config.Fields) > 1 {
 		return true
 	}
 	if len(config.Fields) == 1 {
 		from := config.Fields[0].From
 		to := config.Fields[0].To
-		if strings.HasPrefix(to, from+".") || strings.HasPrefix(from, to+".") {
+		// clone required when delete-before-put ordering is required
+		// (overlapping paths), otherwise renameField can do put-before-delete.
+		if strings.HasPrefix(to, from+".") || strings.HasPrefix(from, to+".") {
 			return true
 		}
 	}
 	return false
}
func (f *renameFields) renameField(from string, to string, event *beat.Event) error {
 	_, err := event.GetValue(to)
 	if err == nil {
 		return fmt.Errorf("target field %s already exists, drop or rename this field first", to)
 	}

 	value, err := event.GetValue(from)
 	if err != nil {
 		...
 	}

-	// Deletion must happen first to support cases where a becomes a.b
-	err = event.Delete(from)
-	if err != nil {
-		return fmt.Errorf("could not delete key: %s,  %w", from, err)
-	}
-
-	_, err = event.PutValue(to, value)
-	if err != nil {
-		return fmt.Errorf("could not put value: %s: %v, %w", to, value, err)
-	}
+	overlap := strings.HasPrefix(to, from+".") || strings.HasPrefix(from, to+".")
+	if !overlap {
+		if _, err = event.PutValue(to, value); err != nil {
+			return fmt.Errorf("could not put value: %s: %v, %w", to, value, err)
+		}
+		if err = event.Delete(from); err != nil {
+			_, _ = event.Delete(to) // best-effort local rollback
+			return fmt.Errorf("could not delete key: %s, %w", from, err)
+		}
+		return nil
+	}
+
+	// overlap case still requires delete-first
+	if err = event.Delete(from); err != nil {
+		return fmt.Errorf("could not delete key: %s,  %w", from, err)
+	}
+	if _, err = event.PutValue(to, value); err != nil {
+		return fmt.Errorf("could not put value: %s: %v, %w", to, value, err)
+	}
 	return nil
}

Also applies to: 90-92, 140-151

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libbeat/processors/actions/rename.go` around lines 79 - 80, The code deletes
the source field before ensuring the destination write succeeds, and because
renameNeedsClone can return false for single non-overlapping fields no backup is
kept when f.config.FailOnError is true; change the logic so that when
f.config.FailOnError is true you always preserve a backup (create backup =
event.Clone()) before mutating, or alter renameField to perform PutValue(to,
value) first and only remove the original (from) after PutValue succeeds, and if
PutValue fails restore the backup; update the same pattern in the other affected
blocks (the similar logic around lines 90-92 and 140-151) to use the same safe
sequence (backup on FailOnError or write-then-delete) and reference
f.config.FailOnError, renameNeedsClone, backup, renameField, PutValue, from and
to when making the changes.

@strawgate strawgate added backport-skip Skip notification from the automated backport with mergify skip-changelog labels Mar 30, 2026
strawgate added a commit to strawgate/beats that referenced this pull request Mar 30, 2026
The dissect event.Clone() skip is moving to elastic#49777 which applies
the same pattern more broadly across all action processors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@elastic elastic deleted a comment from botelastic bot Mar 30, 2026
@elastic elastic deleted a comment from mergify bot Mar 30, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libbeat/processors/actions/rename.go`:
- Line 86: The error string constructed in rename.go (variable errMsg created
via fmt.Errorf in the rename processor) uses a lowercase "failed to rename..."
which mismatches tests expecting "Failed to rename..."; update the fmt.Errorf
call that sets errMsg to use a capitalized "Failed to rename fields in
processor: %w" so the error message matches the assertions in rename_test.go (or
alternatively update the tests if you prefer changing expectations).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 87c29645-1654-4003-988e-bb655b87972c

📥 Commits

Reviewing files that changed from the base of the PR and between d883a3d and 6d5ad2d.

📒 Files selected for processing (6)
  • changelog/fragments/1774840000-skip-unnecessary-event-clone.yaml
  • libbeat/processors/actions/append.go
  • libbeat/processors/actions/copy_fields.go
  • libbeat/processors/actions/decompress_gzip_field.go
  • libbeat/processors/actions/rename.go
  • libbeat/processors/extract_array/extract_array.go
✅ Files skipped from review due to trivial changes (2)
  • changelog/fragments/1774840000-skip-unnecessary-event-clone.yaml
  • libbeat/processors/actions/append.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • libbeat/processors/actions/decompress_gzip_field.go
  • libbeat/processors/actions/copy_fields.go

err := f.renameField(field.From, field.To, event)
if err != nil {
errMsg := fmt.Errorf("Failed to rename fields in processor: %w", err)
errMsg := fmt.Errorf("failed to rename fields in processor: %w", err)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if test expectations match the new lowercase error message
rg -n "Failed to rename fields in processor" libbeat/processors/actions/

Repository: elastic/beats

Length of output: 383


🏁 Script executed:

# Check the actual error message in rename.go line 86
sed -n '85,87p' libbeat/processors/actions/rename.go

Repository: elastic/beats

Length of output: 212


🏁 Script executed:

# Search for any lowercase "failed to rename" in the test file
rg -n "failed to rename" libbeat/processors/actions/rename_test.go

Repository: elastic/beats

Length of output: 39


🏁 Script executed:

# Check the test assertions more closely to confirm they compare error messages
sed -n '95,105p' libbeat/processors/actions/rename_test.go
sed -n '178,204p' libbeat/processors/actions/rename_test.go

Repository: elastic/beats

Length of output: 931


Tests will fail due to error message case mismatch

The error message in rename.go line 86 uses lowercase ("failed to rename..."), but test assertions in rename_test.go lines 99 and 198 expect capital F ("Failed to rename..."). These string comparisons via reflect.DeepEqual will fail. Either revert the error message to capital F or update both test assertions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libbeat/processors/actions/rename.go` at line 86, The error string
constructed in rename.go (variable errMsg created via fmt.Errorf in the rename
processor) uses a lowercase "failed to rename..." which mismatches tests
expecting "Failed to rename..."; update the fmt.Errorf call that sets errMsg to
use a capitalized "Failed to rename fields in processor: %w" so the error
message matches the assertions in rename_test.go (or alternatively update the
tests if you prefer changing expectations).

Eliminate event.Clone() in dissect (check-then-write), rename, copy,
replace, truncate, urldecode, extract_array, decode_csv, decode_base64,
decompress_gzip, and append processors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@strawgate strawgate force-pushed the perf/skip-unnecessary-clone branch from 6d5ad2d to 7edb9a7 Compare March 30, 2026 13:49
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
libbeat/processors/actions/rename.go (1)

79-80: ⚠️ Potential issue | 🔴 Critical

fail_on_error can still lose data in single-field rename

Line 79 skips backup for single-field non-overlapping renames, but Lines 118-127 still do delete-before-put. If PutValue fails (e.g., blocked destination path), source data is already deleted and cannot be restored despite fail_on_error: true.

Proposed fix
 func (f *renameFields) renameField(from string, to string, event *beat.Event) error {
@@
-	// Deletion must happen first to support cases where a becomes a.b
-	err = event.Delete(from)
-	if err != nil {
-		return fmt.Errorf("could not delete key: %s,  %w", from, err)
-	}
-
-	_, err = event.PutValue(to, value)
-	if err != nil {
-		return fmt.Errorf("could not put value: %s: %v, %w", to, value, err)
-	}
+	overlap := strings.HasPrefix(to, from+".") || strings.HasPrefix(from, to+".")
+	if !overlap {
+		if _, err = event.PutValue(to, value); err != nil {
+			return fmt.Errorf("could not put value: %s: %v, %w", to, value, err)
+		}
+		if err = event.Delete(from); err != nil {
+			_, _ = event.Delete(to) // best-effort local rollback
+			return fmt.Errorf("could not delete key: %s, %w", from, err)
+		}
+		return nil
+	}
+
+	// overlap case still requires delete-first
+	err = event.Delete(from)
+	if err != nil {
+		return fmt.Errorf("could not delete key: %s,  %w", from, err)
+	}
+	_, err = event.PutValue(to, value)
+	if err != nil {
+		return fmt.Errorf("could not put value: %s: %v, %w", to, value, err)
+	}
 	return nil
 }

Also applies to: 118-127, 140-152

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libbeat/processors/actions/rename.go` around lines 79 - 80, The current logic
in rename.go only creates a backup when renameNeedsClone(f.config) is true,
which skips backups for single-field non-overlapping renames even though later
code paths (uses of PutValue and delete-before-put in the blocks around where
PutValue is called) perform a delete-before-put; to fix, ensure that when
f.config.FailOnError is true you create a backup (e.g., call event.Clone into
backup) for all rename cases that perform delete-before-put (including the
single-field path), or alternatively change the operation order in the rename
implementation so you perform PutValue (destination write) before deleting the
source, and on failure restore from the backup; update the code paths referenced
by renameNeedsClone, the backup variable, and the PutValue/delete logic (the
blocks around PutValue and the delete-before-put sections) so fail_on_error
truly preserves source data.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@libbeat/processors/actions/rename.go`:
- Around line 79-80: The current logic in rename.go only creates a backup when
renameNeedsClone(f.config) is true, which skips backups for single-field
non-overlapping renames even though later code paths (uses of PutValue and
delete-before-put in the blocks around where PutValue is called) perform a
delete-before-put; to fix, ensure that when f.config.FailOnError is true you
create a backup (e.g., call event.Clone into backup) for all rename cases that
perform delete-before-put (including the single-field path), or alternatively
change the operation order in the rename implementation so you perform PutValue
(destination write) before deleting the source, and on failure restore from the
backup; update the code paths referenced by renameNeedsClone, the backup
variable, and the PutValue/delete logic (the blocks around PutValue and the
delete-before-put sections) so fail_on_error truly preserves source data.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 175d3224-5645-436b-9b15-696f89e8eb98

📥 Commits

Reviewing files that changed from the base of the PR and between 6d5ad2d and 7edb9a7.

📒 Files selected for processing (13)
  • changelog/fragments/1774840000-skip-unnecessary-event-clone.yaml
  • libbeat/processors/actions/append.go
  • libbeat/processors/actions/clone_skip_test.go
  • libbeat/processors/actions/copy_fields.go
  • libbeat/processors/actions/decode_base64_field.go
  • libbeat/processors/actions/decompress_gzip_field.go
  • libbeat/processors/actions/rename.go
  • libbeat/processors/actions/replace.go
  • libbeat/processors/actions/truncate_fields.go
  • libbeat/processors/decode_csv_fields/decode_csv_fields.go
  • libbeat/processors/dissect/processor.go
  • libbeat/processors/extract_array/extract_array.go
  • libbeat/processors/urldecode/urldecode.go
💤 Files with no reviewable changes (1)
  • libbeat/processors/actions/decode_base64_field.go
✅ Files skipped from review due to trivial changes (2)
  • changelog/fragments/1774840000-skip-unnecessary-event-clone.yaml
  • libbeat/processors/actions/replace.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • libbeat/processors/actions/truncate_fields.go
  • libbeat/processors/actions/append.go
  • libbeat/processors/extract_array/extract_array.go
  • libbeat/processors/actions/copy_fields.go
  • libbeat/processors/actions/decompress_gzip_field.go
  • libbeat/processors/dissect/processor.go

@github-actions
Copy link
Copy Markdown
Contributor

TL;DR

All 4 failed Buildkite jobs are the same deterministic test regression: error-message casing changed from "Failed ..." to "failed ...", breaking strict equality assertions in processor tests.

Remediation

  • Restore original message text casing in:
    • libbeat/processors/actions/decompress_gzip_field.go (Run, errMsg format string)
    • libbeat/processors/actions/rename.go (Run, errMsg format string)
  • Re-run: cd libbeat && go test ./processors/actions -run 'TestDecompressGzip|TestRenameRun' -count=1 then re-run the 4 Buildkite steps.
Investigation details

Root Cause

This is a code/test mismatch introduced by commit 7edb9a73d66fbf3ca6a397c36490992d8b97358e (perf: skip unnecessary event.Clone() in processors).

In that commit, these strings were changed to lowercase:

  • libbeat/processors/actions/decompress_gzip_field.go (Run):
    • now: "failed to decompress field in decompress_gzip_field processor: %w"
  • libbeat/processors/actions/rename.go (Run):
    • now: "failed to rename fields in processor: %w"

But tests still expect uppercase "Failed ...":

  • libbeat/processors/actions/decompress_gzip_field_test.go:106,143
  • libbeat/processors/actions/rename_test.go:99,198

Evidence

  • Build: https://buildkite.com/elastic/beats/builds/43227
  • Failed steps (all show same signature):
    • Libbeat Unit Tests
    • Libbeat Go Unit Tests (FIPS provider)
    • Libbeat fips140=only Unit Tests
    • Libbeat Go Integration Tests
  • Key log excerpt (/tmp/gh-aw/buildkite-logs/...unit-tests.txt):
    • expected: "Failed to decompress field ..."
    • actual: "failed to decompress field ..."
  • Rename failures (TestRenameRun/...) also fail after lowercase failed to rename fields ... is logged, consistent with expected uppercase strings in rename_test.go.

Verification

  • Ran locally: cd libbeat && go test ./processors/actions -run 'TestDecompressGzip|TestRenameRun' -count=1 (passes on current main, confirming tests are sensitive to message text and that the PR commit behavior diverges from expected strings in CI logs).

Follow-up

If lowercase messages are intentional, update the expected test outputs in decompress_gzip_field_test.go and rename_test.go accordingly. Otherwise, reverting message casing in processor code is the smallest fix.

Note

🔒 Integrity filtering filtered 3 items

Integrity filtering activated and filtered the following items during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

  • pr:perf: skip unnecessary event.Clone() in processors for single-field ops #49777 (pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
  • #49723 (search_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
  • #49777 (search_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)

What is this? | From workflow: PR Buildkite Detective

Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Mar 30, 2026
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Mar 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-skip Skip notification from the automated backport with mergify skip-changelog Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants