
Fix double newline in NDJSON bulk body when using RawEncoding #49557

Open
trilamsr wants to merge 5 commits into elastic:main from trilamsr:fix/bulk-raw-encoding-double-newline

@trilamsr trilamsr commented Mar 19, 2026

Proposed commit message

Fix double newline in NDJSON bulk body when using RawEncoding

When events are pre-encoded in the queue (via event_encoder.go), the encoded bytes include a trailing newline from the Marshal/AddRaw call. When the bulk body assembler later writes these bytes via AddRaw(RawEncoding{...}), it unconditionally appends another newline, producing an empty line (\n\n) in the NDJSON bulk body.

While Elasticsearch tolerates empty lines in bulk requests, ES-compatible endpoints like Axiom and OpenSearch reject them with:

400 Bad Request: invalid event at index 1: ReadObject: expect { or ,
or } or n, but found \u0000

Closes #49558

What does this PR do?

Checks whether RawEncoding bytes already end with a newline and skips the additional WriteByte('\n') / Write(nl) if so. Applied to both jsonEncoder and gzipEncoder paths.

Backward compatible: RawEncoding bytes without a trailing newline still get the newline appended.

How to test this locally

cd libbeat/esleg/eslegclient
go test -run "TestRawEncodingNoDoubleNewline|TestEncoderHeaders" -v

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective
  • I have added a changelog fragment

🤖 Generated with Claude Code

@trilamsr trilamsr requested review from a team as code owners March 19, 2026 07:15
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 19, 2026
@github-actions
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@cla-checker-service

cla-checker-service bot commented Mar 19, 2026

💚 CLA has been signed

@mergify
Contributor

mergify bot commented Mar 19, 2026

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fix/bulk-raw-encoding-double-newline upstream/fix/bulk-raw-encoding-double-newline
git merge upstream/main
git push upstream fix/bulk-raw-encoding-double-newline

@mergify
Contributor

mergify bot commented Mar 19, 2026

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @trilamsr? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

When events are pre-encoded in the queue (via event_encoder.go), the
encoded bytes include a trailing newline from the Marshal/AddRaw call.
When the bulk body assembler later writes these bytes via
AddRaw(RawEncoding{...}), it unconditionally appends another newline,
producing an empty line (\n\n) in the NDJSON bulk body.

While Elasticsearch tolerates empty lines in bulk requests,
ES-compatible endpoints like Axiom and OpenSearch reject them with:

  400 Bad Request: invalid event at index 1: ReadObject: expect { or ,
  or } or n, but found \u0000

The fix checks whether RawEncoding bytes already end with a newline
and skips the additional one if so. This preserves backward
compatibility: RawEncoding bytes without a trailing newline (e.g. from
json.Marshal) still get the newline appended as before.

Applied to both jsonEncoder and gzipEncoder paths.
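
For the compressed path, the same check can be placed in front of a `gzip.Writer`. The sketch below is an assumption-laden illustration: `addRawGzip` and `roundTrip` are hypothetical names, and the real gzipEncoder is not reproduced here. It compresses two documents and decompresses them again to inspect the resulting body.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

var nl = []byte("\n")

// addRawGzip writes raw to w and appends a newline only if raw does not
// already end with one. Hypothetical stand-in for the gzip encoder path.
func addRawGzip(w io.Writer, raw []byte) error {
	if _, err := w.Write(raw); err != nil {
		return err
	}
	if bytes.HasSuffix(raw, nl) {
		return nil // already terminated
	}
	_, err := w.Write(nl)
	return err
}

// roundTrip compresses docs with addRawGzip and returns the decompressed body.
func roundTrip(docs ...[]byte) ([]byte, error) {
	var compressed bytes.Buffer
	zw := gzip.NewWriter(&compressed)
	for _, d := range docs {
		if err := addRawGzip(zw, d); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	zr, err := gzip.NewReader(&compressed)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	body, err := roundTrip(
		[]byte(`{"index":{}}`),
		[]byte("{\"message\":\"hello\"}\n"),
	)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", body)
}
```

Checking the decompressed bytes rather than the compressed stream keeps the assertion independent of gzip framing details.
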
@trilamsr trilamsr force-pushed the fix/bulk-raw-encoding-double-newline branch from 138d525 to 1c08869 Compare March 19, 2026 07:17
@coderabbitai

coderabbitai bot commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 03270337-036a-4a5e-afb3-5a56357a374d

📥 Commits

Reviewing files that changed from the base of the PR and between 4b9ffc6 and f488e8d.

📒 Files selected for processing (2)
  • changelog/fragments/1773900000-fix-bulk-raw-encoding-double-newline.yaml
  • libbeat/esleg/eslegclient/enc_test.go
✅ Files skipped from review due to trivial changes (1)
  • changelog/fragments/1773900000-fix-bulk-raw-encoding-double-newline.yaml

📝 Walkthrough

Updates prevent double-newline insertion when appending pre-encoded NDJSON via RawEncoding. The encoders (jsonEncoder.AddRaw and gzipEncoder.AddRaw) now write the provided v.Encoding and, if it already ends with '\n', return without adding an extra newline; otherwise they append the standard terminator. A unit test (TestRawEncodingNoDoubleNewline) was added to verify bulk body composition for both newline-terminated and non-terminated raw documents.

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | All coding requirements from issue #49558 are met: double-newline bug fixed in both jsonEncoder and gzipEncoder [#49558], backward compatibility maintained for non-newline-terminated RawEncoding, comprehensive tests added. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing the double-newline bug: encoder logic fixes, corresponding tests, changelog entry, and minor refactoring (map[string]any) in existing test. |


@trilamsr trilamsr force-pushed the fix/bulk-raw-encoding-double-newline branch from 457c007 to abeb91b Compare March 19, 2026 16:46
@v1v v1v removed the request for review from a team March 24, 2026 07:25
@cmacknz cmacknz added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Mar 25, 2026
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Mar 25, 2026
@cmacknz cmacknz removed request for a team, blakerouse and michel-laterman March 25, 2026 20:26
@trilamsr
Author

Hey @jmikell821 @leehinman @VihasMakwana, do you have an update on when this can be merged and whether it will make the next release?

@leehinman
Contributor

@VihasMakwana I added testing for the gzip encoder. When you get a chance can you please take a look.

Labels

Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Filebeat 9.x bulk requests contain double newline, breaking ES-compatible endpoints

5 participants