
Bound Syslog/CEF message size to prevent unbounded memory growth #2452

Open
utpilla wants to merge 4 commits into open-telemetry:main from utpilla:utpilla/Add-Syslog-CEF-message-size-limit


@utpilla utpilla commented Mar 28, 2026

Change Summary

Introduces a 16 KiB per-message size limit (MAX_MESSAGE_SIZE) for the syslog CEF receiver to prevent unbounded memory growth from malicious or misbehaving clients.

Before vs After

| Aspect | Before | After |
| --- | --- | --- |
| TCP message size | Unbounded (memory exhaustion risk) | Capped at 16 KiB per message |
| UDP receive buffer | 1 KiB stack array (silent truncation) | 16 KiB heap Vec |
| Truncation visibility | None | received_logs_truncated metric |
| TCP line buffer allocation | Vec::new() (multiple early reallocs) | Vec::with_capacity(4096) |
| Oversized message handling | N/A | Parsed and forwarded as partial data |

Problem

  • TCP: read_until(b'\n') would keep appending bytes to a Vec until a newline appeared. A client sending data without newlines could exhaust memory.
  • UDP: The receive buffer was a 1 KiB stack array, too small for most real-world syslog messages and silently truncating them with no visibility.

Changes

  • Add read_line_bounded() helper that wraps AsyncReadExt::take() + AsyncBufReadExt::read_until() to cap TCP reads at MAX_MESSAGE_SIZE (16 KiB) per message. Returns a BoundedReadResult enum (Complete, Truncated, or Eof) so callers can distinguish outcomes.
  • Upgrade UDP receive buffer from 1 KiB stack array to 16 KiB heap Vec, matching the TCP limit.
  • Add received_logs_truncated metric to track messages that exceeded the size limit. For TCP, truncation is detected precisely. For UDP, it uses a best-effort heuristic (datagram fills buffer).
  • Pre-allocate TCP line buffer with Vec::with_capacity(4096) to avoid early reallocations for typical messages.
  • Add doc comments explaining cancellation safety: buf may contain partial data from select! cancellation and must not be cleared between calls unless discarding the current message.
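
The bounded read described in the list above can be sketched roughly as follows. This is a blocking, std-only analog and not the PR's code: the PR wraps tokio's AsyncReadExt::take() and AsyncBufReadExt::read_until(), while this sketch uses std::io::Read::take and std::io::BufRead::read_until so that it runs without an async runtime. The exact rules for choosing between Complete, Truncated, and Eof are assumptions made for illustration.

```rust
use std::io::{self, BufRead, Read};

/// Outcome of one bounded read (variant names taken from the PR description).
#[derive(Debug, PartialEq, Eq)]
enum BoundedReadResult {
    /// `buf` ends with a newline: a full message was read.
    Complete,
    /// `buf` reached `max_size` before a newline was seen.
    Truncated,
    /// The reader is exhausted; `buf` may still hold a partial message.
    Eof,
}

/// Appends to `buf` until a newline, the size cap, or end of input.
/// Assumption: bytes already in `buf` count against `max_size`, so an
/// earlier interrupted call cannot push the message past the cap.
fn read_line_bounded<R: BufRead>(
    reader: &mut R,
    buf: &mut Vec<u8>,
    max_size: usize,
) -> io::Result<BoundedReadResult> {
    let remaining = max_size.saturating_sub(buf.len());
    if remaining == 0 {
        return Ok(BoundedReadResult::Truncated);
    }
    // `take` limits how many bytes `read_until` may consume in this call.
    (&mut *reader)
        .take(remaining as u64)
        .read_until(b'\n', buf)?;
    if buf.last() == Some(&b'\n') {
        Ok(BoundedReadResult::Complete)
    } else if buf.len() >= max_size {
        Ok(BoundedReadResult::Truncated)
    } else {
        Ok(BoundedReadResult::Eof)
    }
}
```

A caller would typically clear buf after handling a Complete or Truncated result and keep it untouched otherwise, which matches the cancellation note in the last bullet.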

Design decisions

  • Truncated messages are still forwarded: Preserves partial data (syslog header, timestamp, severity) rather than dropping it. The tail of an oversized message is also processed as its own message.
  • Metric semantics: received_logs_total counts each bounded read as a separate message. An oversized message that splits into head + tail counts as 2.
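
To make the head + tail accounting concrete, here is a small std-only sketch that splits an in-memory byte stream into bounded records and counts them. The function split_bounded and the Counters struct are hypothetical names for illustration; only the two metric names come from the PR. The rule used to count truncation once per oversized message (rather than once per fragment) is an assumption about how that could be done, not a description of the PR's implementation.

```rust
use std::io::{BufRead, Cursor, Read};

/// Hypothetical stand-in for the receiver's metric set.
#[derive(Debug, Default, PartialEq, Eq)]
struct Counters {
    received_logs_total: u64,
    received_logs_truncated: u64,
}

/// Splits `input` into newline-delimited records of at most `max_size` bytes.
/// An oversized message yields a truncated head plus one or more tail records.
fn split_bounded(input: &[u8], max_size: usize) -> (Vec<Vec<u8>>, Counters) {
    let mut reader = Cursor::new(input);
    let mut records = Vec::new();
    let mut counters = Counters::default();
    // True while the previous record was a size-limited fragment.
    let mut in_oversized = false;
    loop {
        let mut line = Vec::new();
        let n = (&mut reader)
            .take(max_size as u64)
            .read_until(b'\n', &mut line)
            .expect("reading from an in-memory cursor does not fail");
        if n == 0 {
            break; // end of input
        }
        let hit_limit = line.last() != Some(&b'\n') && line.len() >= max_size;
        // Assumption: count truncation once per original message.
        if hit_limit && !in_oversized {
            counters.received_logs_truncated += 1;
        }
        in_oversized = hit_limit;
        // Every bounded read counts as its own received record.
        counters.received_logs_total += 1;
        records.push(line);
    }
    (records, counters)
}
```

With a 16-byte limit, a 20-byte line followed by a short line produces three records (head, tail, normal line), a total count of 3, and a truncation count of 1.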

What issue does this PR close?

Towards #1149

How are these changes tested?

  • TCP truncation integration test: Sends an oversized message followed by a normal message, verifies all 3 records (truncated head, tail, normal) are received.
  • 6 unit tests for read_line_bounded: Empty reader, complete line, truncation, exact-limit edge case, EOF with partial data, and multi-call behavior after truncation.
  • Telemetry test updates: Metric index adjustments to account for the new received_logs_truncated field.

Are there any user-facing changes?

No

@utpilla utpilla requested a review from a team as a code owner March 28, 2026 00:57
@github-actions github-actions bot added the rust Pull requests that update Rust code label Mar 28, 2026
codecov bot commented Mar 28, 2026

Codecov Report

❌ Patch coverage is 98.90710% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.40%. Comparing base (18fcc9e) to head (2da4083).

Additional details and impacted files
@@           Coverage Diff            @@
##             main    #2452    +/-   ##
========================================
  Coverage   88.40%   88.40%            
========================================
  Files         603      603            
  Lines      213389   213561   +172     
========================================
+ Hits       188639   188794   +155     
- Misses      24224    24241    +17     
  Partials      526      526            
| Components | Coverage Δ |
| --- | --- |
| otap-dataflow | 90.42% <98.90%> (-0.01%) ⬇️ |
| query_abstraction | 80.61% <ø> (ø) |
| query_engine | 90.74% <ø> (ø) |
| syslog_cef_receivers | ∅ <ø> (∅) |
| otel-arrow-go | 52.44% <ø> (ø) |
| quiver | 91.94% <ø> (ø) |

  let _ = peer_addr;

- let mut line_bytes = Vec::new();
+ let mut line_bytes = Vec::with_capacity(4096);
Member

Let's move this to a const?

Contributor Author

Fixed in 2da4083.


  let socket = effect_handler.udp_socket(udp_config.listening_addr)?;
- let mut buf = [0u8; 1024]; // ToDo: Find out the maximum allowed size for syslog messages
+ let mut buf = vec![0u8; MAX_MESSAGE_SIZE];
Member

Wondering why we would allocate for the worst possible scenario. Could we be more conservative and accept that the Vec will re-grow if we actually have such a big message? Maybe 1/2 or 1/4 of the MAX size for the initial size?

Contributor Author

The UDP receive buffer is fundamentally different from TCP's Vec. recv_from writes into a fixed &mut [u8] slice. It doesn't append or grow. If the buffer is smaller than the incoming datagram, the OS kernel silently discards the excess bytes with no way to recover them. So, we must allocate the full MAX_MESSAGE_SIZE upfront to avoid permanent data loss. This is a one-time 16 KiB allocation for the lifetime of the receiver (one per core).
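
A minimal sketch of the point above, using the blocking std::net::UdpSocket rather than the receiver's async socket. The helper names looks_truncated and recv_bounded are hypothetical, and the constant simply restates the 16 KiB limit from the PR description. The truncation check is the best-effort heuristic the description mentions: a datagram of exactly the buffer size is indistinguishable from a larger one that the kernel cut off.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Assumed to match the 16 KiB limit named in the PR description.
const MAX_MESSAGE_SIZE: usize = 16 * 1024;

/// Best-effort heuristic: a datagram that fills the whole buffer may have
/// been cut off. This can also flag a datagram of exactly `buf_len` bytes.
fn looks_truncated(received: usize, buf_len: usize) -> bool {
    received >= buf_len
}

/// Receives one datagram into a fixed-size slice. Unlike a `Vec` passed to
/// `read_until`, the slice never grows: bytes that do not fit are lost.
fn recv_bounded(socket: &UdpSocket, buf: &mut [u8]) -> io::Result<(usize, SocketAddr, bool)> {
    let (n, peer) = socket.recv_from(buf)?;
    Ok((n, peer, looks_truncated(n, buf.len())))
}

/// Allocates the receive buffer once, at full size, for reuse across calls.
fn new_receive_buffer() -> Vec<u8> {
    vec![0u8; MAX_MESSAGE_SIZE]
}
```

The buffer is created once and reused for every recv_bounded call, so the full-size allocation is a fixed cost rather than a per-message one.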

max_size: usize,
) -> std::io::Result<BoundedReadResult> {
let n = (&mut *reader)
.take(max_size as u64)
Member

I think this doesn’t fully enforce the size cap.

Because the read is inside tokio::select!, it can be interrupted if the batch timer fires before a newline arrives. When that happens, any bytes already appended to line_bytes stay in the buffer, and the next loop iteration starts reading the same message again.

I think read_line_bounded() should consider how much is already in buf and only allow reading the remaining space. If buf is already at the limit, it should return Truncated immediately.

Contributor Author

Great catch @lalitb! Thank you! Fixed in 35c5f1d.
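
The concern raised in this thread can be illustrated with a std-only sketch. It does not reproduce the actual fix in 35c5f1d, whose code is not shown here. Both function names are hypothetical. Bytes already sitting in buf stand in for what a cancelled select! branch would have left behind; the first variant hands every call a fresh max_size budget, the second only allows the space still left under the cap.

```rust
use std::io::{self, BufRead, Read};

/// Variant under review: each call may append up to `max_size` new bytes,
/// regardless of what `buf` already holds.
fn read_fresh_budget<R: BufRead>(
    reader: &mut R,
    buf: &mut Vec<u8>,
    max_size: usize,
) -> io::Result<usize> {
    (&mut *reader).take(max_size as u64).read_until(b'\n', buf)
}

/// Suggested variant: bytes already in `buf` count against the cap.
/// Returns 0 without reading when `buf` is already at the limit.
fn read_remaining_budget<R: BufRead>(
    reader: &mut R,
    buf: &mut Vec<u8>,
    max_size: usize,
) -> io::Result<usize> {
    let remaining = max_size.saturating_sub(buf.len());
    if remaining == 0 {
        return Ok(0);
    }
    (&mut *reader).take(remaining as u64).read_until(b'\n', buf)
}
```

Starting with 6 leftover bytes in buf and a limit of 8, the first variant can grow buf to 14 bytes on the next call, while the second stops at 8.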

// Handle incomplete line (log, emit metrics, etc.)
Ok(bounded_result) => {
if matches!(bounded_result, BoundedReadResult::Truncated) {
metrics.borrow_mut().received_logs_truncated.inc();
Member

@lalitb lalitb Mar 28, 2026

I see from the PR description that splitting an oversized TCP message into head + tail is intentional, but I’m not sure this is the right semantic for a single syslog message.

In this branch we parse and emit the truncated prefix, clear line_bytes, and then continue reading, which means the remaining bytes from the same original message get emitted as a second record. That makes one input message turn into multiple downstream records, and received_logs_total no longer maps cleanly to original messages.

Why do we want to preserve and emit the tail as a separate logical message instead of either keeping a single truncated record or discarding until the next newline after truncation?

Contributor Author

> Why do we want to preserve and emit the tail as a separate logical message instead of either keeping a single truncated record or discarding until the next newline after truncation?

The truncated head preserves the syslog header (timestamp, hostname, severity, facility) which is the most valuable structured metadata. The tail preserves part of the message body. Discarding either means losing data that could be useful for debugging.

The receiver's abstraction is "newline-delimited messages up to 16 KiB each." From its perspective, the head and tail are two separate protocol-level messages. The received_logs_truncated metric (incremented only once per oversized message, not per fragment) alerts operators that truncation is occurring, so they can investigate if needed.

Messages exceeding 16 KiB are extremely rare in practice. When one does occur, it's likely an unusual event worth investigating, which makes preserving partial data better than silently dropping it.
