feat(consumer): Emit batch_write_bytes metric from RowBinary writer by onewland · Pull Request #7863 · getsentry/snuba

onewland · 2026-04-07T22:06:09Z

Summary

Adds the insertions.batch_write_bytes counter to the RowBinary ClickHouse writer, matching what the JSONEachRow writer already emits
This is Phase 0 of the byte-based batching rollout — we need to observe batch byte sizes on eap_items in s4s2/de (which use RowBinary) to pick a data-driven threshold for --max-batch-size-calculation=bytes

Refs EAP-460

Test plan

cargo check passes
Deploy to s4s2 and confirm insertions.batch_write_bytes appears in DataDog for RowBinary consumers
Use DataDog arithmetic (bytes/msgs) to approximate per-batch byte size distributions across all environments
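
The bytes/msgs arithmetic in the last step can be sketched as below. This is a minimal illustration, not Snuba or DataDog code: the helper name `avg_bytes_per_msg` is hypothetical, and it assumes both counters (`insertions.batch_write_bytes` and `insertions.batch_write_msgs`) are summed over the same time window. Dividing the two totals gives an average per message, so it only approximates the per-batch distribution.

```rust
/// Hypothetical helper: average bytes per message over one time window,
/// given the summed values of the two counters for that window.
fn avg_bytes_per_msg(batch_write_bytes: u64, batch_write_msgs: u64) -> Option<f64> {
    if batch_write_msgs == 0 {
        // No messages were written in this window, so the ratio is undefined.
        return None;
    }
    Some(batch_write_bytes as f64 / batch_write_msgs as f64)
}
```

Returning `Option` keeps an empty window from showing up as a division by zero or a misleading `0.0`.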

🤖 Generated with Claude Code

The RowBinary writer was missing the insertions.batch_write_bytes counter that the JSONEachRow writer already emits. Adding it here so we can observe batch byte sizes across all eap_items consumers (s4s2/de use RowBinary) as part of Phase 0 of the byte-based batching rollout.

Refs EAP-460

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

linear-code · 2026-04-07T22:06:13Z

EAP-460 Add max_batch_size_bytes to consumer

sentry · 2026-04-07T22:09:20Z

rust_snuba/src/strategies/clickhouse/row_binary_writer.rs

                }
            }

+            counter!("insertions.batch_write_bytes", num_bytes as i64);
            counter!("insertions.batch_write_msgs", batch_len as i64);
            empty_batch.record_message_latency();
            empty_batch.emit_item_type_metrics();


Bug: The insertions.batch_write_bytes metric for the RowBinary writer uses an in-memory size estimate (estimated_size()) instead of the actual serialized byte count, making it inconsistent with the JSONEachRow writer.
Severity: MEDIUM

Suggested Fix

The RowBinary writer should measure the size of the data after it has been serialized into the RowBinary format, not before. This can be achieved by getting the length of the serialized byte buffer, which will ensure the metric reflects the actual number of bytes written and is consistent with the JSONEachRow writer's implementation.
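
The difference between the two measurements can be sketched as follows. This is an illustration under stated assumptions, not the Snuba implementation: the `Row` type and every function name here are hypothetical, and `in_memory_estimate` is only a stand-in, since the actual `estimated_size()` may be computed differently. The encoding follows the documented RowBinary layout for two column types: `UInt64` as 8 little-endian bytes and `String` as an unsigned LEB128 length followed by the raw bytes.

```rust
use std::mem::size_of;

/// Hypothetical row type, for illustration only (not Snuba's schema).
struct Row {
    id: u64,
    name: String,
}

/// Append an unsigned LEB128 varint, which RowBinary uses for String lengths.
fn write_varint(buf: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            buf.push(byte);
            break;
        }
        // More bytes follow, so set the continuation bit.
        buf.push(byte | 0x80);
    }
}

/// Serialize rows in a RowBinary-like layout: UInt64 as 8 little-endian
/// bytes, String as a varint length followed by the raw bytes.
fn serialize_rows(rows: &[Row]) -> Vec<u8> {
    let mut buf = Vec::new();
    for row in rows {
        buf.extend_from_slice(&row.id.to_le_bytes());
        write_varint(&mut buf, row.name.len() as u64);
        buf.extend_from_slice(row.name.as_bytes());
    }
    buf
}

/// Stand-in for an in-memory estimate: struct size plus string capacity.
/// Assumption: the real `estimated_size()` may differ from this.
fn in_memory_estimate(rows: &[Row]) -> usize {
    rows.iter()
        .map(|r| size_of::<Row>() + r.name.capacity())
        .sum()
}

/// The value the review suggests reporting: length of the serialized buffer.
fn serialized_num_bytes(rows: &[Row]) -> usize {
    serialize_rows(rows).len()
}
```

For a single row with `id = 1` and `name = "abc"`, the serialized form is 12 bytes (8 for the integer, 1 for the length prefix, 3 for the string), while the in-memory stand-in also counts the `String` header and so comes out larger.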

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.

Location: rust_snuba/src/strategies/clickhouse/row_binary_writer.rs#L127-L133

Potential issue: The `insertions.batch_write_bytes` metric for the RowBinary writer is calculated using `estimated_size()`, which measures the in-memory size of the data structures. This is inconsistent with the JSONEachRow writer, which measures the actual byte length of the serialized data. Because the RowBinary format uses variable-length encoding, the in-memory size is not an accurate representation of the on-the-wire size. This discrepancy will result in inaccurate and non-comparable metric data between the two writers, which could lead to incorrect decisions about batching thresholds and undermines the goal of standardizing this metric.

onewland requested a review from a team as a code owner April 7, 2026 22:06

onewland marked this pull request as draft April 7, 2026 22:08

sentry bot reviewed Apr 7, 2026

onewland marked this pull request as ready for review April 7, 2026 22:20

kylemumma approved these changes Apr 7, 2026

onewland merged commit a5c76af into master Apr 7, 2026
46 of 47 checks passed

onewland deleted the oliver/batch-byte-size-metric branch April 7, 2026 23:15

onewland merged 1 commit into master from oliver/batch-byte-size-metric
