
feat(metrics-util): Expose true count and sum on Drain for reservoir sampling #681

Open
michelhe wants to merge 3 commits into metrics-rs:main from michelhe:expose-unsampled-len

michelhe commented Mar 3, 2026

Summary

Expose two public methods on Drain that let downstream consumers recover accurate aggregate statistics when reservoir sampling drops values:

  • Drain::unsampled_len() — returns the total number of samples pushed into the reservoir, including those dropped by sampling. The field was already tracked internally; this just adds a public getter.
  • Drain::unsampled_sum() — returns the true sum of all pushed samples, including dropped ones. Atomically accumulated in Reservoir::push() using a fetch_update CAS loop on AtomicU64 (same pattern as GaugeFn::increment in metrics/src/atomics.rs). Reset in Drain::drop() alongside the existing count reset.
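For readers unfamiliar with the pattern mentioned for `unsampled_sum()`, here is a minimal sketch of accumulating a floating-point sum in an `AtomicU64` with a `fetch_update` CAS loop. It assumes the sum is an `f64` stored as raw bits; the `AtomicSum` type and its method names are hypothetical and are not the code in this PR.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical accumulator (not part of metrics-util): keeps an f64 running
/// sum as raw bits inside an AtomicU64 so it can be updated without a lock.
struct AtomicSum {
    bits: AtomicU64,
}

impl AtomicSum {
    fn new() -> Self {
        Self { bits: AtomicU64::new(0.0f64.to_bits()) }
    }

    /// Adds `value` to the sum. `fetch_update` re-runs the closure with the
    /// latest stored bits until its compare-exchange succeeds.
    fn add(&self, value: f64) {
        let _ = self.bits.fetch_update(Ordering::AcqRel, Ordering::Relaxed, |current| {
            Some((f64::from_bits(current) + value).to_bits())
        });
    }

    fn load(&self) -> f64 {
        f64::from_bits(self.bits.load(Ordering::Acquire))
    }

    /// Resets the sum to zero, as a drain would do once its values are consumed.
    fn reset(&self) {
        self.bits.store(0.0f64.to_bits(), Ordering::Release);
    }
}

/// Pushes the values 0..1000 from four threads and returns the accumulated sum.
fn sum_concurrently() -> f64 {
    let sum = AtomicSum::new();
    thread::scope(|scope| {
        for t in 0..4u32 {
            let sum = &sum;
            scope.spawn(move || {
                // Thread t handles the values t*250 .. (t+1)*250.
                for v in (t * 250)..((t + 1) * 250) {
                    sum.add(f64::from(v));
                }
            });
        }
    });
    sum.load()
}
```

The closure always returns `Some`, so `fetch_update` never reports failure and its result can be ignored. Because every value here is a small integer, the `f64` additions are exact regardless of the order in which threads interleave.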

Motivation

When the reservoir overflows, downstream consumers draining the samples see only the sampled subset. A consumer that naively counts or sums values in the drain loop gets results capped at reservoir capacity, not the true totals. For example, pushing the 1000 values 0 through 999 into a 16-slot reservoir means the drain yields only 16 samples, so a naive count gives 16 and a naive sum covers only those 16 retained values, while the true count is 1000 and the true sum is 499,500.
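The gap between naive and true totals can be reproduced with a toy, single-threaded reservoir. This is only an illustrative sketch using textbook reservoir sampling (Algorithm R) with a small xorshift generator; `ToyReservoir` and its fields are hypothetical and do not reflect how metrics-util implements its reservoir or `Drain`.

```rust
/// Toy reservoir for illustration only (hypothetical, not metrics-util's type).
struct ToyReservoir {
    samples: Vec<f64>,
    capacity: usize,
    /// Every value ever pushed, including ones dropped by sampling.
    unsampled_len: usize,
    /// Sum of every value ever pushed, including ones dropped by sampling.
    unsampled_sum: f64,
    rng_state: u64,
}

impl ToyReservoir {
    fn new(capacity: usize) -> Self {
        Self {
            samples: Vec::with_capacity(capacity),
            capacity,
            unsampled_len: 0,
            unsampled_sum: 0.0,
            rng_state: 0x9E37_79B9_7F4A_7C15, // any non-zero seed works
        }
    }

    /// Minimal xorshift64 generator so the sketch needs no external crates.
    fn next_random(&mut self) -> u64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        x
    }

    fn push(&mut self, value: f64) {
        // The true totals are updated for every push, sampled or not.
        self.unsampled_len += 1;
        self.unsampled_sum += value;

        if self.samples.len() < self.capacity {
            self.samples.push(value);
        } else {
            // Algorithm R: pick a slot in [0, pushes so far); keep the value
            // only if that slot falls inside the reservoir.
            let slot = (self.next_random() % self.unsampled_len as u64) as usize;
            if slot < self.capacity {
                self.samples[slot] = value;
            }
        }
    }
}

/// Returns (naive_count, naive_sum, true_count, true_sum) after pushing 0..1000.
fn naive_vs_true() -> (usize, f64, usize, f64) {
    let mut reservoir = ToyReservoir::new(16);
    for v in 0..1000u32 {
        reservoir.push(f64::from(v));
    }
    let naive_count = reservoir.samples.len();
    let naive_sum: f64 = reservoir.samples.iter().sum();
    (naive_count, naive_sum, reservoir.unsampled_len, reservoir.unsampled_sum)
}
```

Calling `naive_vs_true()` gives a naive count of 16 against a true count of 1000. The naive sum depends on which 16 values happened to be retained, while the tracked sum is always 499,500.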

Previously, the only way to recover the true count was drained_count / sample_rate, which is a lossy float round-trip. There was no way to recover the true sum at all.

Tests

Unit tests covering: under-capacity (no sampling), overflow (proving true values survive), reset-after-drain (A/B buffer reuse), and empty reservoir.

michelhe added 3 commits March 3, 2026 16:19
Expose the total number of samples pushed into the reservoir
(including those dropped by sampling) as a public method on
Drain. This allows consumers to get the exact total pushed
count without a lossy float round-trip through sample_rate.

Signed-off-by: Michel Heily <michelheily@gmail.com>
Signed-off-by: Michel Heily <michelheily@gmail.com>
…ampled_sum()

Signed-off-by: Michel Heily <michelheily@gmail.com>
@michelhe michelhe changed the title feat(metrics-util): Add unsampled_len() getter to Drain feat(metrics-util): Expose true count and sum on Drain for reservoir sampling Mar 3, 2026