Native Histograms #1831

@Elvis339

Description

Classic Prometheus histograms require pre-defined fixed bucket boundaries (e.g. [0.005, 0.01, 0.025, 0.05, ...]). This has fundamental drawbacks:

  • Bucket boundaries must be guessed upfront, before you know your latency distribution
  • Wrong buckets produce poor quantile resolution or wasted cardinality
  • Every bucket is a separate time series, making high-cardinality histograms expensive
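To put a rough number on the last point: a classic histogram exposes one `_bucket` series per finite boundary, one more for `le="+Inf"`, plus `_sum` and `_count`, and all of that is multiplied by the number of label combinations. The sketch below only does this arithmetic; the function name and the figures (11 boundaries, 50 label combinations) are hypothetical and not taken from our metrics.

```python
def classic_series_count(num_boundaries: int, label_combinations: int = 1) -> int:
    """Estimate the time series produced by one classic histogram.

    num_boundaries counts the finite bucket boundaries only; the
    implicit le="+Inf" bucket is added here.
    """
    bucket_series = num_boundaries + 1  # finite buckets + le="+Inf"
    extra_series = 2                    # _sum and _count
    return (bucket_series + extra_series) * label_combinations


# Hypothetical example: 11 finite boundaries, 50 label combinations.
print(classic_series_count(11))      # series for a single label combination
print(classic_series_count(11, 50))  # series across all label combinations
```

Adding boundaries to improve resolution grows this count linearly, which is why finer classic buckets get expensive quickly.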

Native histograms solve this by using an exponential bucket schema in which the upper boundary of bucket i is computed automatically as 2^(i / 2^schema). Key properties:

  • No pre-defined buckets: the schema determines resolution, not you
  • Only buckets with observations are stored, so sparse distributions are cheap
  • The entire histogram is a single time series regardless of resolution
  • Quantile accuracy improves because buckets self-adapt to where data actually lands
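A minimal sketch of how the formula above maps observations to buckets, assuming bucket i covers the range (2^((i-1) / 2^schema), 2^(i / 2^schema)] and looking only at positive observations. The function names are made up for illustration, and the `log2`-based index is a simplification: real client libraries handle zero, negative values, and floating-point edge cases near boundaries more carefully.

```python
import math
from collections import Counter


def bucket_upper_bound(index: int, schema: int) -> float:
    """Upper boundary of bucket `index`: 2^(index / 2^schema)."""
    return 2.0 ** (index / 2**schema)


def bucket_index(value: float, schema: int) -> int:
    """Index of the bucket whose (lower, upper] range contains `value`."""
    if value <= 0:
        raise ValueError("this sketch covers positive observations only")
    # Inverting upper = 2^(i / 2^schema) gives i = ceil(log2(value) * 2^schema).
    return math.ceil(math.log2(value) * 2**schema)


def observe_all(values, schema: int) -> Counter:
    """Sparse representation: only buckets that received observations exist."""
    return Counter(bucket_index(v, schema) for v in values)


# Hypothetical latencies in seconds, schema 3 (8 buckets per power of two).
counts = observe_all([0.1, 0.1, 0.11, 5.0], schema=3)
for idx in sorted(counts):
    lower = bucket_upper_bound(idx - 1, 3)
    upper = bucket_upper_bound(idx, 3)
    print(f"bucket {idx}: ({lower:.4f}, {upper:.4f}] count={counts[idx]}")
```

A higher schema yields narrower buckets, and only the populated indices are ever stored, which is what keeps sparse distributions cheap.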

This means the bucket-tuning work originally scoped in #1736 becomes largely obsolete: we don't need to figure out the right bucket distribution if we're not defining buckets at all.

Current State

The histograms as currently defined have several problems that need to be addressed regardless of native histogram support:

  1. Broken naming
  2. Redundant manual counters
  3. Shared bucket distribution - all three histograms share one bucket config that wasn't chosen with any specific latency profile in mind

See #1736 for details.

Proposed Plan

  1. Histogram metrics followup #1736
  2. Coordination with SREs
  3. Switch to native histograms - replace classic histograms

Point 3 will be broken down into separate PRs, each with a more in-depth explanation.

Labels

  • Research Project: Proposed ideas that need further research before formalizing a design.
  • observability: logging, tracing and metrics