Description
Is your feature request related to a problem? Please describe.
When the metrics-generator's registry.stale_duration expires for a series, the series is deleted from the internal registry and stops being emitted. However, no stale marker (the special NaN value 0x7ff0000000000002) is sent to the remote write target.
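As a small illustration of the value mentioned above, the following Go sketch builds the stale marker from its bit pattern and shows why it has to be detected by bits rather than with a generic NaN check. The helper names here are local to the example; Prometheus ships equivalent helpers in its `model/value` package, which a real implementation would more likely reuse.

```go
package main

import (
	"fmt"
	"math"
)

// staleNaNBits is the bit pattern of the stale marker quoted in this issue.
const staleNaNBits uint64 = 0x7ff0000000000002

// StaleNaN returns the stale marker as a float64 sample value.
func StaleNaN() float64 {
	return math.Float64frombits(staleNaNBits)
}

// IsStaleNaN reports whether v is exactly the stale marker.
// math.IsNaN alone is not enough: an ordinary NaN has a different
// bit pattern and must not be treated as "series ended".
func IsStaleNaN(v float64) bool {
	return math.Float64bits(v) == staleNaNBits
}

func main() {
	fmt.Println(IsStaleNaN(StaleNaN())) // true
	fmt.Println(IsStaleNaN(math.NaN())) // false: regular NaN
	fmt.Println(math.IsNaN(StaleNaN())) // true: the marker is still a NaN
}
```

The marker travels as a normal sample value, so no protocol change is needed to send it; only the value of the final sample differs.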
Without a stale marker, Prometheus has no way to know the series is gone. It continues returning the last written sample for up to 5 minutes (the default lookback delta), producing misleading query results. For example, if a service processes one request and then stops, traces_spanmetrics_calls_total{service="my-service"} continues to return 1 for 5 minutes after the last span was received.
This violates the Prometheus Remote-Write 1.0 specification, which states:
Senders MUST send stale markers when a time series will no longer be appended to.
Using rate() mitigates the issue for counters, but gauges and instant queries remain affected. Dashboards and alerting rules that rely on the presence/absence of a series (e.g., absent(), up-style checks) are also impacted.
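To make the effect on instant queries concrete, here is a deliberately simplified Go model of how a lookback window resolves a value. It is not Prometheus code: `Sample`, `staleMarker`, and `instantValue` are hypothetical names for this sketch, and the real engine handles window boundaries and many other details differently.

```go
package main

import (
	"fmt"
	"math"
)

const staleNaNBits uint64 = 0x7ff0000000000002

// Sample is one stored point; TsMs is a millisecond timestamp.
type Sample struct {
	TsMs  int64
	Value float64
}

// staleMarker returns the stale NaN as a sample value.
func staleMarker() float64 { return math.Float64frombits(staleNaNBits) }

// instantValue picks the newest sample at or before queryMs.
// It returns no result if that sample is older than the lookback
// window or if it is a stale marker. samples must be sorted by TsMs.
func instantValue(samples []Sample, queryMs, lookbackMs int64) (float64, bool) {
	for i := len(samples) - 1; i >= 0; i-- {
		s := samples[i]
		if s.TsMs > queryMs {
			continue // written after the query time
		}
		if queryMs-s.TsMs > lookbackMs {
			return 0, false // too old: outside the lookback window
		}
		if math.Float64bits(s.Value) == staleNaNBits {
			return 0, false // series was explicitly ended
		}
		return s.Value, true
	}
	return 0, false
}

func main() {
	const lookbackMs = 5 * 60 * 1000 // 5-minute lookback

	noMarker := []Sample{{TsMs: 0, Value: 1}}
	withMarker := []Sample{{TsMs: 0, Value: 1}, {TsMs: 3000, Value: staleMarker()}}

	// Query one minute after the last real sample.
	v, ok := instantValue(noMarker, 60_000, lookbackMs)
	fmt.Println(v, ok) // 1 true: the old value is still returned
	v, ok = instantValue(withMarker, 60_000, lookbackMs)
	fmt.Println(v, ok) // 0 false: the series is treated as gone
}
```

In this model the series without a marker only disappears once the query time is more than five minutes past the last sample, which matches the behavior reported above.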
Environment:
- Tempo 2.9.0, single-binary mode on Kubernetes (Helm chart 1.24.4)
- Prometheus via kube-prometheus-stack, remote write receiver enabled
- registry.collection_interval: 1s, registry.stale_duration: 3s
Describe the solution you'd like
When a series is deleted from the registry after stale_duration, the metrics-generator should emit one final remote write request containing a stale marker for that series. This would allow Prometheus (and any remote-write-compatible TSDB) to immediately mark the series as stale, matching the behavior of scrape-based ingestion.
The implementation would likely be in the registry's collection loop (modules/generator/registry), at the point where stale series are pruned. Before removing a series, write a sample with the stale NaN value and include it in the next remote write batch.
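A minimal sketch of that pruning step, under stated assumptions: `Registry`, `entry`, `Sample`, `Observe`, and `Collect` are hypothetical stand-ins written for this issue, not the actual types in modules/generator/registry, and timestamps are plain milliseconds to keep the example self-contained.

```go
package main

import (
	"fmt"
	"math"
)

const staleNaNBits uint64 = 0x7ff0000000000002

// Sample is a hypothetical stand-in for one remote-write sample.
type Sample struct {
	Series string
	TsMs   int64
	Value  float64
}

// entry is a hypothetical registry record for one active series.
type entry struct {
	value       float64
	lastUpdated int64 // milliseconds
}

// Registry is a toy model of the generator registry.
type Registry struct {
	staleAfterMs int64
	active       map[string]*entry
}

func NewRegistry(staleAfterMs int64) *Registry {
	return &Registry{staleAfterMs: staleAfterMs, active: map[string]*entry{}}
}

// Observe records a new value for a series.
func (r *Registry) Observe(name string, v float64, nowMs int64) {
	r.active[name] = &entry{value: v, lastUpdated: nowMs}
}

// IsStaleNaN reports whether v is the stale marker.
func IsStaleNaN(v float64) bool { return math.Float64bits(v) == staleNaNBits }

// Collect builds the batch for one collection cycle. A series that has
// exceeded staleAfterMs contributes one final stale marker and is then
// removed, so the marker is sent exactly once.
func (r *Registry) Collect(nowMs int64) []Sample {
	out := make([]Sample, 0, len(r.active))
	for name, e := range r.active {
		if nowMs-e.lastUpdated > r.staleAfterMs {
			out = append(out, Sample{name, nowMs, math.Float64frombits(staleNaNBits)})
			delete(r.active, name) // deleting during range is safe in Go
			continue
		}
		out = append(out, Sample{name, nowMs, e.value})
	}
	return out
}

func main() {
	r := NewRegistry(3000)
	r.Observe("calls_total{service=\"a\"}", 1, 0)
	r.Observe("calls_total{service=\"b\"}", 5, 4000)

	for _, s := range r.Collect(5000) {
		fmt.Println(s.Series, IsStaleNaN(s.Value))
	}
	fmt.Println(len(r.Collect(6000))) // only the live series remains
}
```

Stamping the marker with the collection time is one possible choice; whether the real implementation should use that or a timestamp derived from the last update is an open design question for the maintainers.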
Describe alternatives you've considered
- Relying on rate()/increase() for all queries: works for counters but not for gauges or presence-based alerting.
- Shorter stale_duration: reduces the window but doesn't eliminate it; Prometheus still shows the last value for up to 5 minutes.
- Reducing the Prometheus lookback delta: not configurable per series; changing it globally affects all queries and can break scrape-based series with irregular intervals.
Additional context
- Related: #1303 (metrics-generator production readiness) mentions stale series cleanup but focuses on registry memory, not remote write behavior.
- The Prometheus remote write spec explicitly requires stale markers: https://prometheus.io/docs/specs/prw/remote_write_spec/#stale-markers
- Happy to contribute an implementation if the maintainers agree with the approach.