Commit d3fdcc8

apm-server: Add missing TBS monitoring metrics docs (#3912)
Add `apm-server.sampling.tail.events.failed_writes`, `apm-server.sampling.tail.events.sampled`, `apm-server.sampling.tail.events.head_unsampled`. Corresponds to elastic/apm-server#14247
1 parent bae19bd commit d3fdcc8

1 file changed: +24 -4 lines changed

solutions/observability/apm/apm-server/tail-based-sampling.md

Lines changed: 24 additions & 4 deletions
@@ -160,25 +160,45 @@ APM Server produces metrics to monitor the performance and estimate the workload

 This metric tracks the number of dynamic services that the tail-based sampler is tracking per policy. Dynamic services are created for tail-based sampling policies that are defined without a `service.name`.

-This is a counter metric so, should be visualized with `counter_rate`.
+This is a counter metric, so it should be visualized with `counter_rate`.

 ### `apm-server.sampling.tail.events.processed` [sampling-tail-monitoring-events-processed-ref]

 This metric tracks the total number of events (including both transaction and span) processed by the tail-based sampler.

-This is a counter metric so, should be visualized with `counter_rate`.
+This is a counter metric, so it should be visualized with `counter_rate`.

 ### `apm-server.sampling.tail.events.stored` [sampling-tail-monitoring-events-stored-ref]

 This metric tracks the total number of events stored by the tail-based sampler in the database. Events are stored when the full trace is not yet available to make the sampling decision. This value is directly proportional to the storage required by the tail-based sampler to function.

-This is a counter metric so, should be visualized with `counter_rate`.
+This is a counter metric, so it should be visualized with `counter_rate`.

 ### `apm-server.sampling.tail.events.dropped` [sampling-tail-monitoring-events-dropped-ref]

 This metric tracks the total number of events dropped by the tail-based sampler. Only the events that are actually dropped by the tail-based sampler are reported as dropped. Additionally, any events that were stored by the processor but never indexed will not be counted by this metric.

-This is a counter metric so, should be visualized with `counter_rate`.
+This is a counter metric, so it should be visualized with `counter_rate`.
+
+### `apm-server.sampling.tail.events.failed_writes` [sampling-tail-monitoring-events-failed-writes-ref]
+
+This metric tracks the total number of events that failed to be written to the tail-based sampling storage. Failed writes typically occur when the storage limit is reached or when there are issues with the local sampling database.
+
+The value of this metric should be 0 if tail-based sampling is functioning properly. If it is consistently increasing, check for misconfigured [storage limit](#sampling-tail-storage_limit-ref).
+
+This is a counter metric, so it should be visualized with `counter_rate`.
+
+### `apm-server.sampling.tail.events.sampled` [sampling-tail-monitoring-events-sampled-ref]
+
+This metric tracks the total number of events that were sampled (kept) by the tail-based sampler after applying the configured policies and were selected for indexing. This includes all events that belong to traces that matched tail-based sampling policies.
+
+This is a counter metric, so it should be visualized with `counter_rate`.
+
+### `apm-server.sampling.tail.events.head_unsampled` [sampling-tail-monitoring-events-head-unsampled-ref]
+
+This metric tracks the total number of events that were already unsampled by head-based sampling before reaching the tail-based sampler. These events are processed by the tail-based sampler but are not stored or indexed because they were already filtered out by head-based sampling decisions.
+
+This is a counter metric, so it should be visualized with `counter_rate`.

 ### `apm-server.sampling.tail.storage.lsm_size` [sampling-tail-monitoring-storage-lsm-size-ref]

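The new sections all say the metric is a counter and should be visualized with `counter_rate`. As a rough illustration of what a counter-rate style view computes, here is a minimal Python sketch. It is not the APM Server or Kibana implementation: the function name `counter_rate`, the sample data, and the reset handling (a drop in the counter is treated as a restart from zero) are assumptions made for this example.

```python
from typing import List, Tuple


def counter_rate(samples: List[Tuple[float, int]]) -> List[Tuple[float, float]]:
    """Per-second rate between consecutive (timestamp_seconds, counter_value) samples.

    Assumption: a decrease in the counter means it was reset (for example,
    after a server restart), so the new value itself is taken as the increase.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 <= t0:
            continue  # skip out-of-order or duplicate timestamps
        delta = v1 - v0 if v1 >= v0 else v1
        rates.append((t1, delta / (t1 - t0)))
    return rates


# Hypothetical samples of apm-server.sampling.tail.events.failed_writes,
# taken every 10 seconds; the last sample follows an assumed counter reset.
failed_writes = [(0.0, 0), (10.0, 0), (20.0, 50), (30.0, 5)]
rates = counter_rate(failed_writes)

# A non-zero rate for failed_writes is the signal the docs say to investigate.
needs_attention = any(rate > 0 for _, rate in rates)
```

Plotting the rate rather than the raw counter makes a steadily increasing `failed_writes` value show up as a sustained non-zero line, which is the condition the added text says should prompt a check of the storage limit.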
