
Commit a0ad921

HeeChanN authored and sobychacko committed
docs: Add more explanation in observability documentation about metrics
* Document base → Prometheus name mapping (gen_ai.client.operation → _seconds_count|sum|max, _active_count)
* Explain Active vs Completed (LongTaskTimer vs Timer)
* Add References to OTel/Micrometer/Prometheus specs

Auto-cherry-pick to 1.0.x

Fixes #4222

Signed-off-by: heechann <[email protected]>
1 parent edc1434 commit a0ad921


spring-ai-docs/src/main/antora/modules/ROOT/pages/observability/index.adoc

Lines changed: 158 additions & 0 deletions
@@ -363,3 +363,161 @@ Spring AI supports logging vector search response data, useful for troubleshooti
|===

WARNING: If you enable logging of the vector search response data, there's a risk of exposing sensitive or private information. Please, be careful!

== More Metrics Reference

This section documents the metrics emitted by Spring AI components as they appear in Prometheus.

=== Metric Naming Conventions

Spring AI uses Micrometer. Base metric names use dots (e.g., `gen_ai.client.operation`), which Prometheus exports with underscores and standard suffixes:

* **Timers** → `<base>_seconds_count`, `<base>_seconds_sum`, `<base>_seconds_max`, and (when supported) `<base>_active_count`
* **Counters** → `<base>_total` (monotonic)

[NOTE]
====
The following shows how base metric names expand to Prometheus time series.

[cols="2,3", options="header", stripes=even]
|===
| Base metric name | Exported time series
| `gen_ai.client.operation` |
`gen_ai_client_operation_seconds_count` +
`gen_ai_client_operation_seconds_sum` +
`gen_ai_client_operation_seconds_max` +
`gen_ai_client_operation_active_count`
| `db.vector.client.operation` |
`db_vector_client_operation_seconds_count` +
`db_vector_client_operation_seconds_sum` +
`db_vector_client_operation_seconds_max` +
`db_vector_client_operation_active_count`
|===
====
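
A quick way to see this expansion is to register a `Timer` under a dotted base name and print the Prometheus scrape output. The following is a minimal, self-contained sketch rather than Spring AI code; it assumes the `micrometer-registry-prometheus` dependency is on the classpath, and note that the registry classes live under `io.micrometer.prometheusmetrics` in Micrometer 1.13+.

[source,java]
----
import java.time.Duration;

import io.micrometer.core.instrument.Timer;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;

public class MetricNamingDemo {

    public static void main(String[] args) {
        PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

        // Dotted base name, as used in the table above.
        Timer timer = Timer.builder("gen_ai.client.operation")
                .description("example timer")
                .register(registry);

        timer.record(Duration.ofMillis(120));

        // The scrape output contains gen_ai_client_operation_seconds_count,
        // gen_ai_client_operation_seconds_sum and gen_ai_client_operation_seconds_max.
        System.out.println(registry.scrape());
    }
}
----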

==== References

* OpenTelemetry — https://opentelemetry.io/docs/specs/semconv/gen-ai/[Semantic Conventions for Generative AI (overview)]
* Micrometer — https://docs.micrometer.io/micrometer/reference/concepts/naming.html[Naming Meters]

=== Chat Client Metrics

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`gen_ai_chat_client_operation_seconds_sum`
|Timer
|seconds
|Total time spent in ChatClient operations (call/stream)

|`gen_ai_chat_client_operation_seconds_count`
|Counter
|count
|Number of completed ChatClient operations

|`gen_ai_chat_client_operation_seconds_max`
|Gauge
|seconds
|Maximum observed duration of ChatClient operations

|`gen_ai_chat_client_operation_active_count`
|Gauge
|count
|Number of ChatClient operations currently in flight
|===

*Active vs Completed*: `*_active_count` shows in-flight calls; the `_seconds_*` series reflect only completed calls.

=== Chat Model Metrics (Model provider execution)

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`gen_ai_client_operation_seconds_sum`
|Timer
|seconds
|Total time executing chat model operations

|`gen_ai_client_operation_seconds_count`
|Counter
|count
|Number of completed chat model operations

|`gen_ai_client_operation_seconds_max`
|Gauge
|seconds
|Maximum observed duration for chat model operations

|`gen_ai_client_operation_active_count`
|Gauge
|count
|Number of chat model operations currently in flight
|===

==== Token Usage

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`gen_ai_client_token_usage_total`
|Counter
|tokens
|Total tokens consumed, labeled by token type
|===

==== Labels

[cols="2,3", options="header", stripes=even]
|===
|Label | Meaning
|`gen_ai_token_type=input` | Prompt tokens sent to the model
|`gen_ai_token_type=output` | Completion tokens returned by the model
|`gen_ai_token_type=total` | Input + output
|===
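
The labels above correspond to Micrometer tags, so token counts can also be read directly from the `MeterRegistry`. This is a sketch only; it assumes the dotted base meter name `gen_ai.client.token.usage` and tag key `gen_ai.token.type`, i.e. the dotted forms behind the exported series and labels shown above.

[source,java]
----
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;

public class TokenUsageLookup {

    // Returns the number of prompt (input) tokens recorded so far, or 0 if the
    // counter has not been registered yet.
    // Assumed names: base meter "gen_ai.client.token.usage", tag key "gen_ai.token.type".
    public static double inputTokens(MeterRegistry registry) {
        Counter counter = registry.find("gen_ai.client.token.usage")
                .tag("gen_ai.token.type", "input")
                .counter();
        return counter != null ? counter.count() : 0.0;
    }
}
----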

=== Vector Store Metrics

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`db_vector_client_operation_seconds_sum`
|Timer
|seconds
|Total time spent in vector store operations (add/delete/query)

|`db_vector_client_operation_seconds_count`
|Counter
|count
|Number of completed vector store operations

|`db_vector_client_operation_seconds_max`
|Gauge
|seconds
|Maximum observed duration for vector store operations

|`db_vector_client_operation_active_count`
|Gauge
|count
|Number of vector store operations currently in flight
|===

==== Labels

[cols="2,3", options="header", stripes=even]
|===
|Label | Meaning
|`db_operation_name` | Operation type (`add`, `delete`, `query`)
|`db_system` | Vector DB/provider (`redis`, `chroma`, `pgvector`, …)
|`spring_ai_kind` | `vector_store`
|===
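
Because these labels are tags on the underlying timer, a lookup can be narrowed to a single operation type. A sketch under the same assumptions as above, using the dotted base meter name `db.vector.client.operation` and tag key `db.operation.name`:

[source,java]
----
import java.util.concurrent.TimeUnit;

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

public class VectorStoreLatency {

    // Average latency of completed "query" operations, in seconds, i.e. the
    // equivalent of _seconds_sum / _seconds_count restricted to db_operation_name="query".
    // Assumed names: base meter "db.vector.client.operation", tag key "db.operation.name".
    public static double averageQuerySeconds(MeterRegistry registry) {
        Timer timer = registry.find("db.vector.client.operation")
                .tag("db.operation.name", "query")
                .timer();
        return timer != null ? timer.mean(TimeUnit.SECONDS) : 0.0;
    }
}
----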

=== Understanding Active vs Completed

* **Active (`*_active_count`)** — instantaneous gauge of in-progress operations (concurrency/load).
* **Completed (`*_seconds_sum|count|max`)** — statistics for operations that have finished:
** `_seconds_sum / _seconds_count` → average latency
** `_seconds_max` → recent maximum, typically tracked over a sliding time window (exact behavior depends on the registry)
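
The same distinction is visible in the Micrometer API: completed-call statistics come from a `Timer`, while in-flight counts come from a `LongTaskTimer`. A minimal sketch, assuming the completed timer is registered as `gen_ai.client.operation` and that the in-flight series is backed by a `LongTaskTimer` registered as `gen_ai.client.operation.active` (the exact meter name behind `*_active_count` may differ):

[source,java]
----
import java.util.concurrent.TimeUnit;

import io.micrometer.core.instrument.LongTaskTimer;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

public class ActiveVsCompleted {

    public static void print(MeterRegistry registry) {
        // Completed calls: count, total time and max correspond to the
        // _seconds_count, _seconds_sum and _seconds_max series.
        Timer completed = registry.find("gen_ai.client.operation").timer();
        if (completed != null) {
            double avgSeconds = completed.totalTime(TimeUnit.SECONDS) / Math.max(completed.count(), 1);
            System.out.printf("completed=%d avg=%.3fs max=%.3fs%n",
                    completed.count(), avgSeconds, completed.max(TimeUnit.SECONDS));
        }

        // In-flight calls: a LongTaskTimer reports the instantaneous number of active tasks.
        LongTaskTimer active = registry.find("gen_ai.client.operation.active").longTaskTimer();
        if (active != null) {
            System.out.println("active=" + active.activeTasks());
        }
    }
}
----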
