@@ -363,3 +363,161 @@ Spring AI supports logging vector search response data, useful for troubleshooti
|===

WARNING: If you enable logging of the vector search response data, there's a risk of exposing sensitive or private information. Please, be careful!

== More Metrics Reference

This section documents the metrics emitted by Spring AI components as they appear in Prometheus.

=== Metric Naming Conventions

Spring AI records its metrics through Micrometer. Base metric names use dots (e.g., `gen_ai.client.operation`); the Prometheus registry exports them with underscores and standard suffixes:

* **Timers** → `<base>_seconds_count`, `<base>_seconds_sum`, `<base>_seconds_max`, and (when supported) `<base>_active_count`
* **Counters** → `<base>_total` (monotonic)

[NOTE]
====
The following shows how base metric names expand to Prometheus time series.

[cols="2,3", options="header", stripes=even]
|===
| Base metric name | Exported time series
| `gen_ai.client.operation` |
`gen_ai_client_operation_seconds_count` +
`gen_ai_client_operation_seconds_sum` +
`gen_ai_client_operation_seconds_max` +
`gen_ai_client_operation_active_count`
| `db.vector.client.operation` |
`db_vector_client_operation_seconds_count` +
`db_vector_client_operation_seconds_sum` +
`db_vector_client_operation_seconds_max` +
`db_vector_client_operation_active_count`
|===
====
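
The expansion above is mechanical, so it can be sketched in a few lines of plain Java. This is only an illustration of the naming rule as described in this section, not Micrometer's actual naming code; the class `PrometheusNames` and its methods are hypothetical, and the base name `gen_ai.client.token.usage` is inferred from the exported `gen_ai_client_token_usage_total` series.

```java
import java.util.List;

// Hypothetical helper: mirrors the naming rule described in this section.
// It is NOT part of Micrometer or Spring AI.
final class PrometheusNames {

    // Dots in the base metric name become underscores.
    static String sanitize(String baseName) {
        return baseName.replace('.', '_');
    }

    // Series this section lists for a timer-backed observation.
    static List<String> timerSeries(String baseName) {
        String prefix = sanitize(baseName);
        return List.of(
                prefix + "_seconds_count",
                prefix + "_seconds_sum",
                prefix + "_seconds_max",
                prefix + "_active_count");
    }

    // Counters get the "_total" suffix.
    static String counterSeries(String baseName) {
        return sanitize(baseName) + "_total";
    }

    public static void main(String[] args) {
        timerSeries("gen_ai.client.operation").forEach(System.out::println);
        System.out.println(counterSeries("gen_ai.client.token.usage"));
    }
}
```

A helper like this can be handy when generating dashboard queries from the base names used in code, but the exported names on a running application's Prometheus endpoint remain the source of truth.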

==== References

* OpenTelemetry — https://opentelemetry.io/docs/specs/semconv/gen-ai/[Semantic Conventions for Generative AI (overview)]
* Micrometer — https://docs.micrometer.io/micrometer/reference/concepts/naming.html[Naming Meters]

=== Chat Client Metrics

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`gen_ai_chat_client_operation_seconds_sum`
|Timer
|seconds
|Total time spent in ChatClient operations (call/stream)

|`gen_ai_chat_client_operation_seconds_count`
|Counter
|count
|Number of completed ChatClient operations

|`gen_ai_chat_client_operation_seconds_max`
|Gauge
|seconds
|Maximum observed duration of ChatClient operations

|`gen_ai_chat_client_operation_active_count`
|Gauge
|count
|Number of ChatClient operations currently in flight
|===

*Active vs Completed*: `*_active_count` shows in-flight calls; the `_seconds_*` series reflect only completed calls.

=== Chat Model Metrics (Model provider execution)

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`gen_ai_client_operation_seconds_sum`
|Timer
|seconds
|Total time executing chat model operations

|`gen_ai_client_operation_seconds_count`
|Counter
|count
|Number of completed chat model operations

|`gen_ai_client_operation_seconds_max`
|Gauge
|seconds
|Maximum observed duration for chat model operations

|`gen_ai_client_operation_active_count`
|Gauge
|count
|Number of chat model operations currently in flight
|===

==== Token Usage

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`gen_ai_client_token_usage_total`
|Counter
|tokens
|Total tokens consumed, labeled by token type
|===

==== Labels

[cols="2,3", options="header", stripes=even]
|===
|Label | Meaning
|`gen_ai_token_type=input` | Prompt tokens sent to the model
|`gen_ai_token_type=output` | Completion tokens returned by the model
|`gen_ai_token_type=total` | Input + output
|===
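
To show how the `gen_ai_token_type` label splits the counter, here is a minimal sketch that sums `gen_ai_client_token_usage_total` samples per token type from Prometheus text-format output. The class `TokenUsageByType`, the sample values, and the extra `gen_ai_request_model` label in the sample are illustrative assumptions. The parser handles only simple `name{labels} value` lines without timestamps.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Spring AI.
final class TokenUsageByType {

    // Matches "gen_ai_client_token_usage_total{<labels>} <value>".
    private static final Pattern SAMPLE = Pattern.compile(
            "^gen_ai_client_token_usage_total\\{(.*)\\}\\s+(\\S+)$");

    // Extracts the token type from the label set.
    private static final Pattern TYPE = Pattern.compile(
            "gen_ai_token_type=\"([^\"]+)\"");

    // Sums all token-usage series, grouped by gen_ai_token_type.
    static Map<String, Double> sumByType(String scrape) {
        Map<String, Double> totals = new TreeMap<>();
        for (String line : scrape.split("\n")) {
            Matcher sample = SAMPLE.matcher(line.trim());
            if (!sample.matches()) {
                continue; // comments, other metrics, blank lines
            }
            Matcher type = TYPE.matcher(sample.group(1));
            if (!type.find()) {
                continue; // series without a token type label
            }
            double value = Double.parseDouble(sample.group(2));
            totals.merge(type.group(1), value, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data in Prometheus text format.
        String scrape = String.join("\n",
                "# TYPE gen_ai_client_token_usage_total counter",
                "gen_ai_client_token_usage_total{gen_ai_request_model=\"model-a\",gen_ai_token_type=\"input\"} 120.0",
                "gen_ai_client_token_usage_total{gen_ai_request_model=\"model-b\",gen_ai_token_type=\"input\"} 30.0",
                "gen_ai_client_token_usage_total{gen_ai_request_model=\"model-a\",gen_ai_token_type=\"output\"} 45.0");
        System.out.println(sumByType(scrape));
    }
}
```

Because `total` is reported as its own label value alongside `input` and `output`, summing the counter across all token types counts every token twice. Filter on the label first.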

=== Vector Store Metrics

[cols="2,2,1,3", stripes=even]
|===
|Metric Name | Type | Unit | Description

|`db_vector_client_operation_seconds_sum`
|Timer
|seconds
|Total time spent in vector store operations (add/delete/query)

|`db_vector_client_operation_seconds_count`
|Counter
|count
|Number of completed vector store operations

|`db_vector_client_operation_seconds_max`
|Gauge
|seconds
|Maximum observed duration for vector store operations

|`db_vector_client_operation_active_count`
|Gauge
|count
|Number of vector store operations currently in flight
|===

==== Labels

[cols="2,3", options="header", stripes=even]
|===
|Label | Meaning
|`db_operation_name` | Operation type (`add`, `delete`, `query`)
|`db_system` | Vector DB/provider (`redis`, `chroma`, `pgvector`, …)
|`spring_ai_kind` | `vector_store`
|===

=== Understanding Active vs Completed

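* **Active (`*_active_count`)**: instantaneous gauge of in-progress operations (concurrency/load).
* **Completed (`*_seconds_sum|count|max`)**: statistics for operations that have finished:
** `_seconds_sum / _seconds_count` → average latency over the lifetime of the process (compare two scrapes to get the average for a recent window)
** `_seconds_max` → maximum duration observed within Micrometer's rolling time window; the value decays as the window expires and is not reset by scrapes (subject to registry configuration)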
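
Since `_seconds_sum` and `_seconds_count` are cumulative, the average latency for a recent interval comes from the difference between two scrapes, not from the raw values. The following is a minimal sketch of that arithmetic; the `TimerSample` record and `LatencyFromScrapes` class are hypothetical, and the numbers are made up.

```java
// Hypothetical helper for illustration; not part of Spring AI or Micrometer.
final class LatencyFromScrapes {

    // One scrape of a timer: cumulative seconds and cumulative completed calls.
    record TimerSample(double secondsSum, double secondsCount) {}

    // Average duration of the operations completed between two scrapes.
    // Returns NaN when nothing completed (or the counters were reset).
    static double averageSeconds(TimerSample earlier, TimerSample later) {
        double completed = later.secondsCount() - earlier.secondsCount();
        if (completed <= 0) {
            return Double.NaN;
        }
        return (later.secondsSum() - earlier.secondsSum()) / completed;
    }

    public static void main(String[] args) {
        TimerSample earlier = new TimerSample(12.0, 40); // made-up values
        TimerSample later = new TimerSample(18.0, 60);
        // 6 seconds spread over 20 completed operations.
        System.out.println(averageSeconds(earlier, later));
    }
}
```

In-flight operations do not affect this result: they are counted in `*_active_count` and only contribute to the `_seconds_*` series once they finish.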