docs/metrics.md
+25 -7 (25 additions & 7 deletions)
@@ -28,10 +28,14 @@ However, you can enable also additional metrics by listing all the metrics you w
28
28
Default metrics
29
29
| Type | Name | Labels | Description |
30
30
| :--- | :---- | :---- | :---- |
31
-
| gauge | ovms_streams | name,version | Number of OpenVINO execution streams |
32
-
| gauge | ovms_current_requests | name,version | Number of requests being currently processed by the model server |
31
+
| gauge | ovms_streams | name,version | Number of OpenVINO execution streams. |
32
+
| gauge | ovms_current_requests | name,version | Number of requests being currently processed by the model server. |
33
+
| gauge | ovms_current_graphs | name | Number of MediaPipe graphs in process. |
33
34
| counter | ovms_requests_success | api,interface,method,name,version | Number of successful requests to a model or a DAG. |
34
35
| counter | ovms_requests_fail | api,interface,method,name,version | Number of failed requests to a model or a DAG. |
36
+
| counter | ovms_requests_accepted | api,interface,method,name | Number of accepted requests which ended up inserting packet(s) into a MediaPipe graph. |
37
+
| counter | ovms_requests_rejected | api,interface,method,name | Number of rejected requests, which failed at the MediaPipe packet creation step. |
38
+
| counter | ovms_responses | api,interface,method,name | Number of responses generated by the MediaPipe graph. |
35
39
| histogram | ovms_request_time_us | interface,name,version | Processing time of requests to a model or a DAG. |
36
40
| histogram | ovms_inference_time_us | name,version | Inference execution time in the OpenVINO backend. |
37
41
| histogram | ovms_wait_for_infer_req_time_us | name,version | Request waiting time in the scheduling queue. Indicates how long the request has to wait before required resources are assigned to it. |
@@ -47,11 +51,11 @@ Optional metrics
47
51
Labels description
48
52
| Name | Values | Description |
49
53
| :--- | :---- | :---- |
50
-
| api | KServe, TensorFlowServing | Name of the serving API. |
54
+
| api | KServe, TensorFlowServing, V3 | Name of the serving API. |
51
55
| interface | REST, gRPC | Name of the serving interface. |
| version | 1, 2, ..., n | Model version. Note that GetModelStatus and ModelReady and all MediaPipe servables do not have the version label. |
58
+
| name | As defined in model server config | Model name, DAG name or MediaPipe graph name. |
55
59
56
60
57
61
## Enable metrics
@@ -175,10 +179,14 @@ echo '{
175
179
"metrics_list":
176
180
[ "ovms_requests_success",
177
181
"ovms_requests_fail",
182
+
"ovms_requests_accepted",
183
+
"ovms_requests_rejected",
184
+
"ovms_responses",
178
185
"ovms_inference_time_us",
179
186
"ovms_wait_for_infer_req_time_us",
180
187
"ovms_request_time_us",
181
188
"ovms_current_requests",
189
+
"ovms_current_graphs",
182
190
"ovms_infer_req_active",
183
191
"ovms_streams",
184
192
"ovms_infer_req_queue_size"]
@@ -224,7 +232,17 @@ It means that each request to the DAG pipeline will update also the metrics for
224
232
225
233
## Metrics implementation for MediaPipe Graphs
226
234
227
-
For [MediaPipe Graphs](./mediapipe.md) metrics endpoint is not supported.
235
+
For [MediaPipe Graphs](./mediapipe.md) execution, there are 4 generic metrics that apply to all graphs:
236
+
237
+
| Type | Name | Description |
238
+
| :--- | :---- | :---- |
239
+
| counter | ovms_requests_accepted | Counts the requests that ended up pushing a MediaPipe packet down the graph stream, for example an image frame in vision use cases or an LLM prompt in text generation use cases. |
240
+
| counter | ovms_requests_rejected | Counts errors in the MediaPipe packet creation phase, for example a bad image format in vision use cases. Note that for the V3 API, the LLM request is validated at the graph node level, so packet creation always succeeds. Refer to the specific graph definition and implementation. |
241
+
| counter | ovms_responses | Tracks the number of packets generated by the MediaPipe graph. A single request may trigger production of multiple (or zero) packets, so tracking the number of responses is complementary to tracking accepted requests. One example is tracking the streaming partial responses of LLM text generation graphs. |
242
+
| gauge | ovms_current_graphs | Number of graphs currently in process. For unary communication, it equals the number of requests currently being processed (each request initializes a separate MediaPipe graph). For streaming communication, it equals the number of active client connections. Each connection can reuse its graph and delete it when the connection is closed. |
243
+
244
+
Exposing custom metrics in calculator implementations (MediaPipe graph nodes) is not supported yet.
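
Because a single accepted request may produce zero or many responses, the two counters are most informative when read together. The following is a minimal Python sketch, not part of the model server, that parses a metrics response in Prometheus text format and computes the average number of responses per accepted request for each graph. The sample payload is made up for illustration: the graph names (`llm_graph`, `vision_graph`), the `method` label values, and all counts are assumptions, and the simple regular expressions do not handle escaped quotes or braces inside label values.

```python
import re
from collections import defaultdict

# Hypothetical metrics payload in Prometheus text format.
# Graph names, "method" label values, and counts are illustrative assumptions.
SAMPLE = """\
# TYPE ovms_requests_accepted counter
ovms_requests_accepted{api="V3",interface="REST",method="Unary",name="llm_graph"} 4
ovms_requests_accepted{api="V3",interface="REST",method="Stream",name="llm_graph"} 6
# TYPE ovms_requests_rejected counter
ovms_requests_rejected{api="KServe",interface="gRPC",method="ModelInfer",name="vision_graph"} 2
# TYPE ovms_responses counter
ovms_responses{api="V3",interface="REST",method="Unary",name="llm_graph"} 4
ovms_responses{api="V3",interface="REST",method="Stream",name="llm_graph"} 96
# TYPE ovms_current_graphs gauge
ovms_current_graphs{name="llm_graph"} 3
"""

# Matches a labeled sample line: metric{label="value",...} value
LINE_RE = re.compile(
    r'^(?P<metric>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def totals_per_graph(text):
    """Sum every metric over all label combinations, grouped by the `name` label."""
    totals = defaultdict(lambda: defaultdict(float))
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip samples without labels
        labels = dict(LABEL_RE.findall(match.group("labels")))
        name = labels.get("name")
        if name is None:
            continue
        totals[name][match.group("metric")] += float(match.group("value"))
    return {graph: dict(metrics) for graph, metrics in totals.items()}


def responses_per_accepted(totals, graph):
    """Average responses per accepted request, or None if nothing was accepted."""
    metrics = totals.get(graph, {})
    accepted = metrics.get("ovms_requests_accepted", 0.0)
    if accepted == 0:
        return None
    return metrics.get("ovms_responses", 0.0) / accepted


if __name__ == "__main__":
    totals = totals_per_graph(SAMPLE)
    for graph in sorted(totals):
        print(graph, totals[graph], responses_per_accepted(totals, graph))
```

In this made-up sample, the streaming requests dominate the response count, which is the situation the `ovms_responses` description refers to. A ratio of `None` for a graph means no request was accepted for it in the sample, for example when every request was rejected at packet creation.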