-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[Inference API] Record re-routing attributes as part of inference request metrics #122350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
528d676
0c1a171
02de5c3
5fc9bfd
22bfdff
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -14,14 +14,14 @@ | |
| import org.elasticsearch.telemetry.metric.LongCounter; | ||
| import org.elasticsearch.telemetry.metric.LongHistogram; | ||
| import org.elasticsearch.telemetry.metric.MeterRegistry; | ||
| import org.elasticsearch.xpack.core.inference.action.BaseInferenceActionRequest; | ||
|
|
||
| import java.util.Map; | ||
| import java.util.Objects; | ||
| import java.util.stream.Collectors; | ||
| import java.util.stream.Stream; | ||
|
|
||
| import static java.util.Map.entry; | ||
| import static java.util.stream.Stream.concat; | ||
|
|
||
| public record InferenceStats(LongCounter requestCount, LongHistogram inferenceDuration) { | ||
|
|
||
|
|
@@ -45,49 +45,43 @@ public static InferenceStats create(MeterRegistry meterRegistry) { | |
| ); | ||
| } | ||
|
|
||
| public static Map<String, Object> modelAttributes(Model model) { | ||
| return toMap(modelAttributeEntries(model)); | ||
| private static Map<String, Object> toMap(Stream<Map.Entry<String, Object>> stream) { | ||
| return stream.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)); | ||
| } | ||
|
|
||
| private static Stream<Map.Entry<String, Object>> modelAttributeEntries(Model model) { | ||
| public static Map<String, Object> modelAttributes(Model model) { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Out of curiosity, why not just create a new I think the stream API is useful when we want to declaratively process a collection (filter, transformation, find one element etc), but in this case converting back and forth seems like an overhead. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It was originally used as a generator/builder to construct a map from multiple different objects in multiple different functions, but now that this change is moving away from that, we can probably just use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Makes sense to adapt it, I'll create a small follow-up PR tomorrow as the CI is green now and we want the metric attributes to appear pretty soon in EC Serverless and EC Hosted, so we can start on the integration tests. |
||
| var stream = Stream.<Map.Entry<String, Object>>builder() | ||
| .add(entry("service", model.getConfigurations().getService())) | ||
| .add(entry("task_type", model.getTaskType().toString())); | ||
| if (model.getServiceSettings().modelId() != null) { | ||
| stream.add(entry("model_id", model.getServiceSettings().modelId())); | ||
| } | ||
| return stream.build(); | ||
| return toMap(stream.build()); | ||
| } | ||
|
|
||
| private static Map<String, Object> toMap(Stream<Map.Entry<String, Object>> stream) { | ||
| return stream.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)); | ||
| public static Map<String, Object> routingAttributes(BaseInferenceActionRequest request, String nodeIdHandlingRequest) { | ||
| return Map.of("rerouted", request.hasBeenRerouted(), "node_id", nodeIdHandlingRequest); | ||
| } | ||
|
|
||
| public static Map<String, Object> responseAttributes(Model model, @Nullable Throwable t) { | ||
| return toMap(concat(modelAttributeEntries(model), errorAttributes(t))); | ||
| } | ||
|
|
||
| public static Map<String, Object> responseAttributes(UnparsedModel model, @Nullable Throwable t) { | ||
| public static Map<String, Object> modelAttributes(UnparsedModel model) { | ||
| var unknownModelAttributes = Stream.<Map.Entry<String, Object>>builder() | ||
| .add(entry("service", model.service())) | ||
| .add(entry("task_type", model.taskType().toString())) | ||
| .build(); | ||
|
|
||
| return toMap(concat(unknownModelAttributes, errorAttributes(t))); | ||
| return toMap(unknownModelAttributes); | ||
| } | ||
|
|
||
| public static Map<String, Object> responseAttributes(@Nullable Throwable t) { | ||
| return toMap(errorAttributes(t)); | ||
| } | ||
|
|
||
| private static Stream<Map.Entry<String, Object>> errorAttributes(@Nullable Throwable t) { | ||
| return switch (t) { | ||
| case null -> Stream.of(entry("status_code", 200)); | ||
| var stream = switch (t) { | ||
| case null -> Stream.<Map.Entry<String, Object>>of(entry("status_code", 200)); | ||
| case ElasticsearchStatusException ese -> Stream.<Map.Entry<String, Object>>builder() | ||
| .add(entry("status_code", ese.status().getStatus())) | ||
| .add(entry("error.type", String.valueOf(ese.status().getStatus()))) | ||
| .build(); | ||
| default -> Stream.of(entry("error.type", t.getClass().getSimpleName())); | ||
| default -> Stream.<Map.Entry<String, Object>>of(entry("error.type", t.getClass().getSimpleName())); | ||
| }; | ||
|
|
||
| return toMap(stream); | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m a bit concerned that the metric cardinality will grow extremely rapidly as you add the
node_id. I'm not 100% sure if this is a problem for Elasticsearch (overview cluster).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the attribute will have a high cardinality and not the metric, right?
I don't think that should be inherently a problem vs for example Prometheus, which creates a timeseries automatically for each unique metric/attribute pair. We'll only do (manual) aggregations on node id over a limited time window + the number of unique node ids in a (serverless) cluster shouldn't be very high for a timeframe of let's say 10 minutes - 1 day. I've also checked other Elasticsearch metrics and a lot of them include the node id as an attribute, so I guess, if it's not a problem for them it shouldn't be one for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summarizing a slack conversation:
In addition to the cardinality risk during search, there seems to be a risk for high cardinality when the metrics are pushed from the node, where each variation creates a new element that consumes capacity in an outbound queue. The queue has some max capacity and is flushed periodically.
We think (hope) the risk is relatively low, given that nodes have 1 id and the routing ids should have a handful of ids.