Add ModelRegistryMetadata to Cluster State #121106
Conversation
This commit integrates `MinimalServiceSettings` (introduced in elastic#120560) into the cluster state for all registered models in the `ModelRegistry`. These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations.

To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index. If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.
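As a rough illustration of the idea (the names `MinimalSettings`, `ModelRegistryMetadata`, and `isUpgraded` below are hypothetical stand-ins, not the actual classes from this PR), the cluster state carries a small immutable map from inference endpoint id to its minimal settings, plus a flag telling the master node whether it still needs to reload the settings from the inference index:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only: stand-in types, not the actual Elasticsearch classes.
public class ModelRegistryMetadataSketch {

    // Minimal settings kept per inference endpoint (e.g. task type and service name).
    record MinimalSettings(String taskType, String service) {}

    // Cluster-state metadata: an immutable map of endpoint id -> minimal settings,
    // plus a flag telling the master node whether an upgrade from the index is still needed.
    record ModelRegistryMetadata(Map<String, MinimalSettings> models, boolean isUpgraded) {

        // Synchronous lookup: no asynchronous call to the inference index required.
        Optional<MinimalSettings> minimalSettings(String inferenceId) {
            return Optional.ofNullable(models.get(inferenceId));
        }

        // If a mismatch with the inference index is detected at startup, the master node
        // reloads all model settings from the index and marks the metadata as upgraded.
        ModelRegistryMetadata upgrade(Map<String, MinimalSettings> settingsFromIndex) {
            return isUpgraded ? this : new ModelRegistryMetadata(Map.copyOf(settingsFromIndex), true);
        }
    }

    public static void main(String[] args) {
        var metadata = new ModelRegistryMetadata(
            Map.of("my-endpoint", new MinimalSettings("sparse_embedding", "elasticsearch")),
            true
        );
        System.out.println(metadata.minimalSettings("my-endpoint"));
    }
}
```

Keeping only the minimal settings in the cluster state keeps the metadata small while still letting consumers resolve the task type and service without an asynchronous index lookup.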
Hi @jimczi, I've created a changelog YAML for you.
Just thinking out loud while looking at the PR:
Do consumers have to access this information frequently?
The Elastic Inference Service makes an asynchronous authorization call when the node boots up to determine which default inference endpoints are enabled. Does that cause any issues with this solution? Basically it means we don't know immediately after the node boots up what all the default inference endpoints are. Here's where that call happens:
I think that's fine. That would mean that creating a new semantic_text field that depends on these models could fail before the default endpoints are added.
If it's accessed very infrequently, what's the benefit of moving it into the cluster state? Is it because the call to get the whole model is expensive?
Yeah, probably not super likely since for that to happen the
Oh, that's an interesting point I hadn't thought of. I think what you're saying is: what if we revoke access to a model after it was granted previously? Let me ping the EIS team on how we should handle that.
The call would happen on the master node when updating/creating a mapping, so we cannot block the thread to get the model from the index. Today we are lenient and get the model definition at a later stage, but now that we want to add options to set up the inner fields, we have to know the model early.
Ah I see.
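To make the constraint above concrete, here is a minimal sketch (hypothetical names, not the actual mapper code) of what the mapping-time check can look like once the minimal settings are readable synchronously from the cluster state:

```java
import java.util.Map;

// Hypothetical sketch of a mapping-time check on the master node; not the actual mapper code.
public class SemanticTextValidationSketch {

    record MinimalSettings(String taskType, String service) {}

    // With the minimal settings in the cluster state, validating (and configuring the inner
    // fields of) a semantic_text mapping becomes a synchronous lookup. Previously, the full
    // model config had to be fetched asynchronously from the inference index at a later stage.
    static MinimalSettings resolveInferenceId(String inferenceId, Map<String, MinimalSettings> clusterStateRegistry) {
        MinimalSettings settings = clusterStateRegistry.get(inferenceId);
        if (settings == null) {
            throw new IllegalArgumentException("unknown inference endpoint [" + inferenceId + "]");
        }
        return settings; // e.g. pick inner field types based on settings.taskType()
    }

    public static void main(String[] args) {
        var registry = Map.of("my-endpoint", new MinimalSettings("sparse_embedding", "elasticsearch"));
        System.out.println(resolveInferenceId("my-endpoint", registry));
    }
}
```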
The approach makes sense to me. I left a couple of questions about things that weren't immediately clear.
Everything looks great if you can remove the `return_minimal_config` parameter from the REST API, please.
LGTM
Exclude Default Inference Endpoints from Cluster State Storage (#125242)

When retrieving a default inference endpoint for the first time, the system automatically creates the endpoint. However, unlike the `put inference model` action, the `get` action does not redirect the request to the master node. Since #121106, we rely on the assumption that every model creation (`put model`) must run on the master node, as it modifies the cluster state. This assumption led to a bug where the `get` action tries to store default inference endpoints from a different node.

This change resolves the issue by preventing default inference endpoints from being added to the cluster state. These endpoints are not strictly needed there, as they are already reported by inference services upon startup.

**Note:** This bug did not prevent the default endpoints from being used, but it caused repeated attempts to store them in the index, resulting in logging errors on every usage.
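A minimal sketch of the fix described above (hypothetical names, not the actual registry code): models flagged as defaults are simply filtered out before the cluster-state metadata is updated, since the inference services report them again on startup:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the fix; not the actual registry code.
public class DefaultEndpointFilterSketch {

    record StoredModel(String inferenceId, boolean isDefault, String taskType) {}

    // Only user-created endpoints are written to the cluster-state metadata. Default
    // endpoints are skipped: the inference services report them again on startup, and
    // skipping them means the get action no longer has to reach the master node to store them.
    static Map<String, String> clusterStateEntries(List<StoredModel> models) {
        return models.stream()
            .filter(m -> m.isDefault() == false)
            .collect(Collectors.toMap(StoredModel::inferenceId, StoredModel::taskType));
    }

    public static void main(String[] args) {
        var models = List.of(
            new StoredModel(".default-elser", true, "sparse_embedding"),
            new StoredModel("my-endpoint", false, "text_embedding")
        );
        System.out.println(clusterStateEntries(models)); // {my-endpoint=text_embedding}
    }
}
```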