-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[Inference API] Propagate product use case http header to EIS #124025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Inference API] Propagate product use case http header to EIS #124025
Conversation
…ase-header-to-eis' into read-and-propagate-product-use-case-header-to-eis
…ase-header-to-eis' into read-and-propagate-product-use-case-header-to-eis
|
Hi @timgrein, I've created a changelog YAML for you. |
jonathan-buttner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! Left a couple comments.
| this(in.readString()); | ||
| } | ||
|
|
||
| public static InferenceContext empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we create a static instance that way we don't create multiple empty ones? Something like this:
public static final InferenceContext EMPTY_INSTANCE = new InferenceContext("");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Adjusted with Replace InferenceContext.empty() with InferenceContext.EMPTY_INSTANCE
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/InferenceContext.java
Outdated
Show resolved
Hide resolved
| public static final TransportVersion INCLUDE_INDEX_MODE_IN_GET_DATA_STREAM = def(9_023_0_00); | ||
| public static final TransportVersion MAX_OPERATION_SIZE_REJECTIONS_ADDED = def(9_024_0_00); | ||
| public static final TransportVersion RETRY_ILM_ASYNC_ACTION_REQUIRE_ERROR = def(9_025_0_00); | ||
| public static final TransportVersion INFERENCE_CONTEXT = def(9_026_0_00); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a reminder, if we do want to backport to 8.19 we'll need a TransportVersion for 8.x
for example: COHERE_BIT_EMBEDDING_TYPE_SUPPORT_ADDED_BACKPORT_8_X
We'll also need to change the onAfter() check. Here's an example:
https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/cohere/embeddings/CohereEmbeddingType.java#L131-L132
The code in 8.x will look different too (since the 9.x transport version won't exist): https://github.com/elastic/elasticsearch/blob/8.x/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/cohere/embeddings/CohereEmbeddingType.java#L131
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation and the code examples.
Adjusted with Add TransportVersion for 8_X.
In the backport I would then need to replace TransportVersions.INFERENCE_CONTEXT with TransportVersions.INFERENCE_CONTEXT_8_X, right?
|
|
||
| var context = request.getContext(); | ||
| if (Objects.nonNull(context)) { | ||
| threadPool.getThreadContext().putHeader(InferencePlugin.X_ELASTIC_PRODUCT_USE_CASE_HTTP_HEADER, context.productUseCase()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, another option would be to pass this through the infer calls. That'll probably change a ton of files though 🤔 . It does seem a bit strange to parse it out of the header and then put it back in a header when we already have it hmm.
Drilling through a bunch of function calls isn't great either though.
@prwhelan @dan-rubinstein @davidkyle what do you think?
I wonder if we should create a context/components object that is like a catch all for these types of changes. That way in the future we just add it to that class's definition and we don't have to drill it through a ton of places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer that, though I vote we do that in a separate change since this one is already quite large
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, another option would be to pass this through the infer calls. That'll probably change a ton of files though 🤔 .
Yeah, that was basically my reasoning I've put in the PR comment "I was hesitant to pass the InferenceContext through the base methods (doInfer, doUnifiedCompletionInfer etc.) as this would imply changes in all integrations making this PR even larger as it already is, especially considering what it does (passing one value)". But nevertheless I also think it's cleaner to pass it through the methods, as it's then obvious from the signature that there's a context object.
I'd prefer that, though I vote we do that in a separate change since this one is already quite large
I would also prefer this and keep it as is for now 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also prefer this and keep it as is for now 👍
Sounds good Tim!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually could you add a TODO above the line to as a reminder for us to move it to being passed through the various method calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes of course! Added with Add TODO to remove temporary product use case propagation
| } | ||
|
|
||
| // We always get the first value as the header doesn't allow multiple values | ||
| return productUseCaseHeaders.getFirst(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do backport this, it's going to complain about getFirst not being a call in 8.19. Might be worth just leave it as get(0) to avoid all that lol (I've run into it many times from experience).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch 🎣 I've also ran into this in the past, adjusted with Use .get(0) instead of getFirst to avoid compilation errors in backport.
…ase-header-to-eis' into read-and-propagate-product-use-case-header-to-eis
…inference/InferenceContext.java Co-authored-by: Jonathan Buttner <[email protected]>
jonathan-buttner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! If you could add a TODO for passing the inference context around that'd be great!
|
|
||
| var context = request.getContext(); | ||
| if (Objects.nonNull(context)) { | ||
| threadPool.getThreadContext().putHeader(InferencePlugin.X_ELASTIC_PRODUCT_USE_CASE_HTTP_HEADER, context.productUseCase()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually could you add a TODO above the line to as a reminder for us to move it to being passed through the various method calls?
… the same as for others.
…ase-header-to-eis' into read-and-propagate-product-use-case-header-to-eis
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
This PR reads the
X-elastic-product-use-caseheader containing the product use case EIS is called with (Assistants etc.). I had to use a workaround to propagate the information through the transport layer: I set the header explicitly in theThreadContextas we would lose it otherwise when theInferenceActionProxymakes an internal call toInferenceActionorUnifiedCompletionAction(thread context gets stashed and then reconstructed losing most headers; as this is specific to the inference API/EIS we shouldn't add it toTask.HEADERS_TO_COPY- had this discussion with some ES devs).I was hesitant to pass the
InferenceContextthrough the base methods (doInfer,doUnifiedCompletionInferetc.) as this would imply changes in all integrations making this PR even larger as it already is, especially considering what it does (passing one value). If we feel like that the product use case information is useful for all integrations (which it probably is) we can still follow-up on this initial change. For now I want to keep it isolated for EIS.