-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[ML] Refactor Inference API chat completion error handling #132475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Refactor Inference API chat completion error handling #132475
Conversation
…tor-streaming-openai
…tor-streaming-openai
|
|
||
| public class UnifiedChatCompletionErrorResponse extends ErrorResponse { | ||
|
|
||
| private static final ConstructingObjectParser<Optional<UnifiedChatCompletionErrorResponse>, Void> CONSTRUCTING_OBJECT_PARSER = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These new lines were taken from StreamingErrorResponse. I consolidated the classes.
| * TODO: {@link ErrorMessageResponseEntity} is nearly identical to this, but doesn't parse as many fields. We must remove the duplication. | ||
| */ | ||
| public class StreamingErrorResponse extends ErrorResponse { | ||
| private static final ConstructingObjectParser<Optional<ErrorResponse>, Void> ERROR_PARSER = new ConstructingObjectParser<>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the parsing functionality to UnifiedChatCompletionErrorResponse.
| * @param parseFunction The function to parse the response. | ||
| */ | ||
| public MistralUnifiedChatCompletionResponseHandler(String requestType, ResponseParser parseFunction) { | ||
| super(requestType, parseFunction, ErrorResponse::fromResponse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MistralErrorResponseHelper::fromResponse basically does the same thing as ErrorResponse::fromResponse aka convert the response into a string.
| "error": { | ||
| "code": "stream_error", | ||
| "message": "Received an error response for request from inference entity id [id].\ | ||
| Error message: [400: Invalid value: Model 'llama3.12:3b' not found]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason this changed is because I decided to just convert the whole error into a string for all cases for llama.
| } | ||
| } | ||
|
|
||
| private static String code(@Nullable String code, RestStatus status) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an oversight from the previous PR. The OpenAI implementation either uses the provided code or the rest status code.
|
Pinging @elastic/ml-core (Team:ML) |
| public class MistralErrorResponseHelper { | ||
| private static final String MISTRAL_ERROR = "mistral_error"; | ||
|
|
||
| public static final UnifiedChatCompletionErrorParserContract ERROR_PARSER = UnifiedChatCompletionErrorResponseUtils |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might just be easier to inline this into MistralUnifiedChatCompletionResponseHandler and have the constructor call ERROR_PARSER::parse directly, we can take the above comment and paste it into the unit tests so we don't lose that context
davidkyle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
|
||
| public class UnifiedChatCompletionErrorResponse extends ErrorResponse { | ||
|
|
||
| private static final ConstructingObjectParser<Optional<UnifiedChatCompletionErrorResponse>, Void> CONSTRUCTING_OBJECT_PARSER = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit rename CONSTRUCTING_OBJECT_PARSER -> ERROR_OBJECT_PARSER
| * Standard error response parser. This can be overridden for those subclasses that | ||
| * have a different error response structure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit static methods cannot be overridden
| * Standard error response parser. This can be overridden for those subclasses that | |
| * have a different error response structure. | |
| * Standard error response parser. |
| import static org.mockito.Mockito.mock; | ||
|
|
||
| public class LlamaErrorResponseTests extends ESTestCase { | ||
| public class MistralErrorResponseHelperTests extends ESTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to duplicate this file or rename it? Is LlamaErrorResponseTests.java now missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think git flagged this wrong or something. LlamaErrorResponse no longer exists (I inlined the contents in LlamaChatCompletionResponseHandler).
I'm going to do what Pat suggested actually and MistralErrorResponseHelperTests will be removed shortly too.
This PR is a follow on for #131316
It takes the same approach and propagates the changes to the rest of the chat completion handlers.
This tries to remove a lot of the duplication.