Prevent duplicate source parsing in ShardBulkInferenceActionFilter. #124186
Conversation
Pinging @elastic/ml-core (Team:ML)
```diff
  final IndexRequest indexRequest = getIndexRequestOrNull(item.request());
- var newDocMap = indexRequest.sourceAsMap();
+ var newDocMap = response.source();
```
This reuses the parsed source from line 511 instead of parsing it again.
Parsing it again causes the full input string to be extracted and stored a second time, which creates another copy of a potentially large byte[]. The savings here are very similar to those of Remove matched text from chunks.
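To make that concrete, a minimal sketch of the two paths (the comments describe the behavior argued above; the internals are an assumption, not the actual implementation):

```java
// Re-parsing: decodes the raw source bytes into a brand-new Map, allocating
// fresh objects for every value, including the potentially large input string.
Map<String, Object> reparsed = indexRequest.sourceAsMap();

// Reusing: returns the Map that was already parsed earlier (line 511),
// so no second copy of the input string is created.
Map<String, Object> reused = response.source();
```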
I'm not sure we can really call it a savings. With this change we'll keep all the parsed source in memory for the entire batch, which is exactly what we wanted to avoid by re-parsing eagerly.
One possible optimisation here would be to parse and write the value without keeping the intermediate Map object.
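A minimal sketch of that idea, using Jackson's streaming API purely for illustration (the real code would go through Elasticsearch's XContent layer; `rewriteSource`, `field`, and `inferenceJson` are hypothetical names):

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Streams the original source to the output, injecting the inference result
// for one top-level field, without ever materializing an intermediate Map.
static byte[] rewriteSource(byte[] source, String field, String inferenceJson) throws IOException {
    JsonFactory factory = new JsonFactory();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (JsonParser parser = factory.createParser(source);
         JsonGenerator gen = factory.createGenerator(out)) {
        while (parser.nextToken() != null) {
            if (parser.currentToken() == JsonToken.FIELD_NAME
                    && field.equals(parser.currentName())
                    && parser.getParsingContext().getParent().inRoot()) {
                parser.nextToken();               // move to the old value
                parser.skipChildren();            // drop it without building objects
                gen.writeFieldName(field);
                gen.writeRawValue(inferenceJson); // write the replacement value
            } else {
                gen.copyCurrentEvent(parser);     // pass everything else through
            }
        }
    }
    return out.toByteArray();
}
```

This keeps neither the parsed Map nor a second copy of the input alive; the cost is one extra streaming pass per document.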
With this change there's one less copy of the (potentially long) input string around. In my experiments that's a clear win, because the source doesn't contain anything else.
It comes at the expense of keeping a parsed version of the rest of the source around. This costs some memory indeed, and becomes a memory loss if the rest of the source is larger than the input string. You might also argue that it's a computational win, because the parsed source is cached.
I'm not sure what you mean by:
"One possible optimisation here would be to parse and write the value without keeping the intermediate Map object."
In the end, a parsed version of the source is needed for indexing. If you re-parse the source, you get an additional copy of the input string, which is what I'm trying to prevent.
If you prefer, I could keep the source around only if size(input) > size(source) / 2 or so, because in that case it's (most likely) a memory win, and re-parse it otherwise. WDYT?
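A hedged sketch of that heuristic (`sourceForResponse` and its parameters are illustrative names, and folding the decision into one method is a simplification; in practice the choice would determine whether the parsed map is retained across the batch at all):

```java
import java.util.Map;
import org.elasticsearch.action.index.IndexRequest;

// Keep the parsed map only when the inference input dominates the source size;
// otherwise drop it and re-parse on demand. The threshold mirrors the proposal above.
static Map<String, Object> sourceForResponse(IndexRequest indexRequest,
                                             Map<String, Object> parsedSource,
                                             long inputBytes) {
    long sourceBytes = indexRequest.source().length(); // raw source size
    if (inputBytes > sourceBytes / 2) {
        return parsedSource;           // caching wins: a re-parse would duplicate the input
    }
    return indexRequest.sourceAsMap(); // re-parse: the cached map would outweigh the input
}
```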
It's a batch API; we have more than one source at a time. This change would keep all the parsed sources in memory for the entire duration of the batch inference calls. We're not trying to optimise for the case where the bulk API contains a single document. This is something we can improve, but this change alone doesn't save anything memory-wise. The parsed source is ephemeral and kept for a single document at a time.
The input string is not ephemeral: it's referenced in the FieldInferenceRequest and the FieldInferenceResponse, and therefore arrives here.
Maybe we can instead remove the input string from the FieldInferenceResponse? That shouldn't be necessary, right?
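For illustration only, the shape of that suggestion (all names below are assumptions; the actual FieldInferenceResponse is a private record inside ShardBulkInferenceActionFilter and carries more fields):

```java
interface InferenceResults {} // placeholder for the real results type

// Slimmed-down response: the raw `input` string is no longer carried, so after
// the request phase the only live reference to it is the one in the source map.
record FieldInferenceResponse(String field, int inputOrder, InferenceResults results) {}
```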
What does the input string have to do with this change? Sorry, I'm not following.
Removing the input from the response is something we forgot to follow up on after moving to the new format:
#118893 (comment)
cc @Mikep86
Sorry, I lost track of that one @jimczi; I'll create a task to address this soon. But isn't that change orthogonal to this PR? We still need the parsed source both when generating the inference request (to get the values to perform inference on) and when handling the inference response (to update the source).
Yes, that change seems orthogonal indeed.
Right now, profiling shows that when you get to the line

```java
var newDocMap = indexRequest.sourceAsMap();
```

there's already a parsed version of the (potentially long) input string in memory, and this line generates an additional copy. This PR prevents that, but I agree it also introduces some other memory usage.
I'm investigating right now whether I can prevent it without that side effect. To be continued...
LGTM, thanks for the optimization!
I left one comment; this change would increase memory consumption, not reduce it.