Update EIS sparse and dense embedding max batch size to 16 #132646
Could you add a comment explaining why we set it to 16? I can see this causing confusion in the future about why we went down from 512 to 16. Otherwise LGTM 🚢
Pinging @elastic/ml-core (Team:ML)
Hi @jaybcee, I've created a changelog YAML for you.
💔 Backport failed
You can use sqren/backport to manually backport by running
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
…132855) (cherry picked from commit 81b4cce) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java Co-authored-by: Jonathan Buttner <[email protected]>
…32646) (elastic#132855) (cherry picked from commit 81b4cce) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java Co-authored-by: Jonathan Buttner <[email protected]>
In EIS, we've determined that the best batch size at the moment is 16, not 512. This PR updates the maximum batch size for sparse and dense embeddings to reflect that. We previously thought it would not necessarily need to be set.
Ref: https://github.com/elastic/search-team/issues/10719
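For readers unfamiliar with what a maximum batch size implies, here is a minimal sketch of splitting a list of inputs into batches of at most 16 before sending them to an embedding service. It is not the code changed in this PR: the class `BatchSplitter`, the constant `MAX_BATCH_SIZE`, and the method `split` are hypothetical names chosen for illustration, and the real batching logic in `ElasticInferenceService.java` may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the Elasticsearch implementation.
class BatchSplitter {
    // Assumed value taken from the PR description (previously 512).
    static final int MAX_BATCH_SIZE = 16;

    // Splits inputs into consecutive batches, each holding at most maxBatchSize items.
    static <T> List<List<T>> split(List<T> inputs, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < inputs.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, inputs.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(inputs.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> inputs = new ArrayList<>();
        for (int i = 0; i < 40; i++) {
            inputs.add(i);
        }
        List<List<Integer>> batches = split(inputs, MAX_BATCH_SIZE);
        // 40 inputs with a limit of 16 yield batches of 16, 16, and 8.
        System.out.println(batches.size());
    }
}
```

With a limit of 16, a request carrying 40 inputs would be sent as three smaller requests instead of one, which is the practical effect of lowering the limit from 512.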