[8.x] Metrics for incremental bulk splits (#116765) #117275
Closed
…dCredentialsRestIT testFirstTimeSetupWithElasticsearchSettings elastic#116286
…lastic#116304) (cherry picked from commit 8a98844)
…estEveryActionIsEitherOperatorOnlyOrNonOperator elastic#102992
…atedSettingsReturnWarnings elastic#108628
…hAndRelocateConcurrentlyRandomReplicas elastic#116145
…gorize.Categorize SYNC} elastic#113054
…) (elastic#116285) * Add support for bitwise inner-product in painless (elastic#116082) This adds bitwise inner product to painless. The idea here is: - For two bit arrays, which we determine to be a byte array whose dimensions match `dense_vector.dim/8`, we simply return bitwise `&` - For a stored bit array (remember, with `dense_vector.dim/8` bytes), sum up the provided byte or float array using the bit array as a mask. This is effectively supporting asymmetric quantization. A prime example of how this works is: https://github.com/cohere-ai/BinaryVectorDB Basically, you do your initial search against the binary space and then rerank with a differently quantized vector allowing for more information without additional storage space. closes: elastic#111232 * removing unnecessary task adjustment --------- Co-authored-by: Elastic Machine <[email protected]>
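The two cases described in the commit above can be sketched in a few lines of Python. This is only an illustration, not the painless implementation: the function names are hypothetical, reading "bitwise `&`" as the count of bits set in both arrays is an assumption, and so is the most-significant-bit-first ordering of dimensions within each byte.

```python
from typing import Sequence


def bit_dot_bits(a: bytes, b: bytes) -> int:
    """Inner product of two bit vectors: number of bits set in both (assumed reading)."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same number of bytes")
    return sum(bin(x & y).count("1") for x, y in zip(a, b))


def bit_dot_masked(stored: bytes, query: Sequence[float]) -> float:
    """Sum the query values at the positions where the stored bit vector has a 1.

    Assumes dimension i lives in byte i // 8, most significant bit first.
    """
    if len(query) != 8 * len(stored):
        raise ValueError("query must have 8 values per stored byte")
    total = 0.0
    for i, value in enumerate(query):
        if (stored[i // 8] >> (7 - i % 8)) & 1:
            total += value
    return total
```

In the rerank flow the commit mentions, a function like `bit_dot_masked` would score a float query against the stored bits, so no second copy of the vector has to be stored.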
…lastic#116288) * Align dot prefix validation with Serverless (elastic#116266) This aligns the deprecation warnings for on-prem dot-prefixed indices to be the same as the Serverless validation. It adds exemptions for the `.entities…` indices, and makes the list a dynamic setting. (cherry picked from commit 72aa17a) * Fix compilation --------- Co-authored-by: Elastic Machine <[email protected]>
We are currently holding two fields to extract values; this commit makes them abstract methods so we don't use any heap.
…astic#116343) * Clarify that MSSQL supports only SQL Server auth * typo
(cherry picked from commit f88f68d)
…c#113713) (elastic#116347) * Adding inference endpoint validation for AzureAiStudioService * Run spotlessApple * Update docs/changelog/113713.yaml * Remove isInClusterService from InferenceService * Run spotless apply --------- Co-authored-by: Elastic Machine <[email protected]>
…st_exception (elastic#116274) (elastic#116356) * validate agg filter's type is boolean (cherry picked from commit 0e044d7)
…ence/40_semantic_text_query/Query a field that uses the default ELSER 2 endpoint} elastic#114376
…c#116367) This fixes a test, actually in serverless Elasticsearch, that gets duplicate warnings. We'd like not to emit these duplicate warnings, but at this point it isn't worth it. So, for now, in some tests we allow duplicate warnings. In most of our tests we do not allow duplicate warnings so that we don't make *more* duplicate warnings without thinking about it.
…testHasPrivilegesOtherThanIndex elastic#116376
elastic#115511) (elastic#116316) A long desired balance computation could delay a newly created index shard from being assigned since first the computation has to finish for the assignments to be published and the shards getting assigned. With this change we add a new setting which allows setting a maximum time for a computation in case there are unassigned primary shards. Note that this is similar to how a new cluster state causes early publishing of the desired balance. Closes ES-9616 Co-authored-by: Elastic Machine <[email protected]>
…lastic#116381) * Better sizing BytesRef for Strings in Queries (elastic#115655) * Better sizing BytesRefs for Strings in Queries * Update docs/changelog/115655.yaml * iter * added test * iter * extracted method * iter --------- Co-authored-by: Elastic Machine <[email protected]> (cherry picked from commit 9ebe95a) * iter
…ic#116248) Fixes elastic#114970 Added the warnings in the `RemoveStatsOverride` LogicalPlan rule, which is the same one that's removing the duplicates. Also, fixed the groupings parser, which was assigning, to each stats grouping field, the source of the full "grouping context" instead. Without this fix, the warnings on groupings would, in some cases, say something like `Line 2:10: Field 'x' shadowed by field at line 2:10`. As there are already tests for these cases, I'm requiring the capability on them, and updating their warnings expectations. ## Notes I'm treating this as an enhancement instead of a bug. As there's existing logic removing duplicates, I'll guess this was decided at some point (Decision that may apply more or less nowadays). And still, solving it this way is less dangerous and doesn't break compatibility. Co-authored-by: Elastic Machine <[email protected]>
…uckets (elastic#116329) (elastic#116393) Related to elastic#88128 This PR aims to reduce the potential OOMs received when building internal aggregations.
… (elastic#116412) (cherry picked from commit c42b1ef)
…116410) This commit shares a unique instance between all InternalTopMetrics instances.
This fixes sorts containing a `_source` field. It can use the standard encoder for `BytesRef`s. You can't sort *by* a `_source` field, but that doesn't really make sense anyway.
Adds a test that always fails on one of the data nodes and makes sure this comes back as a failure. When we build support for partial results we can use this test to simulate it.
* Esql Enable Date Nanos (elastic#117080) This enables date nanos support as tech preview. Basic operations, like reading values, binary comparisons, and functions that don't care about type should work, but some functions are not yet supported. Most notably, Bucket is not yet supported, although Date_Trunc is and can be used for grouping. See the docs for the full list of limitations. relates to elastic#109352 * Skip CATEGORIZE tests outside snapshot --------- Co-authored-by: Nik Everett <[email protected]>
* Was using byte position for end of offset, but it seems like using char position is correct * Update docs/changelog/116358.yaml * Update UnigramTokenizer.java --------- Co-authored-by: Elastic Machine <[email protected]>
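The distinction this commit draws between byte positions and char positions only shows up on non-ASCII text. The sketch below is an illustration of that difference and not the `UnigramTokenizer` code: the helper name is hypothetical, and Python counts code points whereas Java strings count UTF-16 code units, so the numbers match Java only for characters inside the Basic Multilingual Plane.

```python
def char_and_byte_end(text: str, token: str) -> tuple[int, int]:
    """Return the end offset of the first occurrence of token, in chars and in UTF-8 bytes."""
    start = text.index(token)
    char_end = start + len(token)
    # The byte offset counts every UTF-8 byte up to the end of the token.
    byte_end = len(text[:char_end].encode("utf-8"))
    return char_end, byte_end


text = "héllo wörld"
char_end, byte_end = char_and_byte_end(text, "wörld")
# Two 2-byte characters precede the end of the token, so the offsets differ by 2.
print(char_end, byte_end)
```

An end offset taken from the byte position overshoots the token whenever multi-byte characters come before it, which is the kind of mismatch the commit describes.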
…zeWithinAggregations elastic#116856
elastic#117203) This adds `maxSim` functions, specifically dotProduct and InvHamming. Why these two you might ask? Well, they are the best approximations of what's possible with Col* late interaction type models. Effectively, you want a similarity metric where "greater == better". Regular `hamming` isn't exactly that, but inverting it (just like our `element_type: bit` index for dense_vectors) is a nice approximation with bit vectors and multi-vector scoring. Then, of course, dotProduct is another usage. We will allow dot-product between like elements (bytes -> bytes, floats -> floats) and of course, allow `floats -> bit`, where the stored `bit` elements are applied as a "mask" over the float queries. This allows for some nice asymmetric interactions. This is all behind a feature flag, and I need to write a mountain of docs in a separate PR.
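A rough sketch of late-interaction scoring with these two similarities follows. It assumes the usual Col*-style definition of max-sim (for each query vector take the best-matching document vector, then sum) and defines inverse Hamming as total bits minus Hamming distance; both definitions and all function names are assumptions for illustration, not the functions added by this commit.

```python
from typing import Callable, Sequence, TypeVar

V = TypeVar("V")


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product between two like-typed vectors."""
    return sum(x * y for x, y in zip(a, b))


def inv_hamming(a: bytes, b: bytes) -> int:
    """Number of matching bits, so that greater means more similar."""
    distance = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 8 * len(a) - distance


def max_sim(similarity: Callable[[V, V], float],
            query_vectors: Sequence[V],
            doc_vectors: Sequence[V]) -> float:
    """Sum, over query vectors, of the best similarity against any document vector."""
    return sum(max(similarity(q, d) for d in doc_vectors) for q in query_vectors)
```

Passing the similarity as a parameter keeps the aggregation identical for float and bit vectors, which mirrors the "greater == better" requirement stated in the commit.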
elastic#116998) (elastic#117215) This change loads all the modules and creates the module layers for plugins prior to entitlement checking during the 2nd phase of bootstrap initialization. This will allow us to know what modules exist for both validation and checking prior to actually loading any plugin classes (in a follow up change). There are now two classes: PluginsLoader which does the module loading and layer creation PluginsService which uses a PluginsLoader to create the main plugin classes and start the plugins
No need to have an `ActionType<>` here since we never register this as an action the `Client` can invoke. Also no need to use a dummy constructor parameter just to trick the injector into instantiating it, we can instantiate it ourselves like we do with all other subsidiary transport-only actions. Also fixes the parent task so the remote action is a child of the local action rather than a sibling.
TODO: Verify what we miss in our automation
…ions (elastic#117019) (elastic#117247) Periodically checks the real memory circuit breaker when allocating objects.
) This fixes the off-by-one error of the column position in some of the error messages. (cherry picked from commit 21f206b)
…117189) (elastic#117254) * Fix deberta tokenizer bug caused by bug in normalizer which caused offsets to be negative * Update docs/changelog/117189.yaml
…117184) (elastic#117262) Add tests on use of grouping functions in agg filters: check that reusing the BUCKET expression from grouping is allowed, but no other variation. Related: elastic#115521 (cherry picked from commit fefa0f0)
Add metrics to track incremental bulk request splits due to indexing pressure. Resolves ES-9612
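As a purely illustrative sketch of what such a metric counts, the toy accumulator below flushes its pending bytes whenever they reach a threshold and increments a counter each time. The class, its fields, and the single-threshold rule are all hypothetical; the PR text does not describe the actual split conditions or metric names in Elasticsearch.

```python
class IncrementalBulkSketch:
    """Toy model: buffer request chunks and count threshold-triggered splits."""

    def __init__(self, split_threshold_bytes: int) -> None:
        self.split_threshold_bytes = split_threshold_bytes  # hypothetical setting
        self.pending_bytes = 0
        self.split_count = 0  # the metric this sketch tracks
        self.flushed_sizes: list[int] = []

    def add_chunk(self, size_bytes: int) -> None:
        """Buffer a chunk; split (flush) once the buffer reaches the threshold."""
        self.pending_bytes += size_bytes
        if self.pending_bytes >= self.split_threshold_bytes:
            self.flushed_sizes.append(self.pending_bytes)
            self.pending_bytes = 0
            self.split_count += 1


bulk = IncrementalBulkSketch(split_threshold_bytes=100)
for size in (40, 70, 30, 80, 10):
    bulk.add_chunk(size)
print(bulk.split_count, bulk.pending_bytes)
```

Exposing the counter separately from the flush logic makes it possible to observe how often memory pressure forces a split without changing request handling.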
This will backport the following commit from main to 8.x:
Metrics for incremental bulk splits #116765