forked from elastic/elasticsearch
Seanstory/increase mapping field meta char limit #3
Closed. seanstory wants to merge 10,000 commits into main from seanstory/increase-mapping-field-meta-char-limit
We specify the master node timeout from the REST request to avoid waiting for the task indefinitely. Resolves elastic#120389
These test failures looked like infra/CI blips to me. Closes elastic#124518
…lastic#128917) Part of elastic#124715 and similar to elastic#128476. Different from elastic#128476 in that it takes a "LogicalPlan" approach to running a sub-query, integrating its result back in the "main" LogicalPlan and continuing running the query.
* Update [email protected]
* Update resources.yaml
* fix: explicitly map system.process.cpu.start_time to date
* Update [email protected]
* Update [email protected]
* Update [email protected]
…ic#129684) In a follow-up (elastic#128993), the remaining lenient usage of booleans will be deprecated, eventually leaving only the few places that require lenient parsing by means of Booleans.parseBooleanLenient, which is a wrapper around Boolean.parseBoolean. --------- Co-authored-by: Moritz Mack <[email protected]>
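To illustrate the difference between lenient and strict boolean parsing mentioned above, here is a minimal sketch. `BooleanParsingSketch`, `parseLenient`, and `parseStrict` are hypothetical names chosen for illustration and are not the Elasticsearch `Booleans` class; only `Boolean.parseBoolean` is a real JDK call.

```java
// Hypothetical helper for illustration only; not the Elasticsearch Booleans class.
final class BooleanParsingSketch {
    private BooleanParsingSketch() {}

    /** Lenient: delegates to the JDK, so anything other than "true" (ignoring case) is false. */
    static boolean parseLenient(String value) {
        return Boolean.parseBoolean(value);
    }

    /** Strict (assumed behavior): accepts only the exact strings "true" and "false". */
    static boolean parseStrict(String value) {
        if ("true".equals(value)) {
            return true;
        }
        if ("false".equals(value)) {
            return false;
        }
        throw new IllegalArgumentException(
            "Failed to parse value [" + value + "] as boolean; only [true] or [false] are allowed");
    }
}
```

With lenient parsing, a typo such as `"yes"` or `"ture"` silently becomes `false`, which is presumably why lenient usage is being narrowed down to the few call sites that need it.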
This action solely needs the cluster state, it can run on any node. Since this action is invoked across clusters, we need to be able to (de)serialize requests and responses. We introduce a new `RemoteClusterStateRequest` that wraps the existing `ClusterStateRequest` and implements (de)serialization.
…ic#130716) Remove threat detection example
Add verification for LocalLogical plan. The verification is skipped if there is a remote enrich, similar to how it is skipped for LocalPhysical plan optimization. The skip only happens for LocalLogical and LocalPhysical plan optimizers.
* Add filtering for kNN vector indexer test scenarios
* [CI] Auto commit changes from spotless
---------
Co-authored-by: elasticsearchmachine <[email protected]>
Cleanup tracing header name constants
This commit fixes the Int7uScorerBenchmarkTests for running on Java 21, since scoring with heap segments is only supported on Java 22 and greater.
…st {p0=mtermvectors/10_basic/Tests catching other exceptions per item} elastic#122414
Unmute yaml test fixed by elastic#130732 Closes elastic#130626, elastic#130661
Fixes a bug during field loading where we could double-close blocks if we failed to allocate memory during the un-shuffling portion of field loading from single segments. Unit test incoming in the followup. Closes elastic#130426 Closes elastic#130790 Closes elastic#130791 Closes elastic#130792 Closes elastic#130793 Closes elastic#130270 Closes elastic#130788 Closes elastic#130122 Closes elastic#130827
* Adding embedding type
* Adding more tests and cleaning up
For most of the usages of these methods, it made more sense to return a `ProjectMetadata` instead of a `ClusterState`. We also don't need to specify a specific project ID; generating a random one inside the helper method saves some boilerplate code.
We should not build the sorted structure for the ordinal grouping operator if the requested position is larger than maxGroupId. This situation occurs with nulls. We should benchmark the ordinal blocks and consider removing the ordinal grouping operator if performance is similar; otherwise, we need to integrate this operator with GroupingAggregatorFunctionTestCase. Relates elastic#130576
… instead of interacting with doc values api directly. (elastic#130854) This pulls elastic#130845 into the serverless fix branch for patch deployment. Original description: Change match_only_text's value fetcher to use SortedBinaryDocValues instead of interacting with doc values api directly. This way, via field data abstraction, the right doc values type is used, and the right conversions happen. Values of all field types will get converted to strings. Co-authored-by: Martijn van Groningen <[email protected]>
…ializationPreMultiProject elastic#130872
This change modifies reindex behavior to always include vector fields, even if the target index omits embeddings from _source. This prepares for scenarios where embeddings may be automatically excluded (elastic#130382).
* Put shards failure under a cap flag
…DisruptionIT testDataStreamLifecycleDownsampleRollingRestart elastic#131394
With the ordinal grouping operator removed in elastic#131133, this PR removes the corresponding code path in the grouping aggregator function, as it is no longer needed. Relates elastic#131133
The new attribute generated by MV_EXPAND should remain in the original position. The projection added by ProjectAwayColumns does not respect the original order of attributes. Make ProjectAwayColumns respect the order of attributes to fix this.
* ES|QL categorize options
* refactor options
* fix serialization
* polish
* add verifications
* better test coverage + polish code
* better test coverage + polish code
This PR migrates legacy rest tests in the x-pack autoscaling module
It's already part of the path parts, so duplicating it in query parameters is not useful.
* Add Azure AI Rerank support
* address comments
* address comments
* refactor azure ai studio service
* update rerank task settings test
* add provider for rerank
Adds the `includeDiskInfo` parameter to the `cluster/allocation/explain` `toString()` method, and adds tests.
Also add test to ensure the file has at least one entry for each region so that it is easy to spot missing regions in future upgrades. Relates: elastic#131050 Resolves: elastic#131392
* Refactoring google gemini streaming error handling
* Updating comments
* To prevent an implicit grant-all if storing node homes inside the Java temp dir, the temporary folder of ESTestCase is configured separately from the Java temp dir in internalClusterTests (by means of the system property tempDir, see TestRuleTemporaryFilesCleanup)
* Move ReloadingDatabasesWhilePerformingGeoLookupsIT from internalClusterTest to test, file permissions in internalClusterTest are stricter on the lucene tempDir
Correct response which had swapped "skipped" and "failed" shard counts.
…the centroids file (elastic#131421)
* fix boosting for knn
* Fixing for match query
* fixing for match subquery
* fix for sparse vector query boost
* fix linting issues
* Update docs/changelog/129282.yaml
* update changelog
* Copy constructor with match query
* util function to create sparseVectorBuilder for sparse query
* util function for knn query to support boost
* adding unit tests for all intercepted query terms
* Adding yaml test for match, sparse, and knn
* Adding queryname support for nested query
* fix code styles
* Fix failed yaml tests
* Update docs/changelog/129282.yaml
* update yaml tests to expand test scenarios
* Updating knn to copy constructor
* adding yaml tests for multiple indices
* refactoring match query to adjust boost and queryname and move to copy constructor
* refactoring sparse query to adjust boost and queryname and move to copy constructor
* [CI] Auto commit changes from spotless
* Refactor sparse vector to adjust boost and queryname in the top level
* Refactor knn vector to adjust boost and queryname in the top level
* fix knn combined query
* fix unit tests
* fix lint issues
* remove unused code
* Update inference feature name
* Remove double boosting issue from match
* Fix double boosting in match test yaml file
* move to bool level for match semantic boost
* fix double boosting for sparse vector
* fix double boosting for sparse vector in yaml test
* fix knn combined query
* fix knn combined query
* fix sparse combined query
* fix knn yaml test for combined query
* refactoring unit tests
* linting
* fix match query unit test
* adding copy constructor for match query
* refactor copy match builder to interceptor
* [CI] Auto commit changes from spotless
* fix unit tests
* update yaml tests
* fix match yaml test
* fix yaml tests with 4 digits error margin
* unit tests are now more randomized
---------
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: elasticsearchmachine <[email protected]>
When the Trained Model has been deployed through the Inference Endpoint API, it can only be updated using the Inference Endpoint API. When the Trained Model has been deployed and then attached to an Inference Endpoint, it can only be updated using the Trained Model API. Fix elastic#129999 Co-authored-by: elasticsearchmachine <[email protected]> Co-authored-by: David Kyle <[email protected]>
In elastic#131314 we fixed match_only_text fields with ignore_above keyword multi-fields in the case that the keyword multi-field is stored. However, the issue is still present if the keyword field is not stored, but instead has doc values. This patch fixes that case.
Although blocks/vectors are immutable and safe to share between threads, their references are currently not thread-safe, which can lead to data races. Previously, blocks/vectors were exclusively owned by a single thread, but this is no longer always the case with InlineJoin. We should consider switching to AbstractRefCounted, which is thread-safe, and benchmark it with many-fields use cases to ensure there is no performance regression. As a temporary solution, this change clones the values block in InlineJoin until thread-safe blocks/vectors are available.
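As a rough illustration of what a thread-safe reference count involves, here is a minimal sketch built on `java.util.concurrent.atomic.AtomicInteger`. `AtomicRefCountSketch` and its methods are hypothetical and are not the `AbstractRefCounted` implementation or the block/vector code; the sketch only shows the compare-and-set pattern that makes increments and decrements safe when several threads share one reference.

```java
// Hypothetical sketch of atomic reference counting; not Elasticsearch's AbstractRefCounted.
final class AtomicRefCountSketch {
    // Starts at 1: the creator holds the first reference.
    private final java.util.concurrent.atomic.AtomicInteger refs =
        new java.util.concurrent.atomic.AtomicInteger(1);

    /** Takes a reference unless the object was already fully released. */
    boolean tryIncRef() {
        while (true) {
            int current = refs.get();
            if (current <= 0) {
                return false; // already released, must not be resurrected
            }
            if (refs.compareAndSet(current, current + 1)) {
                return true;
            }
            // Another thread changed the count in between; retry.
        }
    }

    /** Drops a reference; returns true when this call released the last one. */
    boolean decRef() {
        int remaining = refs.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
        return remaining == 0;
    }

    int refCount() {
        return refs.get();
    }
}
```

A plain `int` counter updated with `++` and `--` can lose updates under concurrent access, which is the kind of data race the commit message describes. Cloning the block, as the temporary fix does, avoids sharing the counter at all, at the cost of an extra copy.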
…121914)" (elastic#131452) This reverts commit a6f0f6f.
…129108) This commit adds support for implicit casting of aggregate_metric_double when present with other numerics for a limited set of aggregation functions:
- Max / MaxOverTime
- Min / MinOverTime
- Sum / SumOverTime
- Count / CountOverTime
- Avg / AvgOverTime

Attempting to use fields mapped to aggregate_metric_double in one index but some other numeric in another index in any other context will still require explicit casting with ToAggregateMetricDouble.
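One way to picture why these five aggregations can mix the two field types is sketched below. It assumes, as an illustration rather than a statement about the actual ES|QL implementation, that a plain numeric value is treated as a single-sample aggregate (min = max = sum = value, count = 1). `AggregateMetricSketch` and its members are hypothetical names.

```java
// Hypothetical model of an aggregate_metric_double value; not the ES|QL implementation.
final class AggregateMetricSketch {
    final double min;
    final double max;
    final double sum;
    final long count;

    AggregateMetricSketch(double min, double max, double sum, long count) {
        this.min = min;
        this.max = max;
        this.sum = sum;
        this.count = count;
    }

    /** Assumption: a plain numeric is widened to an aggregate describing one sample. */
    static AggregateMetricSketch ofNumber(double value) {
        return new AggregateMetricSketch(value, value, value, 1);
    }

    /** Combines two aggregates; min, max, sum, and count all merge associatively. */
    AggregateMetricSketch merge(AggregateMetricSketch other) {
        return new AggregateMetricSketch(
            Math.min(min, other.min),
            Math.max(max, other.max),
            sum + other.sum,
            count + other.count);
    }

    /** Average derived from the merged sum and count. */
    double avg() {
        return sum / count;
    }
}
```

For example, merging a pre-aggregated value `(min=1, max=9, sum=20, count=4)` from one index with the plain number `10` from another yields `(min=1, max=10, sum=30, count=5)` and an average of `6.0`. Functions that need the individual samples cannot be answered this way, which is consistent with other contexts still requiring an explicit cast.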
I accidentally broke recall on flush by allowing vectors to be double quantized. Additionally, we shouldn't use the first vector as a centroid, as this can harm recall significantly when there is just one centroid.

Recall before this change:

```
index_name                             index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
-------------------------------------  ----------  --------  --------------  --------------------  ------------
corpus-dbpedia-entity-E5-small-0.fvec         ivf   1000000           25820                     0            14
corpus-dbpedia-entity-E5-small-0.fvec         ivf   1000000               0                 41693             0

index_name                             index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
-------------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------  ------------------
corpus-dbpedia-entity-E5-small-0.fvec         ivf       50        13.05              0.00           0.00   76.61    0.63  285267.44                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      150        31.92              0.00           0.00   31.33    0.68  629033.22                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      200        34.79              0.00           0.00   28.74    0.69  679699.13                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      500        39.40              0.00           0.00   25.38    0.71  794375.05                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf     1000        45.99              0.00           0.00   21.74    0.72  940493.52                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf       50         1.52              0.00           0.00  655.74    0.74   24201.82                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      150         2.94              0.00           0.00  340.43    0.85   67943.31                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      200         3.81              0.00           0.00  262.81    0.87   89575.99                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      500         7.67              0.00           0.00  130.38    0.93  213586.44                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf     1000        14.85              0.00           0.00   67.33    0.96  402628.11                1.00
```

With this fix:

```
index_name                             index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
-------------------------------------  ----------  --------  --------------  --------------------  ------------
corpus-dbpedia-entity-E5-small-0.fvec         ivf   1000000           25304                     0            15
corpus-dbpedia-entity-E5-small-0.fvec         ivf   1000000               0                 42110             0

index_name                             index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
-------------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------  ------------------
corpus-dbpedia-entity-E5-small-0.fvec         ivf       50        12.63              0.00           0.00   79.18    0.89  285527.22                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      150        32.49              0.00           0.00   30.77    0.94  619783.37                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      200        35.46              0.00           0.00   28.20    0.95  667903.47                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      500        40.38              0.00           0.00   24.76    0.97  781959.74                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf     1000        48.62              0.00           0.00   20.57    0.98  931017.40                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf       50         1.55              0.00           0.00  643.09    0.74   23595.57                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      150         2.98              0.00           0.00  335.29    0.85   66299.43                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      200         3.81              0.00           0.00  262.64    0.87   87416.15                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf      500         8.80              0.00           0.00  113.64    0.93  209061.37                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf     1000        16.18              0.00           0.00   61.81    0.96  394906.29                1.00
```
This concept is complicated. Closes elastic#128991 Co-authored-by: Larisa Motova <[email protected]> Co-authored-by: Liam Thompson <[email protected]>