Update sparse_vector field mapping to include default setting for token pruning #129089
Hi @markjhoy, I've created a changelog YAML for you.
Note: due to the scope of the changes (especially to the transport and index versions), this will require a manual backport to 8.19.
Pinging @elastic/search-eng (Team:SearchOrg)
Pinging @elastic/search-relevance (Team:Search - Relevance)
Pinging @elastic/es-search-foundations (Team:Search Foundations)
buildkite test this
💔 Backport failed
You can use sqren/backport to manually backport by running
…en pruning (elastic#129089)

* Initial checkin of refactored index_options code
* [CI] Auto commit changes from spotless
* initial unit testing
* complete unit tests; add yaml tests
* [CI] Auto commit changes from spotless
* register test feature for sparse vector
* Update docs/changelog/129089.yaml
* update changelog
* add docs
* explicit set default index_options if null
* [CI] Auto commit changes from spotless
* update yaml tests; update docs
* fix yaml tests
* readd auth for teardown
* only serialize index options if not default
* [CI] Auto commit changes from spotless
* serialization refactor; pass index version around
* [CI] Auto commit changes from spotless
* fix transport versions merge
* fix up docs
* [CI] Auto commit changes from spotless
* fix docs; add include_defaults unit and yaml test
* [CI] Auto commit changes from spotless
* override getIndexReaderManager for SemanticQueryBuilderTests
* [CI] Auto commit changes from spotless
* cleanup mapper/builder/tests; index vers. in type still need to refactor / clean YAML tests
* [CI] Auto commit changes from spotless
* cleanups to mapper tests for clarity
* [CI] Auto commit changes from spotless
* move feature into mappers; fix yaml tests
* cleanups; add comments; remove redundant test
* [CI] Auto commit changes from spotless
* escape more periods in the YAML tests
* cleanup mapper and type tests
* [CI] Auto commit changes from spotless
* rename mapping for previous index test
* set explicit number of shards for yaml test

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: Kathleen DeRusso <[email protected]>

(cherry picked from commit a671505)

# Conflicts:
#   docs/reference/elasticsearch/mapping-reference/sparse-vector.md
#   server/src/main/java/org/elasticsearch/TransportVersions.java
#   server/src/main/java/org/elasticsearch/index/IndexVersions.java
#   server/src/main/java/org/elasticsearch/index/mapper/MapperFeatures.java
#   server/src/test/java/org/elasticsearch/index/mapper/vectors/SparseVectorFieldMapperTests.java
#   x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/queries/SemanticQueryBuilderTests.java
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation.
…for token pruning (#129089) (#129890)

* Update sparse_vector field mapping to include default setting for token pruning (#129089)
* Initial checkin of refactored index_options code
* [CI] Auto commit changes from spotless
* initial unit testing
* complete unit tests; add yaml tests
* [CI] Auto commit changes from spotless
* register test feature for sparse vector
* Update docs/changelog/129089.yaml
* update changelog
* add docs
* explicit set default index_options if null
* [CI] Auto commit changes from spotless
* update yaml tests; update docs
* fix yaml tests
* readd auth for teardown
* only serialize index options if not default
* [CI] Auto commit changes from spotless
* serialization refactor; pass index version around
* [CI] Auto commit changes from spotless
* fix transport versions merge
* fix up docs
* [CI] Auto commit changes from spotless
* fix docs; add include_defaults unit and yaml test
* [CI] Auto commit changes from spotless
* override getIndexReaderManager for SemanticQueryBuilderTests
* [CI] Auto commit changes from spotless
* cleanup mapper/builder/tests; index vers. in type still need to refactor / clean YAML tests
* [CI] Auto commit changes from spotless
* cleanups to mapper tests for clarity
* [CI] Auto commit changes from spotless
* move feature into mappers; fix yaml tests
* cleanups; add comments; remove redundant test
* [CI] Auto commit changes from spotless
* escape more periods in the YAML tests
* cleanup mapper and type tests
* [CI] Auto commit changes from spotless
* rename mapping for previous index test
* set explicit number of shards for yaml test

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: Kathleen DeRusso <[email protected]>

(cherry picked from commit a671505)

# Conflicts:
#   docs/reference/elasticsearch/mapping-reference/sparse-vector.md
#   server/src/main/java/org/elasticsearch/TransportVersions.java
#   server/src/main/java/org/elasticsearch/index/IndexVersions.java
#   server/src/main/java/org/elasticsearch/index/mapper/MapperFeatures.java
#   server/src/test/java/org/elasticsearch/index/mapper/vectors/SparseVectorFieldMapperTests.java
#   x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/queries/SemanticQueryBuilderTests.java

* Update changelog for version
* [CI] Auto commit changes from spotless
* Update docs to replace 9.1 with 8.19
* Rename 129089.yaml to 129890.yaml
* proper asciidocs; cleanups
* remove doc preview labels; cleanup test index ver.
* clean up docs
* add sparse vector token pruning tag

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Updates the SparseVectorFieldMapper type to include index options for pruning tokens and associated configuration values.

Before this update, token pruning for sparse vector types was only available via the query (see the parameters for the sparse vector query).
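For context, query-time pruning is requested inside the sparse_vector query clause itself. The following is a minimal Python sketch that only builds such a request body; it does not call Elasticsearch. The helper name sparse_vector_query, the field name ml.tokens, the inference endpoint id, and the threshold values are illustrative assumptions and are not taken from this PR.

```python
import json


def sparse_vector_query(field, inference_id, text, prune=True, pruning_config=None):
    """Build a search body for a sparse_vector query with query-time pruning.

    Hypothetical helper for illustration only: it assembles a plain dict and
    never contacts a cluster.
    """
    clause = {
        "field": field,
        "inference_id": inference_id,
        "query": text,
        "prune": prune,
    }
    # A pruning_config is only attached when pruning is switched on.
    if prune and pruning_config:
        clause["pruning_config"] = dict(pruning_config)
    return {"query": {"sparse_vector": clause}}


# Illustrative values; the field, endpoint id and thresholds are assumptions.
body = sparse_vector_query(
    field="ml.tokens",
    inference_id="my-elser-endpoint",
    text="How is the weather in Jamaica?",
    pruning_config={
        "tokens_freq_ratio_threshold": 5,
        "tokens_weight_threshold": 0.4,
    },
)
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the index's _search endpoint. Because the pruning options live in the query, every caller has to repeat them on each request.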
With this PR, any new index with a sparse_vector field will have token pruning turned on by default. Existing indices whose sparse_vector fields were created before this update keep pruning turned off by default. Any sparse_vector query that sets explicit pruning options still overrides the index defaults.

Example: