Add object param for keeping synthetic source #113690
Hi @kkrik-es, I've created a changelog YAML for you.
…am' into synthetic-source/keep-object-param
# Conflicts:
#   server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java
#   server/src/test/java/org/elasticsearch/index/mapper/DocumentParserTests.java
#   server/src/test/java/org/elasticsearch/index/mapper/ObjectMapperTests.java
Pinging @elastic/es-storage-engine (Team:StorageEngine)
LGTM - I left a comment about what we say we guarantee with synthetic source keep arrays.
====== Keeping the original source

It is possible to record the original source of an object or field, at extra storage cost, using param
`synthetic_source_keep`. The default value is `none`; setting it to `arrays` leads to storing the original source for
Maybe we shouldn't guarantee that we store the original source when `synthetic_source_keep` is set to `arrays`? Maybe we should just guarantee that the array ordering is preserved? This gives us better ways to reduce the overhead later on without breaking bwc.
If we need to maintain bwc with things like field: [1, 2, [3], [[4, [5]]], ["5"], 6], then that makes things more difficult. Ideally we would synthesize this as field: [1, 2, 3, 5, 5, 6] (assuming this is a number field).
But this is how it's implemented right now. Are you suggesting that we don't document the current behavior? I think it'll be a breaking change if we change it to return [1, 2, 3, 5, 5, 6] in the example above, whether we document it or not.
I guess we can be explicit that we don't guarantee the exact form.. is this what you have in mind?
> I guess we can be explicit that we don't guarantee the exact form.. is this what you have in mind?
Apologies, yes, this is what I mean.
I've updated the wording to reflect that this may change, ptal.
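To make the "exact form" question in this thread concrete, here is a minimal Python sketch (not Elasticsearch code) of flattening the nested example into a single-level array. The helper name `flatten_numbers` and the string-to-integer coercion are assumptions made for illustration only; the actual synthesis logic lives in the mapper code touched by this PR.

```python
def flatten_numbers(value):
    """Flatten arbitrarily nested lists into one flat list of ints, keeping order."""
    if isinstance(value, list):
        flat = []
        for item in value:
            # Recurse so that any nesting depth collapses into a single level.
            flat.extend(flatten_numbers(item))
        return flat
    # Leaf value: coerce strings such as "5" to a number (assumed numeric-field behavior).
    return [int(value)]


original = [1, 2, [3], [[4, [5]]], ["5"], 6]
print(flatten_numbers(original))
```

A flattened result like this keeps element order and duplicates but not the original nesting or the string form of "5", which is the kind of difference the reworded documentation leaves room for. Note that a plain flatten also yields the nested 4, which the list quoted in the comments leaves out.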
subobjects auto:
  - requires:
-     cluster_features: ["mapper.subobjects_auto"]
+     cluster_features: ["mapper.subobjects_auto", "mapper.bwc_workaround_9_0"]
Would adding `gte_v8.16.0` as a cluster feature avoid adding the new `mapper.bwc_workaround_9_0` cluster feature?
If added in skip, it will disable the test in main. If added here, it won't fix the test failures.
This is controlled through param `synthetic_source_keep` with the following options:

- `none`: synthetic source diverges from the original source as described above (default).
- `arrays`: arrays of the corresponding field or object preserve the original element ordering and duplicate elements.
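As a rough illustration of the options listed above, the following Python sketch builds a mappings body that sets `synthetic_source_keep` on one object and one leaf field. The field names (`path`, `ids`, `tags`, `notes`) are hypothetical, and the `_source.mode` placement for enabling synthetic source is an assumption that may differ between Elasticsearch versions; only the `synthetic_source_keep` param and its `none` and `arrays` values come from the text above.

```python
import json


def build_mappings():
    """Return a mappings body as a plain dict (hypothetical field names)."""
    return {
        # Assumption: synthetic source is enabled through the _source mode.
        "_source": {"mode": "synthetic"},
        "properties": {
            # Object-level param: arrays under `path` keep order and duplicates.
            "path": {
                "type": "object",
                "synthetic_source_keep": "arrays",
                "properties": {"ids": {"type": "integer"}},
            },
            # Leaf-level param on a single field.
            "tags": {"type": "keyword", "synthetic_source_keep": "arrays"},
            # No param set: falls back to the default, `none`.
            "notes": {"type": "keyword"},
        },
    }


print(json.dumps(build_mappings(), indent=2))
```

Setting the param per object or per field, rather than everywhere, limits the extra storage cost to the places where array order and duplicates actually matter.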
👍
💔 Backport failed. The backport operation could not be completed due to the following error:
You can use sqren/backport to manually backport by running
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
* Add object param for keeping synthetic source
* Update docs/changelog/113690.yaml
* fix merging
* add tests
* merge
* fix randomized tests
* add documentation
* dedup id in docs
* update documentation
* update documentation
* fix bwc
* fix bwc
* fix unintended
* Revert "fix bwc" This reverts commit 18dc913.
* Revert "fix bwc" This reverts commit f4ddb0e.
* add missing test
* fix transform
* fix transform
* fix transform
* fix transform
* fix transform

(cherry picked from commit dd20248)

# Conflicts:
#   rest-api-spec/build.gradle
* Add object param for keeping synthetic source (#113690)
* Add object param for keeping synthetic source
* Update docs/changelog/113690.yaml
* fix merging
* add tests
* merge
* fix randomized tests
* add documentation
* dedup id in docs
* update documentation
* update documentation
* fix bwc
* fix bwc
* fix unintended
* Revert "fix bwc" This reverts commit 18dc913.
* Revert "fix bwc" This reverts commit f4ddb0e.
* add missing test
* fix transform
* fix transform
* fix transform
* fix transform
* fix transform

(cherry picked from commit dd20248)

# Conflicts:
#   rest-api-spec/build.gradle

* Update build.gradle
* Update MapperFeatures.java
* Update 20_synthetic_source.yml
* Update 21_synthetic_source_stored.yml
* Update 21_synthetic_source_stored.yml
* Update 21_synthetic_source_stored.yml
* Update 21_synthetic_source_stored.yml
* Add object param for keeping synthetic source
* Update docs/changelog/113690.yaml
* fix merging
* add tests
* merge
* fix randomized tests
* add documentation
* dedup id in docs
* update documentation
* update documentation
* fix bwc
* fix bwc
* fix unintended
* Revert "fix bwc" This reverts commit 18dc913.
* Revert "fix bwc" This reverts commit f4ddb0e.
* add missing test
* fix transform
* fix transform
* fix transform
* fix transform
* fix transform
Add per-object param for controlling source recording, in line with #112706 for leaf fields.
Document how to keep synthetic source per field, object or index.
Related to #112012