Make flattened synthetic source concatenate object keys on scalar/object mismatch #129600
.../main/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldSyntheticWriterHelper.java
next = nextValue == null ? KeyValue.EMPTY : new KeyValue(nextValue);

var startPrefix = curr.prefix.diff(openObjects);
if (startPrefix.prefix.isEmpty() == false && startPrefix.prefix.getFirst().equals(lastScalarSingleLeaf)) {
What if the conflict doesn't happen on the first part of the prefix, e.g.
field {
  path {
    to: 10
    to {
      foo: bar
    }
  }
}
Would this be caught here?
Yep, so this will become:
field {
  path {
    to: 10
    to.foo: bar
  }
}
When it gets to the first key/value, `field.path.to|10`, it will take the else block and traverse down into the object, adding `field` and `path` to the `openObjects` context. When it reaches the key/value `field.path.to.foo|bar`, that object will still be open. Seeing that `lastScalarSingleLeaf` has a value of `to`, and that `to` is the first token in the `startPrefix` (`to.foo`), it will make a concatenated path.
(Updated a test to cover this situation and verify it.)
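The behavior described in this reply can be sketched in isolation. This is a simplified illustration, not the actual `FlattenedFieldSyntheticWriterHelper` code: it builds an in-memory tree instead of streaming, treats every value as a plain string, and does no JSON escaping. The class and method names (`ScalarObjectMismatchSketch`, `rebuild`, `toJson`) are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the Elasticsearch implementation.
@SuppressWarnings("unchecked")
class ScalarObjectMismatchSketch {

    // Rebuilds a nested object from dotted paths. If a key on the way down
    // already holds a scalar, the rest of the path is kept as one dotted key.
    static Map<String, Object> rebuild(Map<String, String> flattened) {
        // Sorting puts "a" before "a.b", so the scalar is seen before the object's keys.
        Map<String, String> sorted = new TreeMap<>(flattened);
        Map<String, Object> root = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : sorted.entrySet()) {
            String[] tokens = entry.getKey().split("\\.");
            Map<String, Object> current = root;
            for (int i = 0; i < tokens.length; i++) {
                Object existing = current.get(tokens[i]);
                if (i == tokens.length - 1) {
                    // Last token: write the scalar value.
                    current.put(tokens[i], entry.getValue());
                } else if (existing instanceof String) {
                    // Scalar/object mismatch: concatenate the remaining tokens.
                    String joined = String.join(".", Arrays.copyOfRange(tokens, i, tokens.length));
                    current.put(joined, entry.getValue());
                    break;
                } else {
                    // No conflict: open (or reuse) the nested object and descend.
                    if (existing == null) {
                        existing = new LinkedHashMap<String, Object>();
                        current.put(tokens[i], existing);
                    }
                    current = (Map<String, Object>) existing;
                }
            }
        }
        return root;
    }

    // Minimal JSON rendering; assumes keys and values need no escaping.
    static String toJson(Map<String, Object> node) {
        StringBuilder sb = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, Object> e : node.entrySet()) {
            sb.append(separator).append('"').append(e.getKey()).append("\":");
            if (e.getValue() instanceof Map) {
                sb.append(toJson((Map<String, Object>) e.getValue()));
            } else {
                sb.append('"').append(e.getValue()).append('"');
            }
            separator = ",";
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("field.path.to", "10");
        input.put("field.path.to.foo", "bar");
        // prints {"field":{"path":{"to":"10","to.foo":"bar"}}}
        System.out.println(toJson(rebuild(input)));
    }
}
```

The conflict is detected at whatever depth the scalar sits, which is the point of the question above: `field` and `path` are opened as objects, and only `to.foo` is concatenated.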
.../main/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldSyntheticWriterHelper.java
// THEN
assertEquals(
-    "{\"a\":\"value_a\",\"a\":{\"b\":\"value_b\",\"b\":{\"c\":\"value_c\"},\"d\":\"value_d\"}}",
+    "{\"a\":\"value_a\",\"a.b\":\"value_b\",\"a.b.c\":\"value_c\",\"a.d\":\"value_d\"}",
Interestingly, there was already a test with a scalar/object mismatch, but it produced duplicate keys. When the XContent was converted to JSON, these duplicate keys originally threw an error, but now the duplicates are just dropped. (Something must have changed in the XContent code since the issue was opened to turn the error into deduplication.)
Pinging @elastic/es-storage-engine (Team:StorageEngine)
💔 Backport failed
You can use sqren/backport to manually backport by running
…ect mismatch (elastic#129600) There is an issue where for Flattened fields with synthetic source, if there is a key with a scalar value, and a duplicate key with an object value, one of the values will be left out of the produced synthetic source. This fixes the issue by replacing the object with paths to each of its keys. These paths consist of the concatenation of all keys going down to a given scalar, joined by a period. For example, they are of the form foo.bar.baz. This applies recursively, so that every value within the object, no matter how nested, will be accessible through a full specified path.
…ect mismatch (elastic#129600) There is an issue where for Flattened fields with synthetic source, if there is a key with a scalar value, and a duplicate key with an object value, one of the values will be left out of the produced synthetic source. This fixes the issue by replacing the object with paths to each of its keys. These paths consist of the concatenation of all keys going down to a given scalar, joined by a period. For example, they are of the form foo.bar.baz. This applies recursively, so that every value within the object, no matter how nested, will be accessible through a full specified path. (cherry picked from commit 245dc07) # Conflicts: # docs/reference/elasticsearch/mapping-reference/flattened.md
💚 All backports created successfully
…ect mismatch (elastic#129600) There is an issue where for Flattened fields with synthetic source, if there is a key with a scalar value, and a duplicate key with an object value, one of the values will be left out of the produced synthetic source. This fixes the issue by replacing the object with paths to each of its keys. These paths consist of the concatenation of all keys going down to a given scalar, joined by a period. For example, they are of the form foo.bar.baz. This applies recursively, so that every value within the object, no matter how nested, will be accessible through a full specified path. (cherry picked from commit 245dc07)
…ect mismatch (#129600) (#129792) There is an issue where for Flattened fields with synthetic source, if there is a key with a scalar value, and a duplicate key with an object value, one of the values will be left out of the produced synthetic source. This fixes the issue by replacing the object with paths to each of its keys. These paths consist of the concatenation of all keys going down to a given scalar, joined by a period. For example, they are of the form foo.bar.baz. This applies recursively, so that every value within the object, no matter how nested, will be accessible through a full specified path.
…lar/object mismatch (#129600) (#129794)
* Make flattened synthetic source concatenate object keys on scalar/object mismatch (#129600) There is an issue where for Flattened fields with synthetic source, if there is a key with a scalar value, and a duplicate key with an object value, one of the values will be left out of the produced synthetic source. This fixes the issue by replacing the object with paths to each of its keys. These paths consist of the concatenation of all keys going down to a given scalar, joined by a period. For example, they are of the form foo.bar.baz. This applies recursively, so that every value within the object, no matter how nested, will be accessible through a full specified path. (cherry picked from commit 245dc07)
* remove methods not available in java version
* skip testing console-result in docs
…lar/object mismatch (#129600) (#129793)
* Make flattened synthetic source concatenate object keys on scalar/object mismatch (#129600) There is an issue where for Flattened fields with synthetic source, if there is a key with a scalar value, and a duplicate key with an object value, one of the values will be left out of the produced synthetic source. This fixes the issue by replacing the object with paths to each of its keys. These paths consist of the concatenation of all keys going down to a given scalar, joined by a period. For example, they are of the form foo.bar.baz. This applies recursively, so that every value within the object, no matter how nested, will be accessible through a full specified path. (cherry picked from commit 245dc07) # Conflicts: # docs/reference/elasticsearch/mapping-reference/flattened.md
* remove methods not available in java version
* skip testing console-result in docs
@parkertimmins Did this fix get backported to the 9.0 branch? EDIT: I think it did. However, this PR still has the backport pending label.
@martijnvg Yep, it was backported to 8.18, 8.19, and 9.0. I'll go ahead and remove the
A bug was found in v9.0.2, where attempting to get the synthetic source of a flattened field produced error messages like:

The cause of the issue was that some object opening braces were missing from the synthetic source of the flattened field. So the JSON has more closing braces than opening braces, causing subsequent writes to go to the root context (hence the error message).

The bug is here: it should have an else clause with a break statement. This code is used to decide how many braces to open. The current path is compared against the previous path to see what objects have already been opened. For example, assume the previous path is

Here's some code with a test showing the behavior along with a fix: https://github.com/elastic/elasticsearch/compare/v9.0.2...parkertimmins:elasticsearch:parker/flattened-test-from-v9.0.2?expand=1
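To illustrate why the missing `break` matters, here is a minimal sketch of a shared-prefix comparison between the previous and current paths. It is not the v9.0.2 code: the class name, method names, and the example paths are all assumptions made for illustration.

```java
import java.util.List;

// Hypothetical sketch; not the Elasticsearch implementation.
class SharedPrefixSketch {

    // Counts how many leading segments the two paths share, that is, how many
    // objects are already open. Stops at the first segment that differs.
    static int sharedPrefixLength(List<String> previous, List<String> current) {
        int shared = 0;
        int limit = Math.min(previous.size(), current.size());
        for (int i = 0; i < limit; i++) {
            if (previous.get(i).equals(current.get(i))) {
                shared++;
            } else {
                break; // everything after the first difference needs a new object
            }
        }
        return shared;
    }

    // Same loop without the else/break, shown only for contrast. A segment that
    // happens to match after a mismatch is wrongly counted as already open.
    static int sharedPrefixLengthWithoutBreak(List<String> previous, List<String> current) {
        int shared = 0;
        int limit = Math.min(previous.size(), current.size());
        for (int i = 0; i < limit; i++) {
            if (previous.get(i).equals(current.get(i))) {
                shared++;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        List<String> previous = List.of("a", "b", "c");
        List<String> current = List.of("a", "x", "c");
        System.out.println(sharedPrefixLength(previous, current));             // prints 1
        System.out.println(sharedPrefixLengthWithoutBreak(previous, current)); // prints 2
    }
}
```

With the variant that lacks the `break`, a writer would believe two objects are already open and emit one opening brace too few, which matches the symptom described above.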
There was a bug in previous versions where flattened fields would produce incorrect synthetic source with too few opening braces. This bug was fixed as a side effect of elastic#129600. Adding this test to confirm. See elastic#129600 for a full explanation.
There is an issue where, for flattened fields with synthetic source, if there is a key with a scalar value and a duplicate key with an object value, one of the values will be left out of the produced synthetic source.
This fixes the issue by replacing the problematic object with paths to each of its keys. These paths consist of the concatenation of all keys going down to a given scalar, joined by a period (`.`). For example, they are of the form `foo.bar.baz`. This applies recursively, so that every value within the object, no matter how nested, will be accessible through a fully specified path.
For example, if the following flattened field value is indexed:
The following synthetic source will be produced:
Fixes #122936
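As a rough illustration of the path concatenation described in this PR, the sketch below recursively turns a nested object into dotted paths such as `foo.bar.baz`. It is not taken from the PR: the class name `DottedPathSketch`, the method names, and the sample keys are hypothetical, and values are simply converted with `String.valueOf`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the Elasticsearch implementation.
class DottedPathSketch {

    // Returns one entry per scalar, keyed by all keys leading to it joined by '.'.
    static Map<String, String> flatten(String prefix, Map<String, Object> object) {
        Map<String, String> paths = new LinkedHashMap<>();
        collect(prefix, object, paths);
        return paths;
    }

    @SuppressWarnings("unchecked")
    private static void collect(String prefix, Map<String, Object> object, Map<String, String> out) {
        for (Map.Entry<String, Object> e : object.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                // Nested object: recurse so deeper scalars get their full path.
                collect(path, (Map<String, Object>) e.getValue(), out);
            } else {
                out.put(path, String.valueOf(e.getValue()));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> bar = new LinkedHashMap<>();
        bar.put("baz", "v");
        Map<String, Object> foo = new LinkedHashMap<>();
        foo.put("bar", bar);
        foo.put("qux", "w");
        // prints {foo.bar.baz=v, foo.qux=w}
        System.out.println(flatten("foo", foo));
    }
}
```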