-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Adding NormalizeForStreamProcessor
#125699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @eyalkoren, I've created a changelog YAML for you. |
|
Pinging @elastic/es-data-management (Team:Data Management) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Eyal, I left some initial comments. I had a question about the way that we nest a document with an existing attributes field into the OTel attributes. Is this something we want to do? For example this doc:
{
"attributes": {
"a": "b",
"c": [1, 2, 3]
},
"log.level": "1234"
}Becomes, after processing:
{
"resource": {
"attributes": {}
},
"severity_text": "1234",
"attributes": {
"attributes.a": "b",
"attributes.c": [1, 2, 3]
}
}Is that the desired behavior for an existing attributes field?
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java
Outdated
Show resolved
Hide resolved
modules/ingest-ecs/src/test/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessorTests.java
Outdated
Show resolved
Hide resolved
|
Answering the general question:
We started off by trying to be more "clever" about this and merge OTel with non-OTel. Then we had to handle lots of corner cases, like:
And so forth. So last week we decided to change the way we think about it: a document is either sent by an OTel-compliant shipper, or not. If not, no reason to treat the original fields as if they have the OTel sematics. So even if it has a field that has an OTel name, we can consider it to be by chance and namespace it. If that's so- no reason to complicate things for the unlikely event where non-OTel documents contain fields with intended OTel semantics. |
Co-authored-by: Lee Hinman <[email protected]>
Co-authored-by: Lee Hinman <[email protected]>
Co-authored-by: Lee Hinman <[email protected]>
…to ECS-namespacing-processor
EcsNamespaceProcessorNormalizeToStreamProcessor
NormalizeToStreamProcessorNormalizeForStreamProcessor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened a ticket for us to track adding the periodic ci to run the otel-semver-crawler test bits, but I'm fine with that staying disabled for now on this PR.
I see also that there's a perhaps unfinished conversation about the wording of the documentation. I'm okay if that's either resolved here and now, or if we fuss with the wording of things in a follow up PR.
✅
|
I believe it's the case that this PR is intended to be backported to 8.19.0, so I'm going to add that label, and also add the label for attempting to automatically backport it. Feel free to remove those labels if this is not actually intended for 8.19.0, though. |
|
This should go to 8.19 |
|
CI keeps being unhappy with random stuff, I don't think related to this PR contents. Regarding the followup task of running the test nightly - do you want me to open a GH issue with summary of what I already discussed with the delivery team? |
For my part, yes. 👍 (And quick, it's green!) |
I'll reach out to you offline. |
…or` (#134524) Fixes field querying and writing logic for NormalizeForStreamProcessor so that it can function on both `classic` and `flexible` ingest pipeline access patterns. NormalizeForStreamProcessor was added in #125699 with support for the default ingest node field access logic (now known as `classic` mode). We have since added support for the `flexible` access pattern in ingest pipelines, which allows for querying dotted field names and writing dotted field names when parent path elements are missing. The NormalizeForStreamProcessor was written with the classic access pattern in mind. The processor was designed to look for singular field names and to rely on the classic field writing logic which creates intermediate parent objects when setting a value that is nested in the document. When flexible mode was enabled, the logic did not anticipate dotted field names that could be inconsistently accessible from the source map at certain points in the path notation. Further, the flexible access pattern does not create intermediate parent objects like before. A secondary renaming method was added to take these changes into account.
…or` (elastic#134524) Fixes field querying and writing logic for NormalizeForStreamProcessor so that it can function on both `classic` and `flexible` ingest pipeline access patterns. NormalizeForStreamProcessor was added in elastic#125699 with support for the default ingest node field access logic (now known as `classic` mode). We have since added support for the `flexible` access pattern in ingest pipelines, which allows for querying dotted field names and writing dotted field names when parent path elements are missing. The NormalizeForStreamProcessor was written with the classic access pattern in mind. The processor was designed to look for singular field names and to rely on the classic field writing logic which creates intermediate parent objects when setting a value that is nested in the document. When flexible mode was enabled, the logic did not anticipate dotted field names that could be inconsistently accessible from the source map at certain points in the path notation. Further, the flexible access pattern does not create intermediate parent objects like before. A secondary renaming method was added to take these changes into account.
…or` (elastic#134524) Fixes field querying and writing logic for NormalizeForStreamProcessor so that it can function on both `classic` and `flexible` ingest pipeline access patterns. NormalizeForStreamProcessor was added in elastic#125699 with support for the default ingest node field access logic (now known as `classic` mode). We have since added support for the `flexible` access pattern in ingest pipelines, which allows for querying dotted field names and writing dotted field names when parent path elements are missing. The NormalizeForStreamProcessor was written with the classic access pattern in mind. The processor was designed to look for singular field names and to rely on the classic field writing logic which creates intermediate parent objects when setting a value that is nested in the document. When flexible mode was enabled, the logic did not anticipate dotted field names that could be inconsistently accessible from the source map at certain points in the path notation. Further, the flexible access pattern does not create intermediate parent objects like before. A secondary renaming method was added to take these changes into account.
Namespacing algorithm [EDIT 1]:
resourceexists as a key and the value is a mapresourceeither doesn't contain anattributesfield, or contains anattributesfield of type mapscopeis either missing or a mapattributesis either missing or a mapbodyis either missing or a mapbodyeither doesn't contain atextfield, or contains atextfield of typeStringbodyeither doesn't contain astructuredfield, or contains astructuredfield that is not of typeStringattributesmapresourcemap with one entry of whichattributesis the key and a new map as its valueattributesmap:attributes,resource,span_id,body,severity_textandtrace_idattributesandresourcemaps as top level fieldsspan.id,log.level) to OTel-compliant names: for each, look for a value first in the nested form and if not found look for a top level dotted field. The first value that is found is used for the renamed field@timestamp,trace_id,span_id,severity_text,body,attributes,resourceandscopeto the newattributesmapattributesattributestoresource.attributes