[Exporterhelper Exporter] Enable metadata population in exporterhelper when sending_queue is enabled #14139
New changelog entry (`@@ -0,0 +1,28 @@`):

```yaml
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. otlpreceiver)
component: pkg/exporterhelper

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Add `metadata_keys` configuration to `sending_queue` to partition batches by client metadata"

# One or more tracking issues or pull requests related to the change
issues: [14139]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
  The `metadata_keys` configuration option is now available in the `sending_queue` section for all exporters
  using queue batch functionality. When specified, batches are partitioned based on the values of the listed
  metadata keys, allowing separate batching per metadata partition. This feature is automatically configured
  when using `exporterhelper.WithQueue()`.

# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user, api]
```
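As an illustrative end-user configuration sketch (the exporter name, endpoint, and surrounding fields are hypothetical; only `metadata_keys` under `sending_queue` comes from this change):

```yaml
exporters:
  otlp:
    endpoint: backend:4317  # hypothetical backend
    sending_queue:
      enabled: true
      # Partition queued batches by these client metadata keys; one batcher
      # is used per distinct combination of values. Entries are
      # case-insensitive and duplicates fail validation.
      metadata_keys:
        - tenant_id
        - x-scope-orgid
```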
Changes to the queue batch `Config` (`config.go`):

```diff
@@ -6,6 +6,7 @@ package queuebatch // import "go.opentelemetry.io/collector/exporter/exporterhelper/internal/queuebatch"
 import (
 	"errors"
 	"fmt"
+	"strings"
 	"time"

 	"go.opentelemetry.io/collector/component"

@@ -46,6 +47,16 @@ type Config struct {
 	// BatchConfig it configures how the requests are consumed from the queue and batch together during consumption.
 	Batch configoptional.Optional[BatchConfig] `mapstructure:"batch"`

+	// MetadataKeys is a list of client.Metadata keys that will be used to partition
+	// the data into batches. If this setting is empty, a single batcher instance
+	// will be used. When this setting is not empty, one batcher will be used per
+	// distinct combination of values for the listed metadata keys.
+	//
+	// Empty value and unset metadata are treated as distinct cases.
+	//
+	// Entries are case-insensitive. Duplicated entries will trigger a validation error.
+	MetadataKeys []string `mapstructure:"metadata_keys"`
```
Review thread on `MetadataKeys` (marked resolved):

Member:
We need to provide an option to split batches using keys from other sources, not only request-context metadata, but also resource attributes, instrumentation name, etc. #12795 has more details. This interface would conflict with that functionality. I'm not sure it's worth having two configuration options that overlap or conflict with each other. I'd prefer to make progress toward #12795 instead. @axw WDYT?

Contributor:
Thanks for the reminder @dmitryax, good points. I'd also love to make some progress on that; I'll try to have a play with some ideas today.
Why would it conflict? Could we not have both? i.e. config for partitioning by metadata, as well as, say, support for partitioner extensions? It should also be possible to compose them, as I suggested in another comment thread, e.g. partition first by metadata, and then partition those batches via an extension. (Playing devil's advocate a little bit; not necessarily saying that's what we should do.)

Contributor:
I spent a while looking into this today, and I'm struggling to see a good path for adding resource/scope/record-level partitioning to exporterhelper. I previously had a bit of a warped understanding of the Partitioner interface; I had in mind that it was operating on […]. I started looking into adding partitioning before we even get to batching, but stopped pretty quickly because that's effectively equivalent to a processor. So I'm back to thinking we should: […]
@dmitryax if you have ideas about how we can achieve it all in exporterhelper I'd be keen to hear them.
```diff
 }

 func (cfg *Config) Unmarshal(conf *confmap.Conf) error {

@@ -90,6 +101,16 @@ func (cfg *Config) Validate() error {
 		}
 	}

+	// Validate metadata_keys for duplicates (case-insensitive)
+	uniq := map[string]bool{}
+	for _, k := range cfg.MetadataKeys {
+		l := strings.ToLower(k)
+		if _, has := uniq[l]; has {
+			return fmt.Errorf("duplicate entry in metadata_keys: %q (case-insensitive)", l)
+		}
+		uniq[l] = true
+	}

 	return nil
 }
```
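The duplicate check can be exercised in isolation. The following standalone sketch (stdlib only; `validateMetadataKeys` is a name invented here for illustration) mirrors the case-insensitive logic added to `Validate`:

```go
package main

import (
	"fmt"
	"strings"
)

// validateMetadataKeys mirrors the case-insensitive duplicate check added to
// Config.Validate: entries that differ only by letter case are rejected.
func validateMetadataKeys(keys []string) error {
	uniq := map[string]bool{}
	for _, k := range keys {
		l := strings.ToLower(k)
		if uniq[l] {
			return fmt.Errorf("duplicate entry in metadata_keys: %q (case-insensitive)", l)
		}
		uniq[l] = true
	}
	return nil
}

func main() {
	fmt.Println(validateMetadataKeys([]string{"tenant_id", "X-Scope-OrgID"})) // <nil>
	fmt.Println(validateMetadataKeys([]string{"tenant_id", "Tenant_ID"}))     // duplicate entry error
}
```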
New file (`@@ -0,0 +1,88 @@`), the metadata-keys partitioner in package `queuebatch`:

```go
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package queuebatch // import "go.opentelemetry.io/collector/exporter/exporterhelper/internal/queuebatch"

import (
	"bytes"
	"context"
	"fmt"
	"slices"

	"go.opentelemetry.io/collector/client"
	"go.opentelemetry.io/collector/exporter/exporterhelper/internal/request"
)

// metadataKeysPartitioner partitions requests based on client metadata keys.
type metadataKeysPartitioner struct {
	keys []string
}

// NewMetadataKeysPartitioner creates a new partitioner that partitions requests
// based on the specified metadata keys. If keys is empty, returns nil.
func NewMetadataKeysPartitioner(keys []string) Partitioner[request.Request] {
	if len(keys) == 0 {
		return nil
	}
	return &metadataKeysPartitioner{keys: keys}
}

// GetKey returns a partition key based on the metadata keys configured.
func (p *metadataKeysPartitioner) GetKey(ctx context.Context, _ request.Request) string {
	var kb bytes.Buffer
	meta := client.FromContext(ctx).Metadata

	var afterFirst bool
	for _, k := range p.keys {
		if values := meta.Get(k); len(values) != 0 {
			if afterFirst {
				kb.WriteByte(0)
			}
			kb.WriteString(k)
			afterFirst = true
			for _, val := range values {
				kb.WriteByte(0)
				kb.WriteString(val)
			}
		}
	}
	return kb.String()
}

// NewMetadataKeysMergeCtx creates a merge function for contexts that merges
// metadata based on the specified keys. If keys is empty, returns nil.
func NewMetadataKeysMergeCtx(keys []string) func(context.Context, context.Context) context.Context {
	if len(keys) == 0 {
		return nil
	}
	return func(ctx1, ctx2 context.Context) context.Context {
		m1 := client.FromContext(ctx1).Metadata
		m2 := client.FromContext(ctx2).Metadata

		m := make(map[string][]string, len(keys))
		for _, key := range keys {
			v1 := m1.Get(key)
			v2 := m2.Get(key)
			if len(v1) == 0 && len(v2) == 0 {
				continue
			}

			// Since mergeCtx is based on the partition key, both contexts MUST
			// carry the same values for the partition keys. If they differ,
			// fail fast and dramatically.
			if !slices.Equal(v1, v2) {
				panic(fmt.Errorf(
					"unexpected client metadata found when merging context for key %s", key,
				))
			}
			m[key] = v1
		}
		return client.NewContext(
			context.Background(),
			client.Info{Metadata: client.NewMetadata(m)},
		)
	}
}
```
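To make the key shape concrete, here is a standalone sketch of the same NUL-separated key construction. It substitutes a plain `map[string][]string` for `client.Metadata` (so lookups here are exact-match, whereas the real metadata accessor may normalize keys differently), and `partitionKey` is a name invented for illustration:

```go
package main

import "fmt"

// partitionKey mirrors metadataKeysPartitioner.GetKey, using a plain map in
// place of client.Metadata. Key names and values are joined with NUL
// separators so distinct metadata combinations cannot collide, and unset
// keys are skipped entirely.
func partitionKey(keys []string, meta map[string][]string) string {
	var out []byte
	afterFirst := false
	for _, k := range keys {
		values := meta[k]
		if len(values) == 0 {
			continue
		}
		if afterFirst {
			out = append(out, 0)
		}
		out = append(out, k...)
		afterFirst = true
		for _, v := range values {
			out = append(out, 0)
			out = append(out, v...)
		}
	}
	return string(out)
}

func main() {
	meta := map[string][]string{"tenant_id": {"acme"}, "region": {"us", "eu"}}
	fmt.Printf("%q\n", partitionKey([]string{"tenant_id", "region"}, meta))
	// → "tenant_id\x00acme\x00region\x00us\x00eu"
}
```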
FWIW, I think we should probably compose them. But we can always do that in a follow-up; it would be an enhancement over what you have implemented.

Yes, it was quicker just to error for now. Do you want me to create a follow-up once this is merged? Some details to consider:
- Which logic to merge first, probably the custom logic?
- How to check metadata within each custom partition against the overall `metadata_keys`, if those exist?
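The composition idea discussed above (partition first by metadata, then refine each batch via custom logic) can be sketched generically. Everything here is hypothetical: the minimal `Partitioner` shape is simplified from the real interface (which also takes a `context.Context` and a request), and `compose` is invented for illustration:

```go
package main

import "fmt"

// Partitioner is a hypothetical, simplified shape of the interface discussed
// in the thread: GetKey derives a partition key for a request.
type Partitioner[T any] interface {
	GetKey(req T) string
}

// keyFunc adapts a plain function to the Partitioner interface.
type keyFunc[T any] func(T) string

func (f keyFunc[T]) GetKey(req T) string { return f(req) }

// compose chains two partitioners: the combined key concatenates both keys
// with a NUL separator, so requests are split by the first partitioner and
// then further split by the second within each partition.
func compose[T any](first, second Partitioner[T]) Partitioner[T] {
	return keyFunc[T](func(req T) string {
		return first.GetKey(req) + "\x00" + second.GetKey(req)
	})
}

type req struct{ tenant, signal string }

func main() {
	byTenant := keyFunc[req](func(r req) string { return r.tenant })
	bySignal := keyFunc[req](func(r req) string { return r.signal })
	p := compose[req](byTenant, bySignal)
	fmt.Printf("%q\n", p.GetKey(req{tenant: "acme", signal: "traces"}))
	// → "acme\x00traces"
}
```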