Skip to content

Commit 8aa1d27

Browse files
Liudmila Molkovatraskpellaredjmacdadriangb
authored
OTEP: Extending attributes to support complex values (#4485)
Related to #4468, #4414 This OTEP suggests a path forward on how to support complex attributes. It proposes to: - allow complex attributes on all signals by default on the API and SDK level - discourage them on metrics, resources, instrumentation scope, and identifying attributes on entities. If complex attribute is provided, it's probably by mistake, but we let backend decide. - provide guidance in semconv to default to flat attributes whenever possible and develop conventions with an assumption that complex attributes are not indexed, not queryable and not efficient. - documents how limits can apply to complex attributes (leaf nodes are counted towards the attribute count limit and are truncated based on the value limit) and suggests other ways to protect from users misusing complex attributes. Having experimental phase for complex attributes limits the impact of this change and gives backends a graceful period to update. --------- Co-authored-by: Trask Stalnaker <[email protected]> Co-authored-by: Robert Pająk <[email protected]> Co-authored-by: Joshua MacDonald <[email protected]> Co-authored-by: Adrian Garcia Badaracco <[email protected]> Co-authored-by: Alan West <[email protected]> Co-authored-by: jason plumb <[email protected]>
1 parent c041658 commit 8aa1d27

File tree

1 file changed

+311
-0
lines changed

1 file changed

+311
-0
lines changed
Lines changed: 311 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,311 @@
1+
# Extending attributes to support complex values
2+
3+
<!-- toc -->
4+
5+
- [Glossary](#glossary)
6+
- [Why?](#why)
7+
* [Why do we want complex attributes on spans?](#why-do-we-want-complex-attributes-on-spans)
8+
* [Why do we want to extend standard attributes?](#why-do-we-want-to-extend-standard-attributes)
9+
* [Why doesn't this require a major version bump?](#why-doesnt-this-require-a-major-version-bump)
10+
- [How](#how)
11+
* [API](#api)
12+
* [SDK](#sdk)
13+
+ [`AnyValue` implementation notes](#anyvalue-implementation-notes)
14+
+ [Attribute limits](#attribute-limits)
15+
* [Exporters](#exporters)
16+
* [Semantic conventions](#semantic-conventions)
17+
* [Proto](#proto)
18+
- [Trade-offs and mitigations](#trade-offs-and-mitigations)
19+
* [Backends don't support `AnyValue` attributes](#backends-dont-support-anyvalue-attributes)
20+
* [Arbitrary objects are dangerous](#arbitrary-objects-are-dangerous)
21+
- [Prototypes](#prototypes)
22+
- [Future possibilities](#future-possibilities)
23+
* [Configurable OTLP exporter behavior (both SDK and Collector)](#configurable-otlp-exporter-behavior-both-sdk-and-collector)
24+
* [Record pointer to repetitive data](#record-pointer-to-repetitive-data)
25+
- [Backend research](#backend-research)
26+
27+
<!-- tocstop -->
28+
29+
## Glossary
30+
31+
In the context of this OTEP, we use the following terminology:
32+
33+
- **Simple attributes** are attributes with primitive types or homogeneous arrays of primitives.
34+
Their types are known in advance and correspond to the top-level `string_value`,
35+
`bool`, `int64`, `double`, and `ArrayValue` of those types in the
36+
[AnyValue proto definition](https://github.com/open-telemetry/opentelemetry-proto/blob/42319f8b5bf330f7c3dd4a097384f9f6d5467450/opentelemetry/proto/common/v1/common.proto#L28-L40).
37+
These are currently referred to as *standard* attributes in the
38+
[specification](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.44.0/specification/common/README.md)
39+
40+
- **Complex attributes** include all other values supported by the `AnyValue` proto,
41+
such as null (empty) value, maps, heterogeneous arrays, and combinations of those with primitives.
42+
Byte arrays are also considered complex attributes, as they are excluded from
43+
the current definition of *standard* attributes.
44+
45+
- **AnyValue** represents the type of *any* (simple or complex) attribute value on
46+
the API, SDK, and [proto level](https://github.com/open-telemetry/opentelemetry-proto/blob/42319f8b5bf330f7c3dd4a097384f9f6d5467450/opentelemetry/proto/common/v1/common.proto#L28-L40).
47+
48+
It's also known as `any` in the [Log data model](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.44.0/specification/logs/data-model.md#type-any)
49+
50+
This distinction between simple and complex attributes is not intended for the
51+
spec language, but is helpful here because the OTEP proposes including both *simple*
52+
and *complex* attributes in the set of *standard* attributes.
53+
54+
## Why?
55+
56+
### Why do we want complex attributes on spans?
57+
58+
There are a number of reasons why we want to allow complex attributes on spans:
59+
60+
- Emerging semantic conventions have demonstrated the usefulness of
61+
including complex attributes on spans, such as
62+
[capturing prompts and completions](https://github.com/open-telemetry/semantic-conventions/pull/2179)
63+
for Generative AI
64+
and [capturing request errors](https://graphql.org/learn/response/#request-errors)
65+
for GraphQL.
66+
- Many users already add complex attributes to spans by JSON-encoding them.
67+
Supporting complex attributes natively enables future possibilities like
68+
collector transformations, better attribute truncation, and native
69+
storage of complex attributes in backends.
70+
- During the span event deprecation process, we found that some people
71+
use span events to represent complex data on spans.
72+
Providing support for complex attributes on spans would offer a more natural
73+
way to store this data.
74+
- During the event stabilization process, it became clear that
75+
[spans and events are often thought of as being conceptual
76+
twins](https://opentelemetry.io/blog/2025/opentelemetry-logging-and-you/#how-is-this-different-from-other-signals),
77+
and that the choice between modeling something as a span or an event should not
78+
be influenced by whether complex attributes are needed
79+
(given that logs already support complex attributes).
80+
81+
### Why do we want to extend standard attributes?
82+
83+
Instead of introducing a second set of "extended" attributes that can be used on
84+
spans and events, we propose to extend the standard attributes.
85+
86+
Having multiple attribute sets across the API
87+
[creates ergonomic challenges](https://github.com/open-telemetry/opentelemetry-specification/issues/4201).
88+
While there are some mitigations (as demonstrated in
89+
[opentelemetry-java#7123](https://github.com/open-telemetry/opentelemetry-java/pull/7123) and
90+
[opentelemetry-go#6180](https://github.com/open-telemetry/opentelemetry-go/pull/6180)),
91+
extending the standard attributes provides a more seamless and user-friendly API experience.
92+
93+
### Why doesn't this require a major version bump?
94+
95+
Currently, the SDK specification has a clause that says extending
96+
the set of standard attribute would be
97+
[considered a breaking change](/specification/common/README.md#standard-attribute).
98+
99+
We believe that removing this clause and extending standard
100+
attributes can be done gracefully across the OpenTelemetry ecosystem
101+
without requiring a major version bump:
102+
103+
- Language SDKs can implement this without breaking their backwards
104+
compatibility guarantees (e.g., [Java's](https://github.com/open-telemetry/opentelemetry-java/blob/main/VERSIONING.md)).
105+
- While backends may still need to add support for complex attributes,
106+
this is the case with the introduction of any new OTLP feature.
107+
- Bumping the OTLP minor version is already the normal communication mechanism
108+
for this kind of change.
109+
- SDKs will be required to only emit complex attributes under that OTLP version
110+
or later.
111+
- Stable exporters will be prohibited from emitting complex attributes by default on signals
112+
other than Logs until at least 6 months after this OTEP is merged.
113+
- A reasonably straightforward implementation option for backends is to just
114+
JSON serialize complex attributes and store them as strings.
115+
116+
## How
117+
118+
### API
119+
120+
Existing APIs that create or add attributes will be extended to support
121+
*complex attributes*.
122+
123+
It's RECOMMENDED to expose an `AnyValue` type - the API representing complex or
124+
simple attribute value for type checking, ergonomics, and performance reasons.
125+
126+
Exposing multiple types of attribute sets is NOT RECOMMENDED, such as having "ExtendedAttributes" in addition to "Attributes".
127+
128+
OTel API MUST support setting complex attributes on spans, logs, profiles,
129+
span links, and as descriptive entity attributes.
130+
131+
OTel API MAY support setting complex attributes on metrics, resources,
132+
instrumentation scope, span events, and as identifying entity attributes.
133+
134+
> [!NOTE]
135+
> "MAY" is used here instead of "MUST" to give flexibility to dynamically
136+
> typed language APIs since there are no concrete use cases at this time
137+
> requiring complex attributes in these areas.
138+
>
139+
> Most likely statically typed languages will choose to support
140+
> setting complex attributes uniformly everywhere.
141+
>
142+
> This requirement level could change from "MAY" to "MUST" in the future
143+
> if we uncover use cases for complex attributes in these areas.
144+
145+
API documentation and spec language around complex attributes SHOULD include
146+
language similar to this:
147+
148+
> Simple attributes SHOULD be used whenever possible. Instrumentations SHOULD
149+
> assume that backends do not index individual properties of complex attributes,
150+
> that querying or aggregating on such properties is inefficient and complicated,
151+
> and that reporting complex attributes carries higher performance overhead.
152+
153+
### SDK
154+
155+
OTel SDK MUST support setting complex attributes on spans, logs, profiles,
156+
span links, and as descriptive entity attributes.
157+
158+
OTel SDK MAY support setting complex attributes on metrics, exemplars, resources,
159+
instrumentation scope, span events, and as identifying entity attributes.
160+
161+
> [!NOTE]
162+
> "MAY" is used here instead of "MUST" to give flexibility to dynamically
163+
> typed language SDKs since there are no concrete use cases at this time
164+
> requiring complex attributes in these areas.
165+
>
166+
> Most likely statically typed languages will choose to support
167+
> setting complex attributes uniformly everywhere.
168+
>
169+
> This requirement level could change from "MAY" to "MUST" in the future
170+
> if we uncover use cases for complex attributes in these areas.
171+
172+
The SDK MUST support reading and modifying complex attributes during processing
173+
whenever they are allowed on the API surface.
174+
175+
#### `AnyValue` implementation notes
176+
177+
If the API supports `AnyValue` attributes on metrics, instrumentation scopes,
178+
resources, or identifying entity attributes where attribute equality is extensively
179+
used to identify tracers, meters, loggers, or time series, the `AnyValue`
180+
implementation MUST provide deep equality checks.
181+
182+
For `AnyValue` instances containing one or more lists of key-value pairs:
183+
184+
- the equality of `AnyValue` instances MUST NOT be affected by the ordering of
185+
the key-value pairs within the list.
186+
- the equality behavior is unspecified if duplicate keys are present.
187+
188+
#### Attribute limits
189+
190+
Complex attribute limits are to be defined separately from this OTEP and apply to all
191+
signals where complex attributes are allowed.
192+
193+
This is tracked in [#4487](https://github.com/open-telemetry/opentelemetry-specification/issues/4487)
194+
195+
### Exporters
196+
197+
OTLP exporter SHOULD pass `AnyValue` attributes to the endpoint.
198+
199+
Exporters for protocols that do not natively support complex values, such as Prometheus,
200+
SHOULD represent complex values as JSON-encoded strings following
201+
[attribute specification](/specification/common/README.md#attribute).
202+
203+
When serializing `AnyValue` objects to JSON, it is RECOMMENDED to sort lists
204+
of key-value pairs lexicographically by key and apply additional settings that
205+
enhance serialization stability.
206+
207+
### Semantic conventions
208+
209+
Semantic conventions will be updated with the following guidance:
210+
211+
- Simple attributes SHOULD be used whenever possible. Semantic conventions SHOULD
212+
assume that backends do not index individual properties of complex attributes,
213+
that querying or aggregating on such properties is inefficient and complicated,
214+
and that reporting complex attributes carries higher performance overhead.
215+
216+
- Complex attributes that are likely to be large (when serialized to a string) SHOULD
217+
be added as *opt-in* attributes. *Large* will be defined based on common backend
218+
attribute limits, typically around 4–16 KB.
219+
220+
- Complex attributes MUST NOT be used on metrics, resources, instrumentation scopes,
221+
or as identifying attributes on entities.
222+
223+
### Proto
224+
225+
OTLP uses `AnyValue` attributes on all signals, so the changes would be limited
226+
to updating comments like [this one](https://github.com/open-telemetry/opentelemetry-proto/blob/be5d58470429d0255ffdd49491f0815a3a63d6ef/opentelemetry/proto/trace/v1/trace.proto#L209-L213)
227+
and adding changelog record.
228+
229+
## Trade-offs and mitigations
230+
231+
### Backends don't support `AnyValue` attributes
232+
233+
While it should be possible to record complex data in telemetry, many backends do not
234+
support it, which can result in individual complex attributes being dropped.
235+
236+
We mitigate this through:
237+
238+
- Introducing new APIs in the experimental parts of the OTel API which will limit
239+
the impact of unsupported attribute types to early adopters, while giving
240+
backends time to add support.
241+
242+
- Semantic conventions guidance that limits usage of complex attributes.
243+
244+
- Existing collector transformation processor that can drop, flatten, serialize,
245+
or truncate complex attributes using OTTL.
246+
247+
### Arbitrary objects are dangerous
248+
249+
Allowing arbitrary objects as attributes is convenient but increases the risk of
250+
including large, sensitive, mutable, non-serializable, or otherwise problematic
251+
data in telemetry.
252+
253+
OTel SDKs that provide convenience to convert arbitrary objects to `AnyValue`
254+
SHOULD limit supported types to primitives, arrays, standard library collections,
255+
named tuples, JSON objects, and similar structures following
256+
[mapping to OTLP AnyValue](/specification/common/attribute-type-mapping.md#converting-to-anyvalue).
257+
258+
Falling back to a string representation of unknown objects is RECOMMENDED to
259+
minimize the risk of unintentional use of complex attributes.
260+
261+
Prior art on AnyValue conversion: [Go](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/72cccd8065dcfd84b69f34d8cb6f349a547fedce/bridges/otelslog/convert.go#L20),
262+
[.NET](https://github.com/open-telemetry/opentelemetry-dotnet/blob/71abd4169b4b6c672343b37c32e3337bc227ed32/src/OpenTelemetry/Logs/ILogger/OpenTelemetryLogger.cs#L134),
263+
[Python](https://github.com/open-telemetry/opentelemetry-python/blob/00329e07fb01d7c3e43bb513fe9be3748745c52e/opentelemetry-api/src/opentelemetry/attributes/__init__.py#L121)
264+
265+
## Prototypes
266+
267+
- [opentelemetry-python#4587](https://github.com/open-telemetry/opentelemetry-python/pull/4587)
268+
- [opentelemetry-go#6809](https://github.com/open-telemetry/opentelemetry-go/pull/6180)
269+
270+
## Future possibilities
271+
272+
### Configurable OTLP exporter behavior (both SDK and Collector)
273+
274+
The OTLP exporter behavior for complex attributes can be made customizable on a per-signal
275+
basis, allowing complex attributes to be:
276+
277+
- passed through (the default),
278+
- serialized to JSON, or
279+
- dropped
280+
281+
This option may be useful as a workaround for applications
282+
whose backend does not handle complex attribute types gracefully.
283+
284+
### Record pointer to repetitive data
285+
286+
Aggregating over structured data introduces performance overhead and additional complexity, such as requiring deep equality checks.
287+
288+
If the data is repetitive, it can be recorded once and assigned a unique identifier. Subsequent telemetry items can then reference this identifier instead of duplicating the data.
289+
290+
This approach reduces performance overhead and the volume of transmitted data and can be implemented incrementally as an optimization. It should not affect the instrumentation API or, more importantly, the user experience, including queries and dashboards.
291+
292+
## Backend research
293+
294+
See [the gist](https://gist.github.com/lmolkova/737ebba190b206a5d60bbc075fea538b)
295+
for additional details.
296+
297+
| Backend | Handles complex attributes gracefully? | Comments |
298+
| --------------------------------- | ----- | ------------------------------ |
299+
| Jaeger (OTLP) | :white_check_mark: | serializes to JSON string |
300+
| Prometheus with OTLP remote write | :white_check_mark: | serializes to JSON string |
301+
| Grafana Tempo (OTLP) | :white_check_mark: | serializes to JSON string, viewable but can't query using this attribute |
302+
| Grafana Loki (OTLP) | :white_check_mark: | flattens |
303+
| Aspire dashboard (OTLP) | :white_check_mark: | serializes to JSON string |
304+
| ClickHouse (collector exporter) | :white_check_mark: | serializes to JSON string, can parse JSON and query |
305+
| Honeycomb (OTLP) | :white_check_mark: | flattens if less than 5 layers deep, not array or binary data, JSON string otherwise |
306+
| Logfire (OTLP) | :white_check_mark: | stored as JSON, native support for JSON in queries |
307+
| New Relic (OTLP) | :white_check_mark: | drops the complex attribute | |
308+
| Splunk (OTLP and HEC exporter) | :white_check_mark: | flattens for logs (HEC), serializes to JSON string for traces and metrics (OTLP) |
309+
310+
> [!NOTE]
311+
> This list only reflects the behavior at the time of writing and may change in the future.

0 commit comments

Comments
 (0)