diff --git a/oteps/4485-extending-attributes-to-support-complex-values.md b/oteps/4485-extending-attributes-to-support-complex-values.md index 737c03732d4..95f1a0c6423 100644 --- a/oteps/4485-extending-attributes-to-support-complex-values.md +++ b/oteps/4485-extending-attributes-to-support-complex-values.md @@ -94,7 +94,7 @@ extending the standard attributes provides a more seamless and user-friendly API Currently, the SDK specification has a clause that says extending the set of standard attribute would be -[considered a breaking change](/specification/common/README.md#standard-attribute). +[considered a breaking change](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.44.0/specification/common/README.md#standard-attribute). We believe that removing this clause and extending standard attributes can be done gracefully across the OpenTelemetry ecosystem diff --git a/oteps/entities/0256-entities-data-model.md b/oteps/entities/0256-entities-data-model.md index 51d5a00faae..70d0fde0b5f 100644 --- a/oteps/entities/0256-entities-data-model.md +++ b/oteps/entities/0256-entities-data-model.md @@ -139,7 +139,7 @@ MAY change over the lifetime of the entity. MAY be empty. These attributes are not part of entity's identity.

Follows any +href="https://github.com/open-telemetry/opentelemetry-specification/blob/v1.44.0/specification/logs/data-model.md#type-any">any value definition in the OpenTelemetry spec - it can be a scalar value, byte array, an array or map of values. Arbitrary deep nesting of values for arrays and maps is allowed. @@ -682,7 +682,7 @@ There are a couple of reasons: ### Attribute Data Type The data model requires the Attributes field to use the extended -[any](../../specification/logs/data-model.md#type-any) +[any](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.44.0/specification/logs/data-model.md#type-any) attribute values, that allows more complex data types. This is different from the data type used by the Id field, which is more restricted in the shape. diff --git a/specification/common/README.md b/specification/common/README.md index 82f634d5dc9..c8b68026e37 100644 --- a/specification/common/README.md +++ b/specification/common/README.md @@ -15,43 +15,46 @@ path_base_for_github_subdir: +- [AnyValue](#anyvalue) +- [map](#mapstring-anyvalue) - [Attribute](#attribute) - * [Standard Attribute](#standard-attribute) - * [Attribute Limits](#attribute-limits) - + [Configurable Parameters](#configurable-parameters) - + [Exempt Entities](#exempt-entities) -- [Attribute Collections](#attribute-collections) + * [Attribute Collections](#attribute-collections) +- [Attribute Limits](#attribute-limits) + * [Configurable Parameters](#configurable-parameters) + * [Exempt Entities](#exempt-entities) -## Attribute - - +## AnyValue -An `Attribute` is a key-value pair, which MUST have the following properties: +`AnyValue` is either: -- The attribute key MUST be a non-`null` and non-empty string. - - Case sensitivity of keys is preserved. Keys that differ in casing are treated as distinct keys. -- The attribute value is either: - - A primitive type: string, boolean, double precision floating point (IEEE 754-1985) or signed 64 bit integer. 
- - An array of primitive type values. The array MUST be homogeneous, - i.e., it MUST NOT contain values of different types. +- a primitive type: string, boolean, double precision floating point + (IEEE 754-1985), or signed 64 bit integer, +- a homogeneous array of primitive type values. A homogeneous array MUST NOT + contain values of different types. +- a byte array. +- a heterogeneous array of `AnyValue`, +- a [`map`](#mapstring-anyvalue), +- an empty value (e.g. `null`, `undefined` in JavaScript/TypeScript, + `None` in Python, `nil` in Go/Ruby, etc.). -For protocols that do not natively support non-string values, non-string values SHOULD be represented as JSON-encoded strings. For example, the expression `int64(100)` will be encoded as `100`, `float64(1.5)` will be encoded as `1.5`, and an empty array of any type will be encoded as `[]`. +For protocols that do not natively support non-string values, non-string values +SHOULD be represented as JSON-encoded strings. For example, the expression +`int64(100)` will be encoded as `100`, `float64(1.5)` will be encoded as `1.5`, +and an empty array of any type will be encoded as `[]`. -Attribute values expressing a numerical value of zero, an empty string, or an -empty array are considered meaningful and MUST be stored and passed on to +AnyValues expressing an empty value, a numerical value of zero, an empty string, +or an empty array are considered meaningful and MUST be stored and passed on to processors / exporters. -Attribute values of `null` are not valid and attempting to set a `null` value is -undefined behavior. - -`null` values SHOULD NOT be allowed in arrays. However, if it is impossible to -make sure that no `null` values are accepted +While `null` is a valid attribute value, its use within homogeneous arrays +SHOULD generally be avoided unless language constraints make this impossible. +However, if it is impossible to make sure that no `null` values are accepted (e.g. 
- in languages that do not have appropriate compile-time type checking), -`null` values within arrays MUST be preserved as-is (i.e., passed on to span +`null` values within homogeneous arrays MUST be preserved as-is (i.e., passed on to processors / exporters as `null`). If exporters do not support exporting `null` values, they MAY replace those values by 0, `false`, or empty strings. This is required for map/dictionary structures represented as two arrays with @@ -59,6 +62,34 @@ indices that are kept in sync (e.g., two attributes `header_keys` and `header_va both containing an array of strings to represent a mapping `header_keys[i] -> header_values[i]`). +## map<string, AnyValue> + +`map<string, AnyValue>` is a map of string keys to `AnyValue` values. +The keys in the map are unique (duplicate keys are not allowed). + +Arbitrary deep nesting of values for arrays and maps is allowed (essentially +allows to represent an equivalent of a JSON object). + +The representation of the map is language-dependent. + +The implementation MUST by default ensure that the exported maps contain only unique keys. + +The implementation MAY have an option to allow exporting maps with duplicate keys +(e.g. for better performance). +If such option is provided, it MUST be documented that for many receivers, +handling of maps with duplicate keys is unpredictable and it is the users' +responsibility to ensure keys are not duplicate. + +## Attribute + + + +An `Attribute` is a key-value pair, which MUST have the following properties: + +- The attribute key MUST be a non-`null` and non-empty string. + - Case sensitivity of keys is preserved. Keys that differ in casing are treated as distinct keys. +- The attribute value MUST be one of types defined in [AnyValue](#anyvalue). + Attributes are equal when their keys and values are equal. See [Attribute Naming](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/naming.md#attributes) for naming guidelines.
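The `AnyValue`, map, and `Attribute` definitions added in this hunk can be illustrated with a minimal Python sketch. The helper names `is_any_value` and `to_string_value` are hypothetical and not part of any OpenTelemetry API. The sketch assumes that Python `dict`, `list`/`tuple`, `bytes`, and `None` stand in for maps, arrays, byte arrays, and the empty value, and it leaves the string encoding of byte arrays open because the text does not specify one.

```python
import json

# Bounds of a signed 64 bit integer, as required for integer values.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def is_any_value(value):
    """Return True if `value` fits the AnyValue shape (hypothetical helper)."""
    # Empty value, string, boolean, double, or byte array.
    if value is None or isinstance(value, (str, bool, float, bytes)):
        return True
    # Integers must fit in a signed 64 bit range (bool was handled above).
    if isinstance(value, int):
        return INT64_MIN <= value <= INT64_MAX
    # Arrays may nest arbitrarily and may be heterogeneous.
    if isinstance(value, (list, tuple)):
        return all(is_any_value(item) for item in value)
    # Maps need string keys; a dict already guarantees key uniqueness.
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_any_value(item)
            for key, item in value.items()
        )
    return False


def to_string_value(value):
    """JSON-encode non-string values for protocols that only carry strings.

    Assumption: byte arrays are out of scope here, since json.dumps cannot
    serialize bytes and the text does not say how to encode them.
    """
    if isinstance(value, str):
        return value
    return json.dumps(value)
```

With these helpers, `to_string_value(100)` yields `"100"` and `to_string_value([])` yields `"[]"`, matching the encoding examples in the text, while a map with a non-string key is rejected by `is_any_value`.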
@@ -68,22 +99,52 @@ See [Requirement Level](https://github.com/open-telemetry/semantic-conventions/b See [this document](attribute-type-mapping.md) to find out how to map values obtained outside OpenTelemetry into OpenTelemetry attribute values. -### Standard Attribute +### Attribute Collections -Attributes are used in various places throughout the OpenTelemetry data model. -We designate the [previous attribute section](#attribute) as the standard -attribute definition, in order to facilitate more intuitive and consistent API / -SDK design. +[Resources](../resource/sdk.md), +[Instrumentation Scopes](instrumentation-scope.md), +[Metric points](../metrics/data-model.md#metric-points), +[Spans](../trace/api.md#set-attributes), Span +[Events](../trace/api.md#add-events), Span +[Links](../trace/api.md#link) and +[Log Records](../logs/data-model.md), +contain a collection of attributes. -The standard attribute definition SHOULD be used to represent attributes in data -modeling unless there is a strong justification to diverge. For example, the Log -Data Model has an extended [attributes](../logs/data-model.md#field-attributes) -definition allowing values of [type `Any`](../logs/data-model.md#type-any). This -reflects that LogRecord attributes are expected to model data produced from -external log APIs, which do not necessarily have the same value type -restrictions as the standard attribute definition. +Implementation MUST by default ensure that the exported attribute collections +contain only unique keys. The enforcement of uniqueness may be performed +in a variety of ways as it best fits the limitations of the particular +implementation. + +Normally for the telemetry generated using OpenTelemetry SDKs the attribute +key-value pairs are set via an API that either accepts a single key-value pair +or a collection of key-value pairs. Setting an attribute with the same key as an +existing attribute SHOULD overwrite the existing attribute's value. 
See for +example Span's [SetAttribute](../trace/api.md#set-attributes) API. -### Attribute Limits +A typical implementation of [SetAttribute](../trace/api.md#set-attributes) API +will enforce the uniqueness by overwriting any existing attribute values pending +to be exported, so that when the Span is eventually exported the exporters see +only unique attributes. The OTLP format in particular requires that exported +Resources, Spans, Metric data points and Log Records contain only unique +attributes. + +Some other implementations may use a streaming approach where every +[SetAttribute](../trace/api.md#set-attributes) API call immediately results in +that individual attribute value being exported using a streaming wire protocol. +In such cases the enforcement of uniqueness will likely be the responsibility of +the recipient of this data. + +Implementations MAY have an option to allow exporting attribute collections +with duplicate keys (e.g. for better performance). +If such option is provided, it MUST be documented that for many receivers, +handling of maps with duplicate keys is unpredictable and it is the users' +responsibility to ensure keys are not duplicate. + +Collection of attributes are equal when they contain the same attributes, +irrespective of the order in which those elements appear +(unordered collection equality). + +## Attribute Limits Execution of erroneous code can result in unintended attributes. 
If there are no limits placed on attributes, they can quickly exhaust available memory, resulting @@ -99,12 +160,21 @@ If an SDK provides a way to: - if it is a string, if it exceeds that limit (counting any character in it as 1), SDKs MUST truncate that value, so that its length is at most equal to the limit, - - if it is an array of strings, then apply the above rule to each of the - values separately, + - if it is a byte array, if it exceeds that limit (counting each byte as 1), + SDKs MUST truncate that value, so that its length is at most equal to the limit, + - if it is an array of string, then apply the limit to + each value within the array separately, + - if it is an array of [AnyValue](#anyvalue), then apply the limit to + each value within the array separately, + - if it is a [map](#mapstring-anyvalue), then apply the + limit to each value within the map separately, - otherwise a value MUST NOT be truncated; -- set a limit of unique attribute keys such that: - - for each unique attribute key, addition of which would result in exceeding - the limit, SDK MUST discard that key/value pair. +- set an attribute count limit such that: + - if an attribute addition into an attribute collection would result + in exceeding the limit (counting each attribute in the collection as 1), + SDK MUST discard that attribute, so that the total number of attributes in + an attribute collection is at most equal to the limit; + - otherwise an attribute MUST NOT be discarded. There MAY be a log emitted to indicate to the user that an attribute was truncated or discarded. To prevent excessive logging, the log MUST NOT be @@ -121,12 +191,12 @@ use the model-specific limit, if it isn't set, then the SDK MUST attempt to use the general limit. If neither are defined, then the SDK MUST try to use the model-specific limit default value, followed by the global limit default value. 
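The truncation and count rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not SDK code: `truncate_value` and `set_attribute` are hypothetical names, `None` stands in for the `Infinity` default of `AttributeValueLengthLimit`, and overwriting an existing key is assumed not to count as an addition.

```python
def truncate_value(value, length_limit):
    """Apply the value length limit to one attribute value (hypothetical helper)."""
    if length_limit is None:  # None models the "Infinity" default
        return value
    # Strings count each character as 1, byte arrays count each byte as 1.
    if isinstance(value, (str, bytes)):
        return value[:length_limit]
    # Arrays: apply the limit to each element separately.
    if isinstance(value, (list, tuple)):
        return [truncate_value(item, length_limit) for item in value]
    # Maps: apply the limit to each value separately; keys are left alone.
    if isinstance(value, dict):
        return {key: truncate_value(item, length_limit) for key, item in value.items()}
    # Any other value (numbers, booleans, empty value) is never truncated.
    return value


def set_attribute(attributes, key, value, count_limit=128, length_limit=None):
    """Set `key` in the dict `attributes`; return False if it was discarded."""
    # Assumption: replacing an existing key does not grow the collection,
    # so only genuinely new keys are checked against the count limit.
    if key not in attributes and len(attributes) >= count_limit:
        return False
    attributes[key] = truncate_value(value, length_limit)
    return True
```

For example, with `count_limit=2` and `length_limit=5`, setting `"a"` to `"hello world"` stores `"hello"`, and a third distinct key is discarded while an overwrite of `"a"` still succeeds.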
-#### Configurable Parameters +### Configurable Parameters * `AttributeCountLimit` (Default=128) - Maximum allowed attribute count per record; -* `AttributeValueLengthLimit` (Default=Infinity) - Maximum allowed attribute value length; +* `AttributeValueLengthLimit` (Default=Infinity) - Maximum allowed attribute value length (applies to string values and byte arrays); -#### Exempt Entities +### Exempt Entities Resource attributes SHOULD be exempt from the limits described above as resources are not susceptible to the scenarios (auto-instrumentation) that result in @@ -139,40 +209,3 @@ attribute limits for Resources. Attributes, which belong to Metrics, are exempt from the limits described above at this time, as discussed in [Metrics Attribute Limits](../metrics/sdk.md#attribute-limits). - -## Attribute Collections - -[Resources](../resource/sdk.md), -[Instrumentation Scopes](instrumentation-scope.md), -[Metric points](../metrics/data-model.md#metric-points), -[Spans](../trace/api.md#set-attributes), Span -[Events](../trace/api.md#add-events), Span -[Links](../trace/api.md#link) and -[Log Records](../logs/data-model.md) may contain a collection of attributes. The -keys in each such collection are unique, i.e. there MUST NOT exist more than one -key-value pair with the same key. The enforcement of uniqueness may be performed -in a variety of ways as it best fits the limitations of the particular -implementation. - -Normally for the telemetry generated using OpenTelemetry SDKs the attribute -key-value pairs are set via an API that either accepts a single key-value pair -or a collection of key-value pairs. Setting an attribute with the same key as an -existing attribute SHOULD overwrite the existing attribute's value. See for -example Span's [SetAttribute](../trace/api.md#set-attributes) API. 
- -A typical implementation of [SetAttribute](../trace/api.md#set-attributes) API -will enforce the uniqueness by overwriting any existing attribute values pending -to be exported, so that when the Span is eventually exported the exporters see -only unique attributes. The OTLP format in particular requires that exported -Resources, Spans, Metric data points and Log Records contain only unique -attributes. - -Some other implementations may use a streaming approach where every -[SetAttribute](../trace/api.md#set-attributes) API call immediately results in -that individual attribute value being exported using a streaming wire protocol. -In such cases the enforcement of uniqueness will likely be the responsibility of -the recipient of this data. - -Collection of attributes are equal when they contain the same attributes, -irrespective of the order in which those elements appear -(unordered collection equality). diff --git a/specification/entities/data-model.md b/specification/entities/data-model.md index 22c277edd82..4c7c088657c 100644 --- a/specification/entities/data-model.md +++ b/specification/entities/data-model.md @@ -48,8 +48,8 @@ physical format and encoding of how entity data is recorded). | Field | Type | Description | |--------------|----------------------------------------|-----------------| | Type | string | Defines the type of the entity. MUST not change during the lifetime of the entity. For example: "service" or "host". This field is required and MUST not be empty for valid entities. | -| Id | map | Attributes that identify the entity.

MUST not change during the lifetime of the entity. The Id must contain at least one attribute.

Follows OpenTelemetry [Standard attribute definition](../common/README.md#standard-attribute). SHOULD follow OpenTelemetry [semantic conventions](https://github.com/open-telemetry/semantic-conventions) for attributes. | -| Description | map | Descriptive (non-identifying) attributes of the entity.

MAY change over the lifetime of the entity. MAY be empty. These attributes are not part of entity's identity.

Follows [any](../logs/data-model.md#type-any) value definition in the OpenTelemetry spec. Arbitrary deep nesting of values for arrays and maps is allowed.

SHOULD follow OpenTelemetry [semantic conventions](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/README.md) for attributes. | +| Id | map | Attributes that identify the entity.

MUST not change during the lifetime of the entity. The Id must contain at least one attribute.

Follows OpenTelemetry [attribute definition](../common/README.md#attribute). SHOULD follow OpenTelemetry [semantic conventions](https://github.com/open-telemetry/semantic-conventions) for attributes. | +| Description | map | Descriptive (non-identifying) attributes of the entity.

MAY change over the lifetime of the entity. MAY be empty. These attributes are not part of entity's identity.

Follows OpenTelemetry [attribute definition](../common/README.md#attribute). SHOULD follow OpenTelemetry [semantic conventions](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/README.md) for attributes. | ## Minimally Sufficient Identity diff --git a/specification/logs/api.md b/specification/logs/api.md index 977770be6fb..a0f11ce49fc 100644 --- a/specification/logs/api.md +++ b/specification/logs/api.md @@ -124,14 +124,6 @@ The API MUST accept the following parameters: - [Attributes](./data-model.md#field-attributes) (optional) - [Event Name](./data-model.md#field-eventname) (optional) -**Status**: [Development](../document-status.md) - -The API SHOULD provide functionality for users to convert -[Standard Attributes](../common/README.md#standard-attribute) -so they can be used, or directly accept them, in the log signal. -This allows the reuse of [Standard Attributes](../common/README.md#standard-attribute) -across signals. - ### Enabled To help users avoid performing computationally expensive operations when diff --git a/specification/logs/data-model.md b/specification/logs/data-model.md index 807e5899aa1..260d77187ea 100644 --- a/specification/logs/data-model.md +++ b/specification/logs/data-model.md @@ -15,10 +15,6 @@ weight: 2 - [Design Notes](#design-notes) * [Requirements](#requirements) * [Events](#events) - * [Definitions Used in this Document](#definitions-used-in-this-document) - + [Type `any`](#type-any) - + [Type `map`](#type-mapstring-any) - * [Field Kinds](#field-kinds) - [Log and Event Record Definition](#log-and-event-record-definition) * [Field: `Timestamp`](#field-timestamp) * [Field: `ObservedTimestamp`](#field-observedtimestamp) @@ -112,85 +108,6 @@ conventions defined for logs SHOULD be formatted as Events. Requirements and det Events are intended to be used by OpenTelemetry instrumentation. It is not a requirement that all LogRecords are formatted as Events. 
-### Definitions Used in this Document - -In this document we refer to types `any` and `map<string, any>`, defined as -follows. - -#### Type `any` - -Value of type `any` can be one of the following: - -- A scalar value: string, boolean, signed 64 bit integer, or double precision floating point (IEEE 754-1985) - -- A byte array, - -- An array (a list) of `any` values, - -- A `map<string, any>`, - -- [since 1.31.0] An empty value (e.g. `null`). - -#### Type `map<string, any>` - -Value of type `map<string, any>` is a map of string keys to `any` values. The -keys in the map are unique (duplicate keys are not allowed). - -Arbitrary deep nesting of values for arrays and maps is allowed (essentially -allows to represent an equivalent of a JSON object). - -The representation of the map is language-dependent. - -The implementation MUST by default ensure that the exported maps contain only unique keys. - -The implementation MAY have an option to allow exporting maps with duplicate keys -(e.g. for better performance). -If such option is provided, it MUST be documented that for many receivers, -handling of maps with duplicate keys is unpredictable and it is the users' -responsibility to ensure keys are not duplicate. - -### Field Kinds - -This Data Model defines a logical model for a log record (irrespective of the -physical format and encoding of the record). Each record contains 2 kinds of -fields: - -- Named top-level fields of specific type and meaning. - -- Fields stored as `map<string, any>`, which can contain arbitrary values of - different types. The keys and values for well-known fields follow semantic - conventions for key names and possible values that allow all parties that work - with the field to have the same interpretation of the data. See references to - semantic conventions for `Resource` and `Attributes` fields and examples in - [Appendix A](./data-model-appendix.md#appendix-a-example-mappings).
- -The reasons for having these 2 kinds of fields are: - -- Ability to efficiently represent named top-level fields, which are almost - always present (e.g. when using encodings like Protocol Buffers where fields - are enumerated but not named on the wire). - -- Ability to enforce types of named fields, which is very useful for compiled - languages with type checks. - -- Flexibility to represent less frequent data as `map<string, any>`. This - includes well-known data that has standardized semantics as well as arbitrary - custom data that the application may want to include in the logs. - -When designing this data model we followed the following reasoning to make a -decision about when to use a top-level named field: - -- The field needs to be either mandatory for all records or be frequently - present in well-known log and event formats (such as `Timestamp`) or is - expected to be often present in log records in upcoming logging systems (such - as `TraceId`). - -- The field’s semantics must be the same for all known log and event formats and - can be mapped directly and unambiguously to this data model. - -Both of the above conditions were required to give the field a place in the -top-level structure of the record. - ## Log and Event Record Definition [Appendix A](./data-model-appendix.md#appendix-a-example-mappings) contains many examples that show how @@ -430,14 +347,15 @@ when it is used to represent an unspecified severity. ### Field: `Body` -Type: [`any`](#type-any). +Type: [AnyValue](../common/README.md#anyvalue). Description: A value containing the body of the log record. Can be for example a human-readable string message (including multi-line) describing the event in a free form or it can be a structured data composed of arrays and maps of other -values. Body MUST support [`any` type](#type-any) to preserve the semantics of -structured logs emitted by the applications. Can vary for each occurrence of the -event coming from the same source. This field is optional.
+values. Body MUST support [AnyValue](../common/README.md#anyvalue) +to preserve the semantics of structured logs emitted by the applications. +Can vary for each occurrence of the event coming from the same source. +This field is optional. ### Field: `Resource` @@ -464,15 +382,12 @@ they all have the same value of `InstrumentationScope`. This field is optional. ### Field: `Attributes` -Type: [`map`](#type-mapstring-any). +Type: [Attribute Collection](../common/README.md#attribute-collections). Description: Additional information about the specific event occurrence. Unlike the `Resource` field, which is fixed for a particular source, `Attributes` can vary for each occurrence of the event coming from the same source. Can contain information about the request context (other than [Trace Context Fields](#trace-context-fields)). -The log attribute model MUST support [`any` type](#type-any), -a superset of [standard Attribute](../common/README.md#attribute), -to preserve the semantics of structured attributes emitted by the applications. This field is optional. #### Errors and Exceptions diff --git a/specification/resource/data-model.md b/specification/resource/data-model.md index 60b924ddebb..6a9d579dc07 100644 --- a/specification/resource/data-model.md +++ b/specification/resource/data-model.md @@ -24,8 +24,8 @@ running in a container on Kubernetes, which is associated to a Pod running on a Node that is a VM but also is in a namespace and possibly is part of a Deployment. Resource could have attributes to denote information about the Container, the Pod, the Node, the VM or the Deployment. All of these help -identify what produced the telemetry. Note that there are certain "standard -attributes" that have prescribed meanings. +identify what produced the telemetry. Note that there are certain attributes +that have prescribed meanings. A resource is composed of 0 or more [`Entities`](../entities/README.md) and 0 or more attributes not associated with any entity. 
@@ -35,7 +35,7 @@ The data model below defines a logical model for an Resource (irrespective of th | Field | Type | Description | |------------|----------|-----------------| | Entities | set\ | Defines the set of Entities associated with this resource.

[Entity is defined here](../entities/data-model.md) | -| Attributes | map\ | Additional Attributes that identify the resource.

MUST not change during the lifetime of the resource.

Follows OpenTelemetry [Standard attribute definition](../common/README.md#standard-attribute). | +| Attributes | map\ | Additional Attributes that identify the resource.

MUST not change during the lifetime of the resource.

Follows OpenTelemetry [attribute definition](../common/README.md#attribute). | ## Identity diff --git a/specification/resource/sdk.md b/specification/resource/sdk.md index b7cb8cfa976..8a9a08b4cb0 100644 --- a/specification/resource/sdk.md +++ b/specification/resource/sdk.md @@ -13,7 +13,8 @@ For example, a process producing telemetry that is running in a container on Kubernetes has a Pod name, it is in a namespace and possibly is part of a Deployment which also has a name. All three of these attributes can be included in the `Resource`. Note that there are certain -["standard attributes"](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/README.md) that have prescribed meanings. +[attributes](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/README.md) +that have prescribed meanings. The primary purpose of resources as a first-class concept in the SDK is decoupling of discovery of resource information from exporters. This allows for