diff --git a/.chloggen/service-criticality-attribute.yaml b/.chloggen/service-criticality-attribute.yaml
new file mode 100644
index 0000000000..ab8455bc57
--- /dev/null
+++ b/.chloggen/service-criticality-attribute.yaml
@@ -0,0 +1,24 @@
+# Use this changelog template to create an entry for release notes.
+#
+# If your change doesn't affect end users you should instead start
+# your pull request title with [chore] or use the "Skip Changelog" label.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: "enhancement"
+
+# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
+component: "service"
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: "Add `service.criticality` attribute to classify services based on operational importance"
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+# The values here must be integers.
+issues: [2986]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: |
+  This attribute enables observability platforms to implement criticality-aware tracing, monitoring,
+  and sampling strategies. Supports four levels: critical, high, medium, and low.
diff --git a/areas.yaml b/areas.yaml
index 562fc874fa..680fc01f3a 100644
--- a/areas.yaml
+++ b/areas.yaml
@@ -112,7 +112,7 @@ areas:

   - name: "Semantic Conventions: Resources and Entities"
     owner:
-      - name: "specs-semconv-maintainers" # TODO: Missing team user for entities?
+      - name: "specs-semconv-maintainers" # TODO: Missing team user for entities?
         github: specs-semconv-maintainers
     project: "https://github.com/open-telemetry/community/blob/main/projects/resources-and-entities.md"
     board: "https://github.com/orgs/open-telemetry/projects/85"
@@ -165,7 +165,7 @@ areas:

   - name: "Semantic Conventions: FaaS"
     owner:
-      - name: "specs-semconv-maintainers" # TODO: Missing team user for faas?
+      - name: "specs-semconv-maintainers" # TODO: Missing team user for faas?
         github: specs-semconv-maintainers
     project: "https://github.com/open-telemetry/community/blob/main/projects/completed-projects/faas.md"
     board: "N/A"
diff --git a/docs/registry/attributes/service.md b/docs/registry/attributes/service.md
index 9b30736000..86bdc23aca 100644
--- a/docs/registry/attributes/service.md
+++ b/docs/registry/attributes/service.md
@@ -10,13 +10,13 @@ A service instance.
 **Attributes:**

 | Key | Stability | Value Type | Description | Example Values |
-| --- | --- | --- | --- | --- |
-| `service.instance.id` | ![Development](https://img.shields.io/badge/-development-blue) | string | The string ID of the service instance. [1] | `627cc493-f310-47de-96bd-71410b7dec09` |
-| `service.name` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | string | Logical name of the service. [2] | `shoppingcart` |
-| `service.namespace` | ![Development](https://img.shields.io/badge/-development-blue) | string | A namespace for `service.name`. [3] | `Shop` |
+|---|---|---|---|---|
+| `service.criticality` | ![Development](https://img.shields.io/badge/-development-blue) | string | The operational criticality of the service. [1] | `critical`; `high`; `medium`; `low` |
 | `service.version` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | string | The version string of the service API or implementation. The format is not defined by these conventions. | `2.0.0`; `a01dbef8a` |

-**[1] `service.instance.id`:** MUST be unique for each instance of the same `service.namespace,service.name` pair (in other words
+**[1] `service.criticality`:** This attribute enables classification of services based on their operational importance, allowing observability platforms to implement criticality-aware tracing, monitoring, and sampling strategies. By standardizing service criticality, organizations can implement adaptive sampling rates (e.g., 100% for critical, 10% for low-priority services), optimize telemetry costs by reducing data from non-critical services, improve incident response by surfacing critical service traces first, and enable better capacity planning and resource allocation.
+
+**[2] `service.instance.id`:** MUST be unique for each instance of the same `service.namespace,service.name` pair (in other words
 `service.namespace,service.name,service.instance.id` triplet MUST be globally unique). The ID helps to
 distinguish instances of the same service that exist at the same time (e.g. instances of a horizontally scaled
 service).
@@ -43,6 +43,25 @@ However, Collectors can set the `service.instance.id` if they can unambiguously
 for that telemetry. This is typically the case for scraping receivers, as they know the target address and
 port.

-**[2] `service.name`:** MUST be the same for all instances of horizontally scaled services. If the value was not specified, SDKs MUST fallback to `unknown_service:` concatenated with [`process.executable.name`](process.md), e.g. `unknown_service:bash`. If `process.executable.name` is not available, the value MUST be set to `unknown_service`.
+**[3] `service.name`:** MUST be the same for all instances of horizontally scaled services. If the value was not specified, SDKs MUST fallback to `unknown_service:` concatenated with [`process.executable.name`](process.md), e.g. `unknown_service:bash`. If `process.executable.name` is not available, the value MUST be set to `unknown_service`.
+
+**[4] `service.namespace`:** A string value having a meaning that helps to distinguish a group of services, for example the team name that owns a group of services. `service.name` is expected to be unique within the same namespace. If `service.namespace` is not specified in the Resource then `service.name` is expected to be unique for all services that have no explicit namespace defined (so the empty/unspecified namespace is simply one more valid namespace). Zero-length namespace string is assumed equal to unspecified namespace.
+
+---
+
+`service.criticality` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `critical` | Service is business-critical; downtime directly impacts revenue, user experience, or core functionality. [5] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `high` | Service is important but has degradation tolerance or fallback mechanisms. [6] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `low` | Service is non-essential to core operations; used for background tasks or internal tools. [7] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `medium` | Service provides supplementary functionality; degradation has limited user impact. [8] | ![Development](https://img.shields.io/badge/-development-blue) |
+
+**[5]:** Examples include payment processing, authentication, and primary user-facing APIs.
+
+**[6]:** Examples include shopping cart, search, and recommendation engines.
+
+**[7]:** Examples include batch processors, cleanup jobs, and internal dashboards.

-**[3] `service.namespace`:** A string value having a meaning that helps to distinguish a group of services, for example the team name that owns a group of services. `service.name` is expected to be unique within the same namespace. If `service.namespace` is not specified in the Resource then `service.name` is expected to be unique for all services that have no explicit namespace defined (so the empty/unspecified namespace is simply one more valid namespace). Zero-length namespace string is assumed equal to unspecified namespace.
+**[8]:** Examples include analytics, reporting, and non-essential integrations.
diff --git a/docs/registry/entities/service.md b/docs/registry/entities/service.md
index 4e76d4d383..e304a0f66e 100644
--- a/docs/registry/entities/service.md
+++ b/docs/registry/entities/service.md
@@ -20,6 +20,7 @@
 | Identity | [`service.name`](/docs/registry/attributes/service.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Required` | string | Logical name of the service. [1] | `shoppingcart` |
 | Identity | [`service.instance.id`](/docs/registry/attributes/service.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string | The string ID of the service instance. [2] | `627cc493-f310-47de-96bd-71410b7dec09` |
 | Identity | [`service.namespace`](/docs/registry/attributes/service.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string | A namespace for `service.name`. [3] | `Shop` |
+| Description | [`service.criticality`](/docs/registry/attributes/service.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string | The operational criticality of the service. [4] | `critical`; `high`; `medium`; `low` |
 | Description | [`service.version`](/docs/registry/attributes/service.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Recommended` | string | The version string of the service API or implementation. The format is not defined by these conventions. | `2.0.0`; `a01dbef8a` |

 **[1] `service.name`:** MUST be the same for all instances of horizontally scaled services. If the value was not specified, SDKs MUST fallback to `unknown_service:` concatenated with [`process.executable.name`](process.md), e.g. `unknown_service:bash`. If `process.executable.name` is not available, the value MUST be set to `unknown_service`.
@@ -53,4 +54,6 @@ port.

 **[3] `service.namespace`:** A string value having a meaning that helps to distinguish a group of services, for example the team name that owns a group of services. `service.name` is expected to be unique within the same namespace. If `service.namespace` is not specified in the Resource then `service.name` is expected to be unique for all services that have no explicit namespace defined (so the empty/unspecified namespace is simply one more valid namespace). Zero-length namespace string is assumed equal to unspecified namespace.

+**[4] `service.criticality`:** This attribute enables classification of services based on their operational importance, allowing observability platforms to implement criticality-aware tracing, monitoring, and sampling strategies. By standardizing service criticality, organizations can implement adaptive sampling rates (e.g., 100% for critical, 10% for low-priority services), optimize telemetry costs by reducing data from non-critical services, improve incident response by surfacing critical service traces first, and enable better capacity planning and resource allocation.
+
diff --git a/docs/resource/README.md b/docs/resource/README.md
index 3579dc4a95..0e2e94e573 100644
--- a/docs/resource/README.md
+++ b/docs/resource/README.md
@@ -87,6 +87,7 @@ as specified in the [Resource SDK specification](https://github.com/open-telemet
 | Identity | [`service.name`](/docs/registry/attributes/service.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Required` | string | Logical name of the service. [1] | `shoppingcart` |
 | Identity | [`service.instance.id`](/docs/registry/attributes/service.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string | The string ID of the service instance. [2] | `627cc493-f310-47de-96bd-71410b7dec09` |
 | Identity | [`service.namespace`](/docs/registry/attributes/service.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string | A namespace for `service.name`. [3] | `Shop` |
+| Description | [`service.criticality`](/docs/registry/attributes/service.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string | The operational criticality of the service. [4] | `critical`; `high`; `medium`; `low` |
 | Description | [`service.version`](/docs/registry/attributes/service.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Recommended` | string | The version string of the service API or implementation. The format is not defined by these conventions. | `2.0.0`; `a01dbef8a` |

 **[1] `service.name`:** MUST be the same for all instances of horizontally scaled services. If the value was not specified, SDKs MUST fallback to `unknown_service:` concatenated with [`process.executable.name`](process.md), e.g. `unknown_service:bash`. If `process.executable.name` is not available, the value MUST be set to `unknown_service`.
@@ -119,6 +120,8 @@ for that telemetry. This is typically the case for scraping receivers, as they k
 port.

 **[3] `service.namespace`:** A string value having a meaning that helps to distinguish a group of services, for example the team name that owns a group of services. `service.name` is expected to be unique within the same namespace. If `service.namespace` is not specified in the Resource then `service.name` is expected to be unique for all services that have no explicit namespace defined (so the empty/unspecified namespace is simply one more valid namespace). Zero-length namespace string is assumed equal to unspecified namespace.
+
+**[4] `service.criticality`:** This attribute enables classification of services based on their operational importance, allowing observability platforms to implement criticality-aware tracing, monitoring, and sampling strategies. By standardizing service criticality, organizations can implement adaptive sampling rates (e.g., 100% for critical, 10% for low-priority services), optimize telemetry costs by reducing data from non-critical services, improve incident response by surfacing critical service traces first, and enable better capacity planning and resource allocation.
diff --git a/internal/tools/scripts/schema-diff/yaml/weaver.yaml b/internal/tools/scripts/schema-diff/yaml/weaver.yaml
index 702545a9e1..4b45921cff 100644
--- a/internal/tools/scripts/schema-diff/yaml/weaver.yaml
+++ b/internal/tools/scripts/schema-diff/yaml/weaver.yaml
@@ -1,5 +1,5 @@
 params:
-  next_version: "next_version_placeholder" # https://github.com/open-telemetry/weaver/issues/775
+  next_version: "next_version_placeholder" # https://github.com/open-telemetry/weaver/issues/775
 templates:
   - pattern: schema-diff.j2
     filter: >
diff --git a/model/service/entities.yaml b/model/service/entities.yaml
index 319eb39eae..31ff65d094 100644
--- a/model/service/entities.yaml
+++ b/model/service/entities.yaml
@@ -15,3 +15,6 @@ groups:
         role: identifying
       - ref: service.instance.id
         role: identifying
+      - ref: service.criticality
+        requirement_level: recommended
+        role: descriptive
diff --git a/model/service/registry.yaml b/model/service/registry.yaml
index 9715378b19..9c649f8d16 100644
--- a/model/service/registry.yaml
+++ b/model/service/registry.yaml
@@ -69,3 +69,45 @@ groups:
           for that telemetry. This is typically the case for scraping receivers, as they know the target address and
           port.
         examples: ["627cc493-f310-47de-96bd-71410b7dec09"]
+      - id: service.criticality
+        type:
+          members:
+            - id: critical
+              value: 'critical'
+              brief: >
+                Service is business-critical; downtime directly impacts revenue, user experience, or core functionality.
+              note: >
+                Examples include payment processing, authentication, and primary user-facing APIs.
+              stability: development
+            - id: high
+              value: 'high'
+              brief: >
+                Service is important but has degradation tolerance or fallback mechanisms.
+              note: >
+                Examples include shopping cart, search, and recommendation engines.
+              stability: development
+            - id: medium
+              value: 'medium'
+              brief: >
+                Service provides supplementary functionality; degradation has limited user impact.
+              note: >
+                Examples include analytics, reporting, and non-essential integrations.
+              stability: development
+            - id: low
+              value: 'low'
+              brief: >
+                Service is non-essential to core operations; used for background tasks or internal tools.
+              note: >
+                Examples include batch processors, cleanup jobs, and internal dashboards.
+              stability: development
+        stability: development
+        brief: >
+          The operational criticality of the service.
+        note: >
+          This attribute enables classification of services based on their operational importance,
+          allowing observability platforms to implement criticality-aware tracing, monitoring,
+          and sampling strategies. By standardizing service criticality, organizations can implement
+          adaptive sampling rates (e.g., 100% for critical, 10% for low-priority services), optimize
+          telemetry costs by reducing data from non-critical services, improve incident response by
+          surfacing critical service traces first, and enable better capacity planning and resource allocation.
+        examples: ["critical", "high", "medium", "low"]
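
Reviewer note: the attribute note mentions adaptive sampling rates (100% for critical, 10% for low-priority services). The sketch below shows one way a consumer might act on `service.criticality`. It is not part of this change and is not an OpenTelemetry SDK API: the names `SAMPLING_RATIOS`, `DEFAULT_RATIO`, `sampling_ratio`, and `should_sample` are hypothetical, and the ratios for `high` and `medium` plus the fallback for custom or missing values are assumptions made for illustration only.

```python
# Hypothetical mapping from service.criticality to a sampling ratio.
# "critical" and "low" follow the example in the attribute note;
# "high" and "medium" are illustrative assumptions.
SAMPLING_RATIOS = {
    "critical": 1.0,
    "high": 0.5,
    "medium": 0.25,
    "low": 0.1,
}

# Assumed fallback for custom or missing values; the convention does not define one.
DEFAULT_RATIO = 0.1


def sampling_ratio(resource_attributes: dict) -> float:
    """Return the sampling ratio for a mapping of resource attributes."""
    criticality = resource_attributes.get("service.criticality")
    return SAMPLING_RATIOS.get(criticality, DEFAULT_RATIO)


def should_sample(trace_id: int, ratio: float) -> bool:
    """Deterministic ratio-based decision on the low 64 bits of the trace ID."""
    bound = round(ratio * (1 << 64))
    return (trace_id & ((1 << 64) - 1)) < bound
```

Because the decision depends only on the trace ID and the ratio, every component that sees the same trace and the same `service.criticality` value reaches the same verdict. Since the well-known values are open-ended (a custom value MAY be used), any consumer of this kind needs an explicit fallback such as the assumed `DEFAULT_RATIO`.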