From fa3bfcef9f0f4cbaee930a366cbc391ee4f6efb1 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 20 Jan 2026 11:24:20 +0100 Subject: [PATCH 01/46] create skeleton of OTel Signed-off-by: Andrew Jandacek --- ...vided-observability-signals-and-attributes.md | 2 ++ .../configuring-otel-deployment-attributes.md | 2 ++ .../configuring-otel-for-observability.md | 2 ++ .../configuring-otel-service-attributes.md | 1 + .../configuring-otel-zos-attributes.md | 1 + .../enabling-observability-in-zowe.yaml.md | 2 ++ .../observability/observability-outline.md | 16 ++++++++++++++++ .../observability/overview-of-observability.md | 0 .../sample-output-from-apiml-otel.md | 2 ++ .../observability/using-your-otel-metrics.md | 0 10 files changed, 28 insertions(+) create mode 100644 docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md create mode 100644 docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md create mode 100644 docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md create mode 100644 docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md create mode 100644 docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md create mode 100644 docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md create mode 100644 docs/user-guide/api-mediation/observability/observability-outline.md create mode 100644 docs/user-guide/api-mediation/observability/overview-of-observability.md create mode 100644 docs/user-guide/api-mediation/observability/sample-output-from-apiml-otel.md create mode 100644 docs/user-guide/api-mediation/observability/using-your-otel-metrics.md diff --git a/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md new file mode 100644 index 
0000000000..946daf2492 --- /dev/null +++ b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md @@ -0,0 +1,2 @@ +# API ML Provided Observability Signals and Attributes + diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md new file mode 100644 index 0000000000..0bf21ffe00 --- /dev/null +++ b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md @@ -0,0 +1,2 @@ +# Configure OpenTelemetry Deployment Attributes + diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md b/docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md new file mode 100644 index 0000000000..d9834788b2 --- /dev/null +++ b/docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md @@ -0,0 +1,2 @@ +# Configure OpenTelemetry for API ML Observability + diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md new file mode 100644 index 0000000000..5bafce791c --- /dev/null +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -0,0 +1 @@ +# Configure OpenTelemetry Service Attributes diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md new file mode 100644 index 0000000000..b0580e1c8a --- /dev/null +++ b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -0,0 +1 @@ +# Configure OpenTelemetry z/OS Attributes diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md new file mode 
100644 index 0000000000..dc8e2eef46 --- /dev/null +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md @@ -0,0 +1,2 @@ +# Enable API ML Observability in zowe.yaml + diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md new file mode 100644 index 0000000000..33093b1d0a --- /dev/null +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -0,0 +1,16 @@ +# Outline of API ML Observability Topics + +The following files will be presented under Advanced server-side configuration under the **Install** tab: + +* Overview of Observability + * Configuring OpenTelemetry for Observability + * Configuring OpenTelemetry service attributes + * Configuring OpenTelemetry deployment attributes + * Configuring OpenTelemetry z/OS attributes + * Enabling observability in zowe.yaml + +The following files will be presented under Using Zowe API Mediation Layer under the **Use** tab: + +* Using your OpenTelemetry metrics + * Sample Output from API ML OpenTelemetry + * API ML Provided Observability Signals and Attributes \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md new file mode 100644 index 0000000000..e69de29bb2 diff --git a/docs/user-guide/api-mediation/observability/sample-output-from-apiml-otel.md b/docs/user-guide/api-mediation/observability/sample-output-from-apiml-otel.md new file mode 100644 index 0000000000..23372cdb9b --- /dev/null +++ b/docs/user-guide/api-mediation/observability/sample-output-from-apiml-otel.md @@ -0,0 +1,2 @@ +# Sample Output from API ML OpenTelemetry + diff --git a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md new file mode 100644 index 0000000000..e69de29bb2 From 
8b871cb1fbf4183ae9e3bf9936db8c9288b8ad8d Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 20 Jan 2026 11:40:53 +0100 Subject: [PATCH 02/46] add titles Signed-off-by: Andrew Jandacek --- .../api-mediation/observability/observability-outline.md | 2 +- .../api-mediation/observability/overview-of-observability.md | 2 ++ .../api-mediation/observability/using-your-otel-metrics.md | 2 ++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md index 33093b1d0a..21de6fc15c 100644 --- a/docs/user-guide/api-mediation/observability/observability-outline.md +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -11,6 +11,6 @@ The following files will be presented under Advanced server-side configuration u The following files will be presented under Using Zowe API Mediation Layer under the **Use** tab: -* Using your OpenTelemetry metrics +* Using your API ML OpenTelemetry metrics * Sample Output from API ML OpenTelemetry * API ML Provided Observability Signals and Attributes \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index e69de29bb2..07421d0433 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -0,0 +1,2 @@ +# Overview of Observability + diff --git a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md index e69de29bb2..afe340349d 100644 --- a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md +++ b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md @@ -0,0 +1,2 @@ +# Using Your API ML OpenTelemetry Metrics + From 
11e7bbdd727569046083b2d4f2a821750d6bceda Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 20 Jan 2026 16:53:53 +0100 Subject: [PATCH 03/46] add initial draft of intro Signed-off-by: Andrew Jandacek --- .../api-mediation/observability/overview-of-observability.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index 07421d0433..2e7c2a7d5b 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -1,2 +1,6 @@ # Overview of Observability + +Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data, including metrics, logs, and traces, that describe runtime behavior, request processing, and service interactions. This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. + +Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file to control which data is produced and where it is exported. 
\ No newline at end of file From 57ed363d015b7c26fe71f56a26082245c61cbfea Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 20 Jan 2026 16:57:26 +0100 Subject: [PATCH 04/46] grammar fix Signed-off-by: Andrew Jandacek --- .../api-mediation/observability/overview-of-observability.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index 2e7c2a7d5b..21e681ee24 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -3,4 +3,4 @@ Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data, including metrics, logs, and traces, that describe runtime behavior, request processing, and service interactions. This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. -Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file to control which data is produced and where it is exported. \ No newline at end of file +Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file to control which data is produced and where data are exported. 
\ No newline at end of file From bd31f03e0edf8ca8c79731aa019844772f8be157 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 21 Jan 2026 11:43:44 +0100 Subject: [PATCH 05/46] Add content for Resource Attributes and Enabling Observability Signed-off-by: Andrew Jandacek --- .../overview-of-observability.md | 54 ++++++++++++++++++- 1 file changed, 53 insertions(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index 21e681ee24..11d278f63b 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -3,4 +3,56 @@ Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data, including metrics, logs, and traces, that describe runtime behavior, request processing, and service interactions. This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. -Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file to control which data is produced and where data are exported. \ No newline at end of file +:::info +Required role: System administrator +::: + +Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file to control which data is produced and where data is exported. 
+ +By leveraging the OpenTelemetry (OTel) standard, API ML allows system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like Prometheus, Grafana, or Jaeger. + +:::note +Observability features are currently available exclusively for the API ML Modulith deployment. These features are not supported in the legacy microservice-based architecture of API ML. +::: + +## Resource Attributes + +In OpenTelemetry, a **Resource** represents the entity producing telemetry. For Zowe, this is the API ML single-service instance. Every signal (metric/trace) produced carries a set of attributes that identify this instance. The following attributes are integrated based on OpenTelemetry semantic conventions and z/OS-specific requirements: + +| Attribute | Description | Configuration Source | +| :--- | :--- | :--- | +| **`service.name`** | Logical name of the service. Identical for all instances in an HA deployment. | `zowe.yaml` | +| **`service.instance.id`** | Unique identifier for the instance (UUID recommended). | Auto-generated / `zowe.yaml` | +| **`service.namespace`** | Groups services (e.g., by LPAR or Team). | `zowe.yaml` | +| **`service.version`** | The version of the APIML component. | System metadata | +| **`deployment.environment`** | Environment type (e.g., production, test). | `zowe.yaml` | +| **`zos.smf.id`** | The SMF Identifier of the z/OS system. | System discovery | +| **`zos.sysplex.name`** | Name of the SYSPLEX. | System discovery | +| **`mainframe.lpar.name`** | Name of the LPAR hosting the process. | System discovery | +| **`os.type`** | Set to `zos`. | Static | +| **`process.pid`** | Address Space Identifier (ASID) on z/OS. | System discovery | + + +## Enabling Observability + +To enable observability, configure the OpenTelemetry exporter and resource attributes within the zowe.yaml configuration file. 
+ +Add or update the following section in your zowe.yaml: + +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true # Master switch for OTel features + exporter: + otlp: + endpoint: "http://your-otel-collector:4317" # OTLP collector address + protocol: "grpc" + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "mainframe-lpar1" + deployment.environment: "production" + # Custom attributes can be added here +``` \ No newline at end of file From 01ca53b188248800609e82cc4b827710216fd02a Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 21 Jan 2026 14:12:59 +0100 Subject: [PATCH 06/46] add content to enable observability Signed-off-by: Andrew Jandacek --- .../enabling-observability-in-zowe.yaml.md | 48 +++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md index dc8e2eef46..f1e146879e 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md @@ -1,2 +1,50 @@ # Enable API ML Observability in zowe.yaml +Review how to enable and configure the OpenTelemetry (OTel) integration within the Zowe API Mediation Layer (API ML) single-service deployment. Configure these parameters in `zowe.yaml` to enable API ML to export metrics, traces, and logs to an OpenTelemetry Collector. + +## Configuration Overview + +The observability configuration is located under the API Mediation Layer `component` section of the zowe.yaml, under which there are three observability properties: + +* **enabled** + Activates the OTel SDK. Set to `true` to initialize the OpenTelemetry SDK. + +* **exporter** +Defines where the data is sent. 
+ + * **exporter.otlp.endpoint** + The URL of your OTLP-compatible collector (e.g., z-Iris or Jaeger) + + * **exporter.otlp.protocol** + The protocol is either `grpc` or `http/protobuf`. + **Default:** `grpc` + +* **resource** +Defines the identity of the producer (Attributes). + + * **resource.attributes** + A collection of key-value pairs used to identify the telemetry source. + +To enable observability, update your `zowe.yaml` file with the following structure: + +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + exporter: + otlp: + endpoint: "http://otel-collector.your.domain:4317" + protocol: "grpc" + timeout: 10000 + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "finance-production" + deployment.environment.name: "production" +``` + +## How the Export Works + +When enabled: `true` is set, the API ML single-service starts a background telemetry engine. It gathers internal metrics (like JVM heap or request latency) and bundles them with the Resource Attributes defined in your config. These bundles are then pushed via the OTLP Exporter to your specified endpoint. 
\ No newline at end of file From c3ac1fb822a7ea7c3d701666b3ec14a4a34666e3 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 10:46:28 +0100 Subject: [PATCH 07/46] add considerations for config service attributes Signed-off-by: Andrew Jandacek --- .../configuring-otel-service-attributes.md | 10 ++++++++++ .../observability/overview-of-observability.md | 4 ++-- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index 5bafce791c..b3c74cac55 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -1 +1,11 @@ # Configure OpenTelemetry Service Attributes + +Services are identified via the `service.name` and `service.namespace` properties. These properties create a unique identity for API ML instances. + + + +**Naming Conventions:** Provide guidance on naming services (e.g., zowe-apiml) to ensure consistency across HA (High Availability) deployments. + +**Instance Tracking:** Describe the use of `service.instance.id` and how to ensure uniqueness across instances. + + \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index 11d278f63b..4391b286fd 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -1,7 +1,7 @@ # Overview of Observability -Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). 
This integration enables API ML to produce observability data, including metrics, logs, and traces, that describe runtime behavior, request processing, and service interactions. This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. +Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data, including [metrics](https://opentelemetry.io/docs/concepts/signals/metrics/), [logs](https://opentelemetry.io/docs/concepts/signals/logs/), and [traces](https://opentelemetry.io/docs/concepts/signals/traces/), that describe runtime behavior, request processing, and service interactions. This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. :::info Required role: System administrator @@ -12,7 +12,7 @@ Observability can be enabled and configured using API ML and OpenTelemetry setti By leveraging the OpenTelemetry (OTel) standard, API ML allows system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like Prometheus, Grafana, or Jaeger. :::note -Observability features are currently available exclusively for the API ML Modulith deployment. These features are not supported in the legacy microservice-based architecture of API ML. +Observability features are available exclusively for the API ML single-service deployment. These features are not supported in the legacy microservice-based architecture of API ML. 
::: ## Resource Attributes From 6af333efcf56e45a59c8a9a83c8f10610204e418 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 11:48:42 +0100 Subject: [PATCH 08/46] add req service attribute list Signed-off-by: Andrew Jandacek --- .../configuring-otel-service-attributes.md | 56 ++++++++++++++++++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index b3c74cac55..8e5be48c9f 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -8,4 +8,58 @@ Services are identified via the `service.name` and `service.namespace` propertie **Instance Tracking:** Describe the use of `service.instance.id` and how to ensure uniqueness across instances. - \ No newline at end of file + + +### Required Service Attributes + +The following attributes define the logical identity of the APIML Modulith. These attributes are automatically appended to all telemetry signals (metrics, traces, and logs) produced by the resource. + +**service.name** +Logical name of the service. Must be the same for all instances within the same HA deployment. Expected to be globally unique if `namespace` is not defined. + +**service.instance.id** +Must be unique for each instance of `service.name` and `service.namespace` pair. Automatically generated UUID is generally recommended to ensure uniqueness. + +**service.namespace** +The assigned value should help distinguish a group of services, such as the LPAR, or owner team. `service.name` is expected to be unique within the same `namespace`. + +**service.version** +The exact version of the service artifact, typically a semantic version (e.g., 1.2.3) or a build hash, used to identify the specific software release. 
+ +**deployment.environment.name** +Name of the deployment environment (Example: dev, test, staging, or production). + +**zos.smf.id** +The System Management Facility (SMF) Identifier uniquely identifies a z/OS system within a Sysplex or mainframe environment and is used for system and performance analysis. + +**zos.sysplex.name** +The name of the Sysplex to which the z/OS system belongs. + +**mainframe.lpar.name** +Name of the logical partition that hosts a system with a mainframe operating system. + +**os.type** +The operating system type, which for this context should be set to `zos`. + +**os.version** +The version string of the operating system. On z/OS, this should be the release returned by the command d iplinfo. + +**process.command** +The command used to launch the process (i.e., the command name). On z/OS, this should be set to the name of the job used to start the z/OS system software. + +**process.pid** +Process identifier (PID). On z/OS, this should be set to the Address Space Identifier (ASID). 
+ +#### Configuration Example (`zowe.yaml`) + +```yaml +zowe: + components: + api-mediation-layer: + observability: + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "mainframe-production" + # service.instance.id: "optional-custom-id" + From fea7e6e08d04ceb6547d09c57e5e5a84aee1a041 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 12:04:21 +0100 Subject: [PATCH 09/46] add links to outline and reorder using section per Richard's feedback Signed-off-by: Andrew Jandacek --- .../observability/observability-outline.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md index 21de6fc15c..a61dabe169 100644 --- a/docs/user-guide/api-mediation/observability/observability-outline.md +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -2,15 +2,15 @@ The following files will be presented under Advanced server-side configuration under the **Install** tab: -* Overview of Observability - * Configuring OpenTelemetry for Observability - * Configuring OpenTelemetry service attributes - * Configuring OpenTelemetry deployment attributes - * Configuring OpenTelemetry z/OS attributes - * Enabling observability in zowe.yaml +* [Overview of Observability](overview-of-observability.md) + * [Configuring OpenTelemetry for Observability](configuring-otel-for-observability.md) + * [Configuring OpenTelemetry service attributes](configuring-otel-service-attributes.md) + * [Configuring OpenTelemetry deployment attributes](configuring-otel-deployment-attributes.md) + * [Configuring OpenTelemetry z/OS attributes](configuring-otel-zos-attributes.md) + * [Enabling observability in zowe.yaml](enabling-observability-in-zowe.yaml.md) The following files will be presented under Using Zowe API Mediation Layer under the **Use** tab: -* Using your API ML 
OpenTelemetry metrics - * Sample Output from API ML OpenTelemetry - * API ML Provided Observability Signals and Attributes \ No newline at end of file +* [Using your API ML OpenTelemetry metrics](using-your-otel-metrics.md) + * [API ML Provided Observability Signals and Attributes](apiml-provided-observability-signals-and-attributes.md) + * [Sample Output from API ML OpenTelemetry](sample-output-from-apiml-otel.md) \ No newline at end of file From 391510360b45c30d38e61eba8b3b6d02c389d2db Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 14:39:07 +0100 Subject: [PATCH 10/46] add details Signed-off-by: Andrew Jandacek --- .../enabling-observability-in-zowe.yaml.md | 6 +++- .../overview-of-observability.md | 35 ++++++++++++++++++- 2 files changed, 39 insertions(+), 2 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md index f1e146879e..1cea1f311e 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md @@ -47,4 +47,8 @@ zowe: ## How the Export Works -When enabled: `true` is set, the API ML single-service starts a background telemetry engine. It gathers internal metrics (like JVM heap or request latency) and bundles them with the Resource Attributes defined in your config. These bundles are then pushed via the OTLP Exporter to your specified endpoint. \ No newline at end of file +When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers internal metrics (like JVM heap or request latency) and bundles these metrics with the Resource Attributes defined in your config. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint. 
+ +:::note +If the endpoint is unreachable, API ML logs a warning, but service traffic is not interrupted. It is recommended to use a local OTel collector to minimize network latency. +::: \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index 4391b286fd..29da34f7ea 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -55,4 +55,37 @@ zowe: service.namespace: "mainframe-lpar1" deployment.environment: "production" # Custom attributes can be added here -``` \ No newline at end of file +``` + +## Telemetry Data Produced + +Zowe API ML produces several categories of data out-of-the-box via OpenTelemetry. + +### Out-of-the-box (Standard OTel) + +* **JVM Metrics:** Memory usage (heap/non-heap), Garbage Collection (GC) frequency and duration, thread counts, and class loading. + +* **System Metrics:** CPU usage (System vs. Process) and File Descriptor usage. + +* **HTTP Metrics:** Request latency, throughput, and error rates (4xx/5xx) for all API traffic passing through the Modulith. + + + + + +The following attributes are automatically captured by the APIML Modulith to ensure mainframe-inclusive observability: + +* `zos.smf.id` +Unique identifier for the z/OS system. + +* `zos.sysplex.name` +The SYSPLEX cluster name. + +* `os.version` +The release version (returned by D IPLINFO). + +* `process.command` +The Job name used to start the Zowe instance. + +* `process.pid` +The Address Space Identifier (ASID). 
\ No newline at end of file From 88763bed42b78988ce68a6128c7a4aea60cc5cae Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 16:28:05 +0100 Subject: [PATCH 11/46] initial draft of process and properties for acquiring z/OS Attributes Signed-off-by: Andrew Jandacek --- .../configuring-otel-service-attributes.md | 51 ++++---------- .../configuring-otel-zos-attributes.md | 54 +++++++++++++++ .../enabling-observability-in-zowe.yaml.md | 28 ++++---- .../overview-of-observability.md | 66 +++++++------------ 4 files changed, 105 insertions(+), 94 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index 8e5be48c9f..a465f50cdf 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -12,54 +12,29 @@ Services are identified via the `service.name` and `service.namespace` propertie ### Required Service Attributes -The following attributes define the logical identity of the APIML Modulith. These attributes are automatically appended to all telemetry signals (metrics, traces, and logs) produced by the resource. +The following attributes are required to define the logical identity of the API ML. These attributes are automatically appended to all telemetry signals (metrics, traces, and logs) produced by the resource: -**service.name** -Logical name of the service. Must be the same for all instances within the same HA deployment. Expected to be globally unique if `namespace` is not defined. -**service.instance.id** +* **service.name** (Required) +Logical name of the service. Must be the same for all instances within the same HA deployment. Expected to be globally unique if `namespace` is not defined. 
+ +* **service.instance.id** (Required) Must be unique for each instance of `service.name` and `service.namespace` pair. Automatically generated UUID is generally recommended to ensure uniqueness. -**service.namespace** +* **service.namespace** (Required) The assigned value should help distinguish a group of services, such as the LPAR, or owner team. `service.name` is expected to be unique within the same `namespace`. -**service.version** +* **service.version** (Required) The exact version of the service artifact, typically a semantic version (e.g., 1.2.3) or a build hash, used to identify the specific software release. -**deployment.environment.name** -Name of the deployment environment (Example: dev, test, staging, or production). - -**zos.smf.id** -The System Management Facility (SMF) Identifier uniquely identifies a z/OS system within a Sysplex or mainframe environment and is used for system and performance analysis. - -**zos.sysplex.name** -The name of the Sysplex to which the z/OS system belongs. - -**mainframe.lpar.name** -Name of the logical partition that hosts a system with a mainframe operating system. - -**os.type** -The operating system type, which for this context should be set to `zos`. - -**os.version** -The version string of the operating system. On z/OS, this should be the release returned by the command d iplinfo. - -**process.command** -The command used to launch the process (i.e., the command name). On z/OS, this should be set to the name of the job used to start the z/OS system software. - -**process.pid** -Process identifier (PID). On z/OS, this should be set to the Address Space Identifier (ASID). 
- #### Configuration Example (`zowe.yaml`) ```yaml zowe: - components: - api-mediation-layer: - observability: - resource: - attributes: - service.name: "zowe-apiml" - service.namespace: "mainframe-production" - # service.instance.id: "optional-custom-id" + observability: + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "mainframe-production" + # service.instance.id: "optional-custom-id" diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md index b0580e1c8a..2fd7efae51 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -1 +1,55 @@ # Configure OpenTelemetry z/OS Attributes + +z/OS-specific resource attributes for API ML provide essential mainframe context to your telemetry data, allowing you to correlate metrics and traces with specific system identifiers such as SMF IDs, Sysplex names, and LPARs. By providing z/OS platform context, mainframe performance data can be integrated into distributed observability backends. + +The z/OS attributes are primarily populated through an automated System Discovery process. Upon the initialization of the single-service deployment of API ML, the integrated OpenTelemetry SDK executes platform-specific calls to query z/OS Control Blocks and system variables. This process identifies the current execution environment by retrieving values such as the Address Space Identifier (ASID), which is mapped to `process.pid`, and the system release level via `D IPLINFO` for `os.version`. If these attributes are already defined in the zowe.yaml configuration file, the discovery engine treats the manual entries as overrides, ensuring that user-defined values take precedence over detected system defaults. 
+ +These attributes provide environmental context specific to the IBM z/OS platform. + +## z/OS Attribute Reference + +The following attributes are captured to describe the mainframe environment: + +* **zos.smf.id** +The System Management Facility (SMF) Identifier that uniquely identifies a z/OS system within a SYSPLEX. +Configuration Source: System discovery + +* **zos.sysplex.name** +The name of the SYSPLEX to which the z/OS system belongs. +Configuration Source: System discovery + +* **mainframe.lpar.name** +Name of the logical partition (LPAR) that hosts the z/OS system. +Configuration Source: System discovery + +* **os.type** +The operating system type, set to zos. +Configuration Source: Static + +* **os.version** +The version string of the operating system (e.g., the release returned by D IPLINFO). +Configuration Source: System discovery + +* **process.command** +The command or JOB name used to launch the Zowe process. +Configuration Source: System discovery + +* **process.pid** +The Process Identifier, which on z/OS is set to the Address Space Identifier (ASID). +Configuration Source: System discovery + +## Overriding Discovered Attributes in zowe.yaml + +While the discovery process handles most identifiers automatically, you may occasionally need to provide a manual override (for example, in shared environments where you wish to report a custom logical LPAR name). 
This is performed in the `resource.attributes` section of your zowe.yaml: + +``` +zowe: + observability: + enabled: true + resource: + attributes: + # Overriding discovered z/OS attributes + zos.smf.id: "MVS1" + zos.sysplex.name: "LOCALPLX" + mainframe.lpar.name: "PRODLPAR" +``` diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md index 1cea1f311e..644bc6e273 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md @@ -25,24 +25,22 @@ Defines the identity of the producer (Attributes). * **resource.attributes** A collection of key-value pairs used to identify the telemetry source. -To enable observability, update your `zowe.yaml` file with the following structure: +To enable observability, configure the OpenTelemetry exporter and resource attributes within your `zowe.yaml` file with the following structure: ``` zowe: - components: - api-mediation-layer: - observability: - enabled: true - exporter: - otlp: - endpoint: "http://otel-collector.your.domain:4317" - protocol: "grpc" - timeout: 10000 - resource: - attributes: - service.name: "zowe-apiml" - service.namespace: "finance-production" - deployment.environment.name: "production" + observability: + enabled: true + exporter: + otlp: + endpoint: "http://otel-collector.your.domain:4317" + protocol: "grpc" + timeout: 10000 + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "finance-production" + deployment.environment.name: "production" ``` ## How the Export Works diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index 29da34f7ea..0c621fa5c6 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ 
b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -7,7 +7,7 @@ Observability of functionalities in the Zowe API Mediation Layer (API ML) can be Required role: System administrator ::: -Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file to control which data is produced and where data is exported. +Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file. You can also specify where data is exported. By leveraging the OpenTelemetry (OTel) standard, API ML allows system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like Prometheus, Grafana, or Jaeger. @@ -17,45 +17,29 @@ Observability features are available exclusively for the API ML single-service d ## Resource Attributes -In OpenTelemetry, a **Resource** represents the entity producing telemetry. For Zowe, this is the API ML single-service instance. Every signal (metric/trace) produced carries a set of attributes that identify this instance. The following attributes are integrated based on OpenTelemetry semantic conventions and z/OS-specific requirements: - -| Attribute | Description | Configuration Source | -| :--- | :--- | :--- | -| **`service.name`** | Logical name of the service. Identical for all instances in an HA deployment. | `zowe.yaml` | -| **`service.instance.id`** | Unique identifier for the instance (UUID recommended). | Auto-generated / `zowe.yaml` | -| **`service.namespace`** | Groups services (e.g., by LPAR or Team). | `zowe.yaml` | -| **`service.version`** | The version of the APIML component. | System metadata | -| **`deployment.environment`** | Environment type (e.g., production, test). | `zowe.yaml` | -| **`zos.smf.id`** | The SMF Identifier of the z/OS system. | System discovery | -| **`zos.sysplex.name`** | Name of the SYSPLEX. 
| System discovery | -| **`mainframe.lpar.name`** | Name of the LPAR hosting the process. | System discovery | -| **`os.type`** | Set to `zos`. | Static | -| **`process.pid`** | Address Space Identifier (ASID) on z/OS. | System discovery | - - -## Enabling Observability - -To enable observability, configure the OpenTelemetry exporter and resource attributes within the zowe.yaml configuration file. - -Add or update the following section in your zowe.yaml: - -``` -zowe: - components: - api-mediation-layer: - observability: - enabled: true # Master switch for OTel features - exporter: - otlp: - endpoint: "http://your-otel-collector:4317" # OTLP collector address - protocol: "grpc" - resource: - attributes: - service.name: "zowe-apiml" - service.namespace: "mainframe-lpar1" - deployment.environment: "production" - # Custom attributes can be added here -``` +In OpenTelemetry, a **Resource** represents the entity producing telemetry. For Zowe, this is the API ML single-service instance. Every signal (metric/trace/log) produced carries a set of attributes that identify a specific instance. + +To organize the OpenTelemetry resource attributes for the Zowe API ML are organized into three logical groups: Service, z/OS, and Deployment. + +This categorization follows the [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/resource/) to ensure that the telemetry produced by Zowe is consistent with industry standards and easily consumable by monitoring backends. + +### Attribute Categories + +* **Service Attributes** +These identify the logical entity producing the data. They are used to group telemetry from all instances of the API ML into a single "service" view in your monitoring tools. + +For details about Service Attributes, see [Configuring OpenTelemetry Service Attributes](configuring-otel-service-attributes.md). + +* **z/OS Attributes** +These provide critical mainframe context. 
They identify the specific physical and logical environment (LPAR, Sysplex, and OS version) where the process is running, which is essential for mainframe-specific performance analysis. + +For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attributes](configuring-otel-zos-attributes.md). + +* **Deployment Attributes:** +These describe the lifecycle stage of the service. They allow you to filter telemetry data by environment (e.g., distinguishing production issues from test environment noise). + +For details about Deployment Attributes, see [Configuring OpenTelemetry Deployment Attributes](configuring-otel-deployment-attributes.md). + ## Telemetry Data Produced @@ -73,7 +57,7 @@ Zowe API ML produces several categories of data out-of-the-box via OpenTelemetry -The following attributes are automatically captured by the APIML Modulith to ensure mainframe-inclusive observability: +The following attributes are automatically captured by the API ML single-service deployment to ensure mainframe-inclusive observability: * `zos.smf.id` Unique identifier for the z/OS system. 
From 7b62f21bfb688370613a4203d297c5d1b4fbba63 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 16:40:31 +0100 Subject: [PATCH 12/46] remove unneeded file Signed-off-by: Andrew Jandacek --- .../configuring-otel-for-observability.md | 2 -- .../enabling-observability-in-zowe.yaml.md | 2 +- .../observability/observability-outline.md | 9 ++++----- .../overview-of-observability.md | 19 ------------------- 4 files changed, 5 insertions(+), 27 deletions(-) delete mode 100644 docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md b/docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md deleted file mode 100644 index d9834788b2..0000000000 --- a/docs/user-guide/api-mediation/observability/configuring-otel-for-observability.md +++ /dev/null @@ -1,2 +0,0 @@ -# Configure OpenTelemetry for API ML Observability - diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md index 644bc6e273..fed7b26486 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md @@ -1,4 +1,4 @@ -# Enable API ML Observability in zowe.yaml +# Enabling API ML Observability in zowe.yaml Review how to enable and configure the OpenTelemetry (OTel) integration within the Zowe API Mediation Layer (API ML) single-service deployment. Configure these parameters in `zowe.yaml` to enable API ML to export metrics, traces, and logs to an OpenTelemetry Collector. 
diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md index a61dabe169..46873bff15 100644 --- a/docs/user-guide/api-mediation/observability/observability-outline.md +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -3,11 +3,10 @@ The following files will be presented under Advanced server-side configuration under the **Install** tab: * [Overview of Observability](overview-of-observability.md) - * [Configuring OpenTelemetry for Observability](configuring-otel-for-observability.md) - * [Configuring OpenTelemetry service attributes](configuring-otel-service-attributes.md) - * [Configuring OpenTelemetry deployment attributes](configuring-otel-deployment-attributes.md) - * [Configuring OpenTelemetry z/OS attributes](configuring-otel-zos-attributes.md) - * [Enabling observability in zowe.yaml](enabling-observability-in-zowe.yaml.md) + * [Configuring OpenTelemetry service attributes](configuring-otel-service-attributes.md) + * [Configuring OpenTelemetry deployment attributes](configuring-otel-deployment-attributes.md) + * [Configuring OpenTelemetry z/OS attributes](configuring-otel-zos-attributes.md) + * [Enabling observability in zowe.yaml](enabling-observability-in-zowe.yaml.md) The following files will be presented under Using Zowe API Mediation Layer under the **Use** tab: diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/overview-of-observability.md index 0c621fa5c6..66f8f36c91 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ b/docs/user-guide/api-mediation/observability/overview-of-observability.md @@ -40,7 +40,6 @@ These describe the lifecycle stage of the service. 
They allow you to filter tele For details about Deployment Attributes, see [Configuring OpenTelemetry Deployment Attributes](configuring-otel-deployment-attributes.md). - ## Telemetry Data Produced Zowe API ML produces several categories of data out-of-the-box via OpenTelemetry. @@ -55,21 +54,3 @@ Zowe API ML produces several categories of data out-of-the-box via OpenTelemetry - - -The following attributes are automatically captured by the API ML single-service deployment to ensure mainframe-inclusive observability: - -* `zos.smf.id` -Unique identifier for the z/OS system. - -* `zos.sysplex.name` -The SYSPLEX cluster name. - -* `os.version` -The release version (returned by D IPLINFO). - -* `process.command` -The Job name used to start the Zowe instance. - -* `process.pid` -The Address Space Identifier (ASID). \ No newline at end of file From f3536e2692b89f975aa0b06db8326e7033328b79 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 16:44:03 +0100 Subject: [PATCH 13/46] cange overview title and restructure outline Signed-off-by: Andrew Jandacek --- ....md => configuring-apiml-observability-via-opentelemetry.md} | 2 +- .../api-mediation/observability/observability-outline.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) rename docs/user-guide/api-mediation/observability/{overview-of-observability.md => configuring-apiml-observability-via-opentelemetry.md} (98%) diff --git a/docs/user-guide/api-mediation/observability/overview-of-observability.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md similarity index 98% rename from docs/user-guide/api-mediation/observability/overview-of-observability.md rename to docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 66f8f36c91..6832899e3e 100644 --- a/docs/user-guide/api-mediation/observability/overview-of-observability.md +++ 
b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -1,4 +1,4 @@ -# Overview of Observability +# Configuring API ML Observability via OpenTelemetry Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data, including [metrics](https://opentelemetry.io/docs/concepts/signals/metrics/), [logs](https://opentelemetry.io/docs/concepts/signals/logs/), and [traces](https://opentelemetry.io/docs/concepts/signals/traces/), that describe runtime behavior, request processing, and service interactions. This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md index 46873bff15..858d03bb19 100644 --- a/docs/user-guide/api-mediation/observability/observability-outline.md +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -2,7 +2,7 @@ The following files will be presented under Advanced server-side configuration under the **Install** tab: -* [Overview of Observability](overview-of-observability.md) +* [Configuring API ML Observability via OpenTelemetry](configuring-apiml-observability-via-opentelemetry.md) * [Configuring OpenTelemetry service attributes](configuring-otel-service-attributes.md) * [Configuring OpenTelemetry deployment attributes](configuring-otel-deployment-attributes.md) * [Configuring OpenTelemetry z/OS attributes](configuring-otel-zos-attributes.md) From 262360692b5a32aea4c9f96c14538ca5e8083e0b Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 22 Jan 2026 16:52:03 +0100 Subject: [PATCH 14/46] 
reorder attribute categories Signed-off-by: Andrew Jandacek --- ...figuring-apiml-observability-via-opentelemetry.md | 12 ++++++------ .../observability/configuring-otel-zos-attributes.md | 2 +- .../observability/observability-outline.md | 2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 6832899e3e..0862b6608e 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -19,7 +19,7 @@ Observability features are available exclusively for the API ML single-service d In OpenTelemetry, a **Resource** represents the entity producing telemetry. For Zowe, this is the API ML single-service instance. Every signal (metric/trace/log) produced carries a set of attributes that identify a specific instance. -To organize the OpenTelemetry resource attributes for the Zowe API ML are organized into three logical groups: Service, z/OS, and Deployment. +OpenTelemetry resource attributes for the Zowe API ML are organized into three logical groups: Service, Deployment, and z/OS. This categorization follows the [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/resource/) to ensure that the telemetry produced by Zowe is consistent with industry standards and easily consumable by monitoring backends. @@ -30,16 +30,16 @@ These identify the logical entity producing the data. They are used to group tel For details about Service Attributes, see [Configuring OpenTelemetry Service Attributes](configuring-otel-service-attributes.md). -* **z/OS Attributes** -These provide critical mainframe context. 
They identify the specific physical and logical environment (LPAR, Sysplex, and OS version) where the process is running, which is essential for mainframe-specific performance analysis. - -For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attributes](configuring-otel-zos-attributes.md). - * **Deployment Attributes:** These describe the lifecycle stage of the service. They allow you to filter telemetry data by environment (e.g., distinguishing production issues from test environment noise). For details about Deployment Attributes, see [Configuring OpenTelemetry Deployment Attributes](configuring-otel-deployment-attributes.md). +* **z/OS Attributes** +These provide critical mainframe context. They identify the specific physical and logical environment (LPAR, Sysplex, and OS version) where the process is running, which is essential for mainframe-specific performance analysis. + +For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attributes](configuring-otel-zos-attributes.md). + ## Telemetry Data Produced Zowe API ML produces several categories of data out-of-the-box via OpenTelemetry. diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md index 2fd7efae51..c93a2bff75 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -8,7 +8,7 @@ These attributes provide environmental context specific to the IBM z/OS platform ## z/OS Attribute Reference -The following attributes are captured to describe the mainframe environment: +The following attributes are captured during system discovery to describe the mainframe environment: * **zos.smf.id** The System Management Facility (SMF) Identifier that uniquely identifies a z/OS system within a SYSPLEX. 
diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md index 858d03bb19..04a9407a56 100644 --- a/docs/user-guide/api-mediation/observability/observability-outline.md +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -6,7 +6,7 @@ The following files will be presented under Advanced server-side configuration u * [Configuring OpenTelemetry service attributes](configuring-otel-service-attributes.md) * [Configuring OpenTelemetry deployment attributes](configuring-otel-deployment-attributes.md) * [Configuring OpenTelemetry z/OS attributes](configuring-otel-zos-attributes.md) - * [Enabling observability in zowe.yaml](enabling-observability-in-zowe.yaml.md) + * [Enabling Observability in zowe.yaml](enabling-observability-in-zowe.yaml.md) The following files will be presented under Using Zowe API Mediation Layer under the **Use** tab: From c4c3cd2ebf6d882fd2d2804bbd5b7e3c44b3cb3a Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Fri, 23 Jan 2026 12:16:35 +0100 Subject: [PATCH 15/46] add details about signals Signed-off-by: Andrew Jandacek --- ...ed-observability-signals-and-attributes.md | 14 ++++ ...g-apiml-observability-via-opentelemetry.md | 64 +++++++++++++++++-- 2 files changed, 71 insertions(+), 7 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md index 946daf2492..97b956631b 100644 --- a/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md +++ b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md @@ -1,2 +1,16 @@ # API ML Provided Observability Signals and Attributes + + + + +## Custom Telemetry Template +Use this template when requesting or defining new custom metrics for the 
API ML: + +* **Signal Type**: (Metric / Trace / Log) +* **Name**: `zowe.apiml.[component].[functional_area]` +* **Description**: What does this signal represent? +* **Required Attributes**: + * `route.id`: Identifier of the routed service. + * `client.id`: (Optional) The ID of the consuming application. + * `zos.smf.id`: Automatically inherited from Resource. \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 0862b6608e..01fba5bd2c 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -1,7 +1,6 @@ # Configuring API ML Observability via OpenTelemetry - -Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data, including [metrics](https://opentelemetry.io/docs/concepts/signals/metrics/), [logs](https://opentelemetry.io/docs/concepts/signals/logs/), and [traces](https://opentelemetry.io/docs/concepts/signals/traces/), that describe runtime behavior, request processing, and service interactions. This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. +Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data that describe runtime behavior, request processing, and service interactions. 
This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. :::info Required role: System administrator @@ -42,15 +41,66 @@ For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attribute ## Telemetry Data Produced -Zowe API ML produces several categories of data out-of-the-box via OpenTelemetry. +The API ML produces a range of telemetry data. By default, the OpenTelemetry integration captures performance, health, and interaction data made available through resource attributes configured in your `zowe.yaml`. -### Out-of-the-box (Standard OTel) +## Telemetry Signal Categories + +Observability in the API ML is built on the interaction between Signals and Resource Attributes. A _signal_, defined as a discrete stream of telemetry data, is represented by any one of three types of telemetry data: + +* [Metrics](https://opentelemetry.io/docs/concepts/signals/metrics/) (performance tracking) +* [Traces](https://opentelemetry.io/docs/concepts/signals/traces/) (request journeys) +* [Logs](https://opentelemetry.io/docs/concepts/signals/logs/) (event records) + +Each of these types of telemetry data represents a specific category of observation from a system. Every signal automatically carries the Resource Attributes of the instance that produced it. These attributes act as a common identity, whereby data is categorized into Service (logical identity), z/OS (system and hardware context), and Deployment (environment tier). This categorization approach ensures that all telemetry is "mainframe-aware", allowing administrators to filter, group, and correlate data across the entire Sysplex using standard observability tools. + +### Metrics (Runtime Behavior & Health) +Metrics provide numerical data used to track trends and trigger alerts. 
+ +* **JVM & System Metrics** + * **`process.runtime.jvm.memory.usage`**: Current utilization of heap and non-heap memory. + * **`process.runtime.jvm.gc.duration`**: Time spent in Garbage Collection, critical for identifying "stop-the-world" pauses. + * **`system.cpu.utilization`**: CPU usage percentage for the process and the overall LPAR. + +* **Request Processing Metrics** + * **`apiml.request.count`**: A counter of all incoming requests, categorized by `http.method` and `http.status_code`. + * **`apiml.request.duration`**: A histogram measuring the total time spent within the Modulith for each request. + * **`apiml.active.requests`**: A gauge showing the current number of concurrent requests being processed. + +### Traces (Service Interactions) +Traces record the path of a request as it traverses the API ML. + +* **Gateway Spans**: Measure the entry point latency and the time taken to proxy the request to a backend service. +* **Authentication Spans**: Track the duration of security checks (e.g., SAF, JWT validation, or ZSS calls). +* **Discovery Spans**: Record the time taken to resolve a service ID to a specific physical URL. + +### Logs (System Events) +Logs provide the "why" behind errors or changes in state. + +* **Access Logs**: High-volume logs detailing every request, including the `traceId` for correlation with traces. +* **Security Logs**: Records of failed authentication attempts or unauthorized access to protected routes. +* **Lifecycle Logs**: Critical events such as service registration, heartbeat failures, or Modulith startup/shutdown. + +## Examples of Using Telemetry Data in API ML + +How a system administrator interacts with this data depends on the visualization tool used (e.g., Grafana, Jaeger, or Broadcom WatchTower). + +### Example 1: High-Level Health Monitoring (Metrics) +A system administrator views a Grafana dashboard. The administrator notices a spike in **`apiml.request.errors`**. 
+* **The View**: A red line graph shows a sudden jump from 0% to 15% error rate. +* **The Insight**: By filtering the dashboard using the attribute **`zos.smf.id`**, the admin realizes the errors are only occurring on **LPAR1**, while **LPAR2** remains healthy. This suggests a local configuration or connectivity issue on a specific system rather than a global software bug. + + +### Example 2: Latency Troubleshooting (Traces) +A user reports that a specific API is "timing out." The admin finds the relevant **`traceId`** in the logs and opens it in a trace viewer. +* **The View**: A "Gantt chart" style visualization of the request. +* **The Insight**: + * `apiml.gateway.total`: 2005ms + * `apiml.auth.check`: 5ms + * `apiml.backend.proxy`: 2000ms +* **The Action**: The admin sees that the Modulith itself only spent 5ms on logic, but waited 2 seconds for the backend mainframe service to respond. The admin can now confidently contact the specific backend service team. -* **JVM Metrics:** Memory usage (heap/non-heap), Garbage Collection (GC) frequency and duration, thread counts, and class loading. -* **System Metrics:** CPU usage (System vs. Process) and File Descriptor usage. -* **HTTP Metrics:** Request latency, throughput, and error rates (4xx/5xx) for all API traffic passing through the Modulith. 
From 66aa80d2333f830a959d4356ed254cb8d76e31b5 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Fri, 23 Jan 2026 12:21:55 +0100 Subject: [PATCH 16/46] add deployment attributes info Signed-off-by: Andrew Jandacek --- .../configuring-otel-deployment-attributes.md | 26 ++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md index 0bf21ffe00..1f9bcaab9e 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md @@ -1,2 +1,26 @@ -# Configure OpenTelemetry Deployment Attributes +# Configuring OpenTelemetry Deployment Attributes +Configure deployment-specific resource attributes for the Zowe API ML. These attributes allow you to categorize telemetry data based on the lifecycle stage of the application, such as distinguishing between production, staging, or development environments. + +Unlike z/OS attributes, which are often discovered automatically, deployment attributes are strictly informative and are typically defined manually. These attributes do not affect the unique identity of the service but are essential for filtering and grouping data within your observability backend. By explicitly labeling your environment, you ensure that performance anomalies in a test environment do not trigger false alerts in production monitoring views. + +## Deployment Attribute Reference + +The following attribute is used to describe the single-service deployment of API ML: + +* **deployment.environment.name** + Specifies the name of the deployment environment (Example: dev, test, staging, or production).
Configuration Source: zowe.yaml + +## Configuration Example in zowe.yaml + +To set the deployment environment, add the `deployment.environment.name` key to the `resource.attributes` section of your zowe.yaml file. + +``` +zowe: + observability: + enabled: true + resource: + attributes: + # Deployment Attribute (Manual Entry) + deployment.environment.name: "production" +``` \ No newline at end of file From bff2796ed6e2ca522cac20e4222fde0ff9aadc94 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Fri, 23 Jan 2026 12:24:11 +0100 Subject: [PATCH 17/46] add comment Signed-off-by: Andrew Jandacek --- .../apiml-provided-observability-signals-and-attributes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md index 97b956631b..a734f23f1a 100644 --- a/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md +++ b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md @@ -1,6 +1,6 @@ # API ML Provided Observability Signals and Attributes - +**TODO: Dev to provide Actual Signals and Attributes** From 89c6c2d6613f3d9c3887c5f49a0173a71131d10f Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Mon, 26 Jan 2026 10:11:53 +0100 Subject: [PATCH 18/46] Update enabling-observability-in-zowe.yaml.md --- .../observability/enabling-observability-in-zowe.yaml.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md index fed7b26486..f445dd9e97 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md +++ 
b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md @@ -7,10 +7,10 @@ Review how to enable and configure the OpenTelemetry (OTel) integration within t The observability configuration is located under the API Mediation Layer `component` section of the zowe.yaml, under which there are three observability properties: * **enabled** - Activates the OTel SDK. Set to `true` to initialize the OpenTelemetry SDK. + Activates the OTel SDK. Set to `true` to initialize the OpenTelemetry SDK. * **exporter** -Defines where the data is sent. +Defines where the data is sent. Sub-properties of `exporter` include the following: * **exporter.otlp.protocol** The URL of your OTLP-compatible collector (e.g., z-Iris or Jaeger) From 98fdae42ce8e56f29ebbb0e66b6171aa04431688 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Mon, 26 Jan 2026 10:35:15 +0100 Subject: [PATCH 19/46] refactor intro and move examples to Using OTel Signed-off-by: Andrew Jandacek --- ...g-apiml-observability-via-opentelemetry.md | 86 ++++++++++--------- .../observability/using-your-otel-metrics.md | 20 +++++ 2 files changed, 64 insertions(+), 42 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 01fba5bd2c..ce4bf24ad5 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -1,14 +1,12 @@ # Configuring API ML Observability via OpenTelemetry -Observability of functionalities in the Zowe API Mediation Layer (API ML) can be provided through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data that describe runtime behavior, request processing, and service interactions. 
This observability data can be collected and exported to supported analysis tools, thereby making it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. +Enable observability of functionalities in the Zowe API Mediation Layer (API ML) through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data that describe runtime behavior, request processing, and service interactions. :::info Required role: System administrator ::: -Observability can be enabled and configured using API ML and OpenTelemetry settings in the zowe.yaml file. You can also specify where data is exported. - -By leveraging the OpenTelemetry (OTel) standard, API ML allows system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like Prometheus, Grafana, or Jaeger. +API ML observability uses the OpenTelemetry (OTel) standard to enable system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like Prometheus, Grafana, or Jaeger. These analysis tools make it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. :::note Observability features are available exclusively for the API ML single-service deployment. These features are not supported in the legacy microservice-based architecture of API ML. @@ -16,11 +14,9 @@ Observability features are available exclusively for the API ML single-service d ## Resource Attributes -In OpenTelemetry, a **Resource** represents the entity producing telemetry. For Zowe, this is the API ML single-service instance.
Every signal (metric/trace/log) produced carries a set of attributes that identify a specific instance. - -OpenTelemetry resource attributes for the Zowe API ML are organized into three logical groups: Service, Deployment, and z/OS. +A **Resource** in OpenTelemetry represents the entity producing telemetry. For Zowe, this is the API ML single-service instance. Every _signal_ (metric/trace/log) produced carries a set of attributes that identify a specific instance. -This categorization follows the [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/resource/) to ensure that the telemetry produced by Zowe is consistent with industry standards and easily consumable by monitoring backends. +OpenTelemetry resource attributes for the Zowe API ML are organized into three logical groups of attributes: Service, Deployment, and z/OS. This categorization follows the [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/resource/) to ensure that the telemetry produced by Zowe is consistent with industry standards and easily consumable by monitoring backends. ### Attribute Categories @@ -41,7 +37,7 @@ For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attribute ## Telemetry Data Produced -The API ML produces a range of telemetry data. By default, the OpenTelemetry integration captures performance, health, and interaction data made available through resource attributes configured in your `zowe.yaml`. +The API ML produces a range of telemetry data. By default, the OpenTelemetry integration captures performance, health, and interaction data made available through resource attributes configured in your `zowe.yaml`. You can also specify where data is exported.
## Telemetry Signal Categories @@ -51,54 +47,60 @@ Observability in the API ML is built on the interaction between Signals and Reso * [Traces](https://opentelemetry.io/docs/concepts/signals/traces/) (request journeys) * [Logs](https://opentelemetry.io/docs/concepts/signals/logs/) (event records). -Each of these types of telemetry data represent a specific category of observation from a system. Every signal is automatically created based on Resource Attributes. These attributes act as a common identity, whereby data is categorized into Service (logical identity), z/OS (system and hardware context), and Deployment (environment tier). This categorization approach ensures that all telemetry is "mainframe-aware" allowing administrators to filter, group, and correlate data across the entire Sysplex using standard observability tools. +Each of these types of telemetry data represent a specific category of observation from a system. Every signal is automatically created based on the previously described Resource Attributes. These attributes act as a common identity, whereby data is categorized into Service (logical identity), Deployment (environment tier), and z/OS (system and hardware context). This categorization approach ensures that all telemetry is "mainframe-aware" allowing administrators to filter, group, and correlate data across the entire Sysplex using standard observability tools. ### Metrics (Runtime Behavior & Health) Metrics provide numerical data used to track trends and trigger alerts. -* **JVM & System Metrics** - * **`process.runtime.jvm.memory.usage`**: Current utilization of heap and non-heap memory. - * **`process.runtime.jvm.gc.duration`**: Time spent in Garbage Collection, critical for identifying "stop-the-world" pauses. - * **`system.cpu.utilization`**: CPU usage percentage for the process and the overall LPAR. +* **JVM & System Metrics:** + * **process.runtime.jvm.memory.usage** + Current utilization of heap and non-heap memory. 
-* **Request Processing Metrics** - * **`apiml.request.count`**: A counter of all incoming requests, categorized by `http.method` and `http.status_code`. - * **`apiml.request.duration`**: A histogram measuring the total time spent within the Modulith for each request. - * **`apiml.active.requests`**: A gauge showing the current number of concurrent requests being processed. + * **process.runtime.jvm.gc.duration** + Time spent in Garbage Collection, which helps identify critical pauses. -### Traces (Service Interactions) -Traces record the path of a request as it traverses the API ML. + * **system.cpu.utilization** + CPU usage percentage for the process and the overall LPAR. -* **Gateway Spans**: Measures the entry point latency and the time taken to proxy the request to a backend service. -* **Authentication Spans**: Tracks the duration of security checks (e.g., SAF, JWT validation, or ZSS calls). -* **Discovery Spans**: Records the time taken to resolve a service ID to a specific physical URL. +* **Request Processing Metrics:** + * **apiml.request.count** + A counter of all incoming requests, categorized by `http.method` and + `http.status_code`. -### Logs (System Events) -Logs provide the "why" behind errors or changes in state. + * **apiml.request.duration** + A histogram measuring the total time spent within the Modulith for each request. + + * **apiml.active.requests** + A gauge showing the current number of concurrent requests being processed. + +:::note +For examples of usability of OpenTelemetry metrics, see [Using your API ML OpenTelemetry metrics](using-your-otel-metrics.md). +::: + + +### Traces (Service Interactions) +Traces record the path of a request as it traverses the API ML. -* **Access Logs**: High-volume logs detailing every request, including the `traceId` for correlation with traces. -* **Security Logs**: Records of failed authentication attempts or unauthorized access to protected routes.
-* **Lifecycle Logs**: Critical events such as service registration, heartbeat failures, or Modulith startup/shutdown. +* **Gateway Spans** +Measures the entry point latency and the time taken to proxy the request to a backend service. -## Examples of Useability of Telemetry data in API ML +* **Authentication Spans** +Tracks the duration of security checks (e.g., SAF, JWT validation, or ZSS calls). -How a system administrator interacts with this data depends on the visualization tool used (e.g., Grafana, Jaeger, or Broadcom WatchTower). +* **Discovery Spans** +Records the time taken to resolve a service ID to a specific physical URL. -### Example 1: High-Level Health Monitoring (Metrics) -A system administrator views a Grafana dashboard. The administrator notices a spike in **`apiml.request.errors`**. -* **The View**: A red line graph shows a sudden jump from 0% to 15% error rate. -* **The Insight**: By filtering the dashboard using the attribute **`zos.smf.id`**, the admin realizes the errors are only occurring on **LPAR1**, while **LPAR2** remains healthy. This suggests a local configuration or connectivity issue on a specific system rather than a global software bug. +### Logs (System Events) +Logs provide the "why" behind errors or changes in state. +* **Access Logs** +High-volume logs detailing every request, including the `traceId` for correlation with traces. -### Example 2: Latency Troubleshooting (Traces) -A user reports that a specific API is "timing out." The admin finds the relevant **`traceId`** in the logs and opens it in a trace viewer. -* **The View**: A "Gantt chart" style visualization of the request. -* **The Insight**: - * `apiml.gateway.total`: 2005ms - * `apiml.auth.check`: 5ms - * `apiml.backend.proxy`: 2000ms -* **The Action**: The admin sees that the Modulith itself only spent 5ms on logic, but waited 2 seconds for the backend mainframe service to respond. The admin can now confidently contact the specific backend service team. 
+* **Security Logs** +Records of failed authentication attempts or unauthorized access to protected routes. +* **Lifecycle Logs** +Critical events such as service registration, heartbeat failures, or Modulith startup/shutdown. diff --git a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md index afe340349d..a555bbbd3a 100644 --- a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md +++ b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md @@ -1,2 +1,22 @@ # Using Your API ML OpenTelemetry Metrics +## Examples of Usability of Telemetry Data in API ML + +How a system administrator interacts with this data depends on the visualization tool used (e.g., Grafana, Jaeger, or Broadcom WatchTower). + +### Example 1: High-Level Health Monitoring (Metrics) +A system administrator views a Grafana dashboard. The administrator notices a spike in **`apiml.request.errors`**. +* **The View**: A red line graph shows a sudden jump from 0% to 15% error rate. +* **The Insight**: By filtering the dashboard using the attribute **`zos.smf.id`**, the admin realizes the errors are only occurring on **LPAR1**, while **LPAR2** remains healthy. This suggests a local configuration or connectivity issue on a specific system rather than a global software bug. + + +### Example 2: Latency Troubleshooting (Traces) +A user reports that a specific API is "timing out." The admin finds the relevant **`traceId`** in the logs and opens it in a trace viewer. +* **The View**: A "Gantt chart" style visualization of the request. +* **The Insight**: + * `apiml.gateway.total`: 2005ms + * `apiml.auth.check`: 5ms + * `apiml.backend.proxy`: 2000ms +* **The Action**: The admin sees that the Modulith itself only spent 5ms on logic, but waited 2 seconds for the backend mainframe service to respond. The admin can now confidently contact the specific backend service team.
+ + From 80f1629b8db469aa7d365eefec5ad005f15dcabd Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Mon, 26 Jan 2026 15:49:06 +0100 Subject: [PATCH 20/46] add Otel install files to sidebars.js Signed-off-by: Andrew Jandacek --- ...yaml.md => enabling-observability-in-zowe-yaml.md} | 0 sidebars.js | 11 +++++++++++ 2 files changed, 11 insertions(+) rename docs/user-guide/api-mediation/observability/{enabling-observability-in-zowe.yaml.md => enabling-observability-in-zowe-yaml.md} (100%) diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md similarity index 100% rename from docs/user-guide/api-mediation/observability/enabling-observability-in-zowe.yaml.md rename to docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md diff --git a/sidebars.js b/sidebars.js index c77fe6a95c..88195fa864 100644 --- a/sidebars.js +++ b/sidebars.js @@ -335,6 +335,17 @@ module.exports = { "extend/extend-apiml/api-mediation-redis" ] }, + { + "type": "category", + "label": "Configuring storage for the Caching service", + "link": { "type": "doc", "id": "user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry" }, + "items": [ + "user-guide/api-mediation/observability/configuring-otel-service-attributes", + "user-guide/api-mediation/observability/configuring-otel-deployment-attributes", + "user-guide/api-mediation/observability/configuring-otel-zos-attributes" + "user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml" + ] + } "user-guide/api-mediation/configuration-customizing-the-api-catalog-ui", "user-guide/api-mediation/configuration-logging", "user-guide/api-mediation/wto-message-on-startup", From 76ef8646bdc034b040460e40290378e0e739100f Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Mon, 26 Jan 2026 15:57:32 +0100 Subject: [PATCH 21/46] add using OTel metrics and sub-topics to 
sidebar Signed-off-by: Andrew Jandacek --- sidebars.js | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/sidebars.js b/sidebars.js index 88195fa864..d3c5f27b7a 100644 --- a/sidebars.js +++ b/sidebars.js @@ -337,7 +337,7 @@ module.exports = { }, { "type": "category", - "label": "Configuring storage for the Caching service", + "label": "Configuring API ML Observability via OpenTelemetry", "link": { "type": "doc", "id": "user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry" }, "items": [ "user-guide/api-mediation/observability/configuring-otel-service-attributes", @@ -538,6 +538,14 @@ module.exports = { "user-guide/api-mediation-change-password-via-catalog", ], }, + { + type: "category", + label: "Using your API ML OpenTelemetry metrics", + items: [ + "user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes", + "user-guide/api-mediation/observability/sample-output-from-apiml-otel", + ], + }, "user-guide/api-mediation/api-mediation-update-password", "user-guide/api-mediation/api-mediation-smf", ], From 1cc90f5dd2a55eb4e54146d89322145dec82063b Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Mon, 26 Jan 2026 16:03:33 +0100 Subject: [PATCH 22/46] fix syntax Signed-off-by: Andrew Jandacek --- sidebars.js | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sidebars.js b/sidebars.js index d3c5f27b7a..ec28462804 100644 --- a/sidebars.js +++ b/sidebars.js @@ -345,7 +345,7 @@ module.exports = { "user-guide/api-mediation/observability/configuring-otel-zos-attributes" "user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml" ] - } + }, "user-guide/api-mediation/configuration-customizing-the-api-catalog-ui", "user-guide/api-mediation/configuration-logging", "user-guide/api-mediation/wto-message-on-startup", From 2b4459c388782ce9b324a9a40f01d9ef0e340b68 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Mon, 26 Jan 2026 16:12:04 +0100 Subject: 
[PATCH 23/46] fix syntax Signed-off-by: Andrew Jandacek --- .../configuring-otel-service-attributes.md | 10 ++++++---- sidebars.js | 5 +++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index a465f50cdf..8ac4fc7a27 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -18,13 +18,13 @@ The following attributes are required to define the logical identity of the API * **service.name** (Required) Logical name of the service. Must be the same for all instances within the same HA deployment. Expected to be globally unique if `namespace` is not defined. -* **service.instance.id** (Required) +* **service.instance.id** (Required) Must be unique for each instance of `service.name` and `service.namespace` pair. Automatically generated UUID is generally recommended to ensure uniqueness. -* **service.namespace** (Required) +* **service.namespace** (Required) The assigned value should help distinguish a group of services, such as the LPAR, or owner team. `service.name` is expected to be unique within the same `namespace`. -* **service.version** (Required) +* **service.version** (Required) The exact version of the service artifact, typically a semantic version (e.g., 1.2.3) or a build hash, used to identify the specific software release. 
#### Configuration Example (`zowe.yaml`) @@ -36,5 +36,7 @@ zowe: attributes: service.name: "zowe-apiml" service.namespace: "mainframe-production" - # service.instance.id: "optional-custom-id" + service.instance.id: "optional-custom-id" + service.namespace: "optional-namespace" + service.version: "optional-version-number" diff --git a/sidebars.js b/sidebars.js index ec28462804..f79d11f64e 100644 --- a/sidebars.js +++ b/sidebars.js @@ -342,7 +342,7 @@ module.exports = { "items": [ "user-guide/api-mediation/observability/configuring-otel-service-attributes", "user-guide/api-mediation/observability/configuring-otel-deployment-attributes", - "user-guide/api-mediation/observability/configuring-otel-zos-attributes" + "user-guide/api-mediation/observability/configuring-otel-zos-attributes", "user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml" ] }, @@ -541,9 +541,10 @@ module.exports = { { type: "category", label: "Using your API ML OpenTelemetry metrics", + link: { "type": "doc", "id": "user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry" }, items: [ "user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes", - "user-guide/api-mediation/observability/sample-output-from-apiml-otel", + "user-guide/api-mediation/observability/sample-output-from-apiml-otel" ], }, "user-guide/api-mediation/api-mediation-update-password", From f67b2760306aff46fba7783946c0c7fb1929ae7d Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 10:52:56 +0100 Subject: [PATCH 24/46] address comments Signed-off-by: Andrew Jandacek --- ...g-apiml-observability-via-opentelemetry.md | 12 +++---- .../configuring-otel-service-attributes.md | 36 ++++++++++++++++++- .../enabling-observability-in-zowe-yaml.md | 18 ++++++++-- 3 files changed, 55 insertions(+), 11 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md 
b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index ce4bf24ad5..9b093f3ad5 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -21,25 +21,23 @@ OpenTelemetry resource attributes for the Zowe API ML are organized into three l ### Attribute Categories * **Service Attributes** -These identify the logical entity producing the data. They are used to group telemetry from all instances of the API ML into a single "service" view in your monitoring tools. +These attributes define the logical identity of your application. The `service.name` allows you to group multiple instances into a single functional view (for example, "North-Region-APIML"). However, you can also use additional attributes like `service.instance.id` or `service.namespace` to distinguish between different installations or individual jobs. Configuring these sub-parameters ensures you can monitor the health of the entire API ecosystem while still being able to identify issues within a specific LPAR or geographic site. For details about Service Attributes, see [Configuring OpenTelemetry Service Attributes](configuring-otel-service-attributes.md). * **Deployment Attributes:** -These describe the lifecycle stage of the service. They allow you to filter telemetry data by environment (e.g., distinguishing production issues from test environment noise). +These attributes describe the lifecycle stage of the service. They allow you to filter telemetry data by environment (e.g., distinguishing production issues from test environment noise). For details about Deployment Attributes, see [Configuring OpenTelemetry Deployment Attributes](configuring-otel-deployment-attributes.md). * **z/OS Attributes** -These provide critical mainframe context. 
They identify the specific physical and logical environment (LPAR, Sysplex, and OS version) where the process is running, which is essential for mainframe-specific performance analysis. +These attributes provide critical mainframe context. They identify the specific physical and logical environment (LPAR, Sysplex, and OS version) where the process is running, which is essential for mainframe-specific performance analysis. For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attributes](configuring-otel-zos-attributes.md). -## Telemetry Data Produced +## Telemetry Signals -The API ML produces a range of telemetry data. By default, the OpenTelemetry integration captures performance, health, and interaction data made available through resource attributes configured in your `zowe.yaml`. You can also specify where data is exported. - -## Telemetry Signal Categories +The API ML produces a range of telemetry data referred to as _signals_. By default, the OpenTelemetry integration captures performance, health, and interaction signals, which are enriched with the resource attributes configured in your zowe.yaml to provide environmental context. You can also specify where data is exported. Observability in the API ML is built on the interaction between Signals and Resource Attributes. A _signal_, defined as a discrete stream of telemetry data, is represented by any one of three types of telemetry data: diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index 8ac4fc7a27..ded40c2191 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -2,6 +2,40 @@ Services are identified via the `service.name` and `service.namespace` properties. These properties create a unique identity for API ML instances. 
+In complex enterprise environments, you likely have multiple API ML installations across different Sysplexes or data centers. To monitor these effectively, you must balance Logical Grouping (seeing all API ML traffic together) with Instance Differentiation (identifying exactly which installation is acting up). + +## The Hierarchy of Identification +To achieve this, OpenTelemetry uses a three-tier approach to service identity: + +service.name (The Group): Identifies the overall function. Use this to group all instances that perform the same role (e.g., acme-apiml-production). + +service.namespace (The Installation): Identifies a specific deployment or site. Use this to separate different installations, such as us-east-1 vs. us-west-1, or sysplex-a vs. sysplex-b. + +service.instance.id (The Individual): Identifies the specific process or Address Space. On z/OS, this is often mapped to the Job Name or ASID. + +**Example of Multi-Sysplex Deployment** +Imagine you have two API ML installations: one in Site 1 and one in Site 2. Each installation has two instances for high availability. + +| Attribute | Instance 1 (Site 1) | Instance 2 (Site 1) | Instance 3 (Site 2) | +| :--- | :--- | :--- | :--- | +| **service.name** | `zowe-apiml` | `zowe-apiml` | `zowe-apiml` | +| **service.namespace** | `site-1` | `site-1` | `site-2` | +| **service.instance.id** | `ZOWEAPIML1` | `ZOWEAPIML2` | `ZOWEAPIML3` | + +Example zowe.yaml: +``` +zowe: + observability: + enabled: true + resource: + attributes: + # The common logical group + service.name: "zowe-apiml" + # The specific installation or site + service.namespace: "site-2" + # The unique identifier for this specific job/process + service.instance.id: "ZOWEAPIML3" +``` **Naming Conventions:** Provide guidance on naming services (e.g., zowe-apiml) to ensure consistency across HA (High Availability) deployments. 
@@ -12,7 +46,7 @@ Services are identified via the `service.name` and `service.namespace` propertie ### Required Service Attributes -The following attributes are required to define the logical identity of the API ML. These attributes are automatically appended to all telemetry signals (metrics, traces, and logs) produced by the resource: +The following attributes are required to define the logical identity of the API ML. These attributes, as with all attributes, are automatically appended to all telemetry signals (metrics, traces, and logs) produced by the resource: * **service.name** (Required) diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md index f445dd9e97..457bc825df 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md @@ -23,7 +23,16 @@ Defines where the data is sent. Sub-properties of `exporter` include the follow Defines the identity of the producer (Attributes). * **resource.attributes** - A collection of key-value pairs used to identify the telemetry source. + A collection of key-value pairs used to identify the telemetry source. See the following sub-properties of `resource.attributes`: + + * **service.name** + Logical name of the service. Must be the same for all instances within the same HA deployment. Expected to be globally unique if `namespace` is not defined. + + * **service.namespace** + The assigned value should help distinguish a group of services, such as the LPAR, or owner team. `service.name` is expected to be unique within the same `namespace`. + + * **deployment.environment.name** + Specifies the name of the deployment environment (Example: dev, test, staging, or production). 
Configuration Source: zowe.yaml To enable observability, configure the OpenTelemetry exporter and resource attributes within your `zowe.yaml` file with the following structure: @@ -45,8 +54,11 @@ zowe: ## How the Export Works -When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers internal metrics (like JVM heap or request latency) and bundles these metrics with the Resource Attributes defined in your config. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint. +When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers all signals, including JVM heap or request latency, and bundles these signals with all Resource Attributes. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint. :::note -If the endpoint is unreachable, API ML logs a warning, but service traffic is not interrupted. It is recommended to use a local OTel collector to minimize network latency. +If the endpoint is unreachable, API ML logs a warning, but service traffic is not interrupted. It is recommended to use a local OTel collector to minimize network latency. For information about the OTel collector, see [Quick start](https://opentelemetry.io/docs/collector/quick-start/) in the OpenTelemetry documentation. + +For the OTel official download, see [OpenTelemetry Collector Releases](https://github.com/open-telemetry/opentelemetry-collector-releases/releases) +For z/OS environments, you would typically look for the Linux on Z versions if running in a containerized environment, or check specific vendor distributions if running natively. 
::: \ No newline at end of file From d71bd68875471ecb23c780efb837d7c6d629b44a Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 10:57:47 +0100 Subject: [PATCH 25/46] add inline note to dev Signed-off-by: Andrew Jandacek --- .../configuring-apiml-observability-via-opentelemetry.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 9b093f3ad5..0343814c73 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -48,6 +48,8 @@ Observability in the API ML is built on the interaction between Signals and Reso Each of these types of telemetry data represent a specific category of observation from a system. Every signal is automatically created based on the previously described Resource Attributes. These attributes act as a common identity, whereby data is categorized into Service (logical identity), Deployment (environment tier), and z/OS (system and hardware context). This categorization approach ensures that all telemetry is "mainframe-aware" allowing administrators to filter, group, and correlate data across the entire Sysplex using standard observability tools. ### Metrics (Runtime Behavior & Health) + + Metrics provide numerical data used to track trends and trigger alerts. * **JVM & System Metrics:** @@ -79,6 +81,8 @@ For examples of usability of OpenTelemetry metrics, see [Using your API ML OpenT ### Traces (Service Interactions) Traces record the path of a request as it traverses the API ML. + + * **Gateway Spans** Measures the entry point latency and the time taken to proxy the request to a backend service. 
From 38d7ae4cf5c6e96b8a1110eaca5d72904e30d805 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 12:12:51 +0100 Subject: [PATCH 26/46] refactor descriptions Signed-off-by: Andrew Jandacek --- ...onfiguring-apiml-observability-via-opentelemetry.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 0343814c73..e7885c42d3 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -35,17 +35,19 @@ These attributes provide critical mainframe context. They identify the specific For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attributes](configuring-otel-zos-attributes.md). -## Telemetry Signals +## Telemetry Signals and Observability -The API ML produces a range of telemetry data referred to as _signals_. By default, the OpenTelemetry integration captures performance, health, and interaction signals, which are enriched with the resource attributes configured in your zowe.yaml to provide environmental context. You can also specify where data is exported. +The API ML produces a range of telemetry data referred to as _signals_. A _signal_, defined as a discrete stream of telemetry data, is represented by any one of three types: metrics, traces, and logs, which are described in more detail in this section. By default, the OpenTelemetry integration captures performance, health, and interaction signals, which are enriched with the resource attributes configured in your zowe.yaml to provide environmental context. You can also specify where data is exported. 
Observability is achieved through the combination of telemetry signals, which quantify the real-time state and activity of the system, and resource attributes, which provide the structural labels necessary to organize and interpret those signals. -Observability in the API ML is built on the interaction between Signals and Resource Attributes. A _signal_, defined as a discrete stream of telemetry data, is represented by any one of three types of telemetry data: +Signals can be any of the following signal types: * [Metrics](https://opentelemetry.io/docs/concepts/signals/metrics/) (performance tracking) * [Traces](https://opentelemetry.io/docs/concepts/signals/traces/) (request journeys) * [Logs](https://opentelemetry.io/docs/concepts/signals/logs/) (event records). -Each of these types of telemetry data represent a specific category of observation from a system. Every signal is automatically created based on the previously described Resource Attributes. These attributes act as a common identity, whereby data is categorized into Service (logical identity), Deployment (environment tier), and z/OS (system and hardware context). This categorization approach ensures that all telemetry is "mainframe-aware" allowing administrators to filter, group, and correlate data across the entire Sysplex using standard observability tools. +Each of these signal types represents a specific category of observation from a system. Every signal is automatically enriched based on resource attributes. These attributes act as a common identity, whereby data is categorized into Service (logical identity), Deployment (environment tier), and z/OS (system and hardware context). This categorization approach ensures that all telemetry is "mainframe-aware", allowing administrators to filter, group, and correlate data across the entire Sysplex using standard observability tools.
+ +While these signals are enriched with mainframe-aware context when running on z/OS, API ML can also have full observability when deployed on other platforms such as Linux or within containerized environments. In these non-z/OS scenarios, the discovery engine automatically applies standard OpenTelemetry semantic conventions, capturing metadata like host names. This flexibility ensures that regardless of the underlying infrastructure, the telemetry signals remain consistent and actionable across your monitoring stack. ### Metrics (Runtime Behavior & Health) From e7cc6158865685aaa4f038b7ce741327a397d937 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 12:21:05 +0100 Subject: [PATCH 27/46] add info note for explanation Signed-off-by: Andrew Jandacek --- .../configuring-apiml-observability-via-opentelemetry.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index e7885c42d3..2c71862f02 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -49,6 +49,15 @@ Each of these signal types represent a specific category of observation from a s While these signals are enriched with mainframe-aware context when running on z/OS, API ML can also have full observability when deployed on other platforms such as Linux or within containerized environments. In these non-z/OS scenarios, the discovery engine automatically applies standard OpenTelemetry semantic conventions, capturing metadata like host names. This flexibility ensures that regardless of the underlying infrastructure, the telemetry signals remain consistent and actionable across your monitoring stack. 
+:::info How to understand Signals vs Resources +To better understand the relationship between signals and resources, it is useful to consider this in the context of activity vs. organization: + +* The **Signal** provides an indicator of success or failure (e.g., response times, error counts). +* The **Resource Attributes** provide the means used to unlock that evidence (e.g., sorting by a specific LPAR or a specific production site). + +Taken together, the signal tells you that a problem exists, while the resource attributes allow you to isolate exactly where that problem is occurring within your infrastructure. +::: + ### Metrics (Runtime Behavior & Health) From 770f30f9d40728b97d393d23ec188ce44f7c5e02 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 13:25:00 +0100 Subject: [PATCH 28/46] add OTel link Signed-off-by: Andrew Jandacek --- .../configuring-apiml-observability-via-opentelemetry.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 2c71862f02..2befee5ade 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -1,6 +1,6 @@ # Configuring API ML Observability via OpenTelemetry -Enable observability of functionalities in the Zowe API Mediation Layer (API ML) through integration with OpenTelemetry (OTel). This integration enables API ML to produce observability data that describe runtime behavior, request processing, and service interactions. +Enable observability of functionalities in the Zowe API Mediation Layer (API ML) through integration with [OpenTelemetry (OTel)](https://opentelemetry.io/). 
This integration enables API ML to produce observability data that describe runtime behavior, request processing, and service interactions. :::info Required role: System administrator From d90a03046928b02dbbc1fab7e5481f65cc1ea5f0 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 14:58:43 +0100 Subject: [PATCH 29/46] improve note Signed-off-by: Andrew Jandacek --- ...figuring-apiml-observability-via-opentelemetry.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 2befee5ade..225be87902 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -49,14 +49,14 @@ Each of these signal types represent a specific category of observation from a s While these signals are enriched with mainframe-aware context when running on z/OS, API ML can also have full observability when deployed on other platforms such as Linux or within containerized environments. In these non-z/OS scenarios, the discovery engine automatically applies standard OpenTelemetry semantic conventions, capturing metadata like host names. This flexibility ensures that regardless of the underlying infrastructure, the telemetry signals remain consistent and actionable across your monitoring stack. -:::info How to understand Signals vs Resources -To better understand the relationship between signals and resources, it is useful to consider this in the context of activity vs. organization: +:::info How to understand Signals vs Resources -* The **Signal** provides an indicator of success or failure (e.g., response times, error counts). 
-* The **Resource Attributes** provide the means used to unlock that evidence (e.g., sorting by a specific LPAR or a specific production site). +To better understand the relationship between signals and resources, it is useful to consider the analogy of a Shipping Package and its Label: -Taken together, the signal tells you that a problem exists, while the resource attributes allow you to isolate exactly where that problem is occurring within your infrastructure. -::: +* The **Signal** is the contents of the package. It contains the actual "goods"—the specific data about an event, such as a log message, a trace of a request, or a performance metric. +* The **Resource Attributes** are the shipping label fixed to the outside of the package. The label doesn't change the contents, but it tells you exactly where the package originated (e.g., the specific LPAR, Sysplex, or Service Name). + +Taken together, the Signal provides the evidence of what happened (the "what"), while the Resource Attributes provide the context of where it happened (the "where"). Without the label, the data is just a pile of anonymous packages; with the label, you can immediately sort and filter your data to isolate issues in specific parts of your infrastructure. 
::: ### Metrics (Runtime Behavior & Health) From 1ffd218105b2804a50820e43bb562a9aeed702a0 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 15:15:25 +0100 Subject: [PATCH 30/46] fix syntax Signed-off-by: Andrew Jandacek --- .../configuring-apiml-observability-via-opentelemetry.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 225be87902..e58c8013e2 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -56,7 +56,9 @@ To better understand the relationship between signals and resources, it is usefu * The **Signal** is the contents of the package. It contains the actual "goods"—the specific data about an event, such as a log message, a trace of a request, or a performance metric. * The **Resource Attributes** are the shipping label fixed to the outside of the package. The label doesn't change the contents, but it tells you exactly where the package originated (e.g., the specific LPAR, Sysplex, or Service Name). -Taken together, the Signal provides the evidence of what happened (the "what"), while the Resource Attributes provide the context of where it happened (the "where"). Without the label, the data is just a pile of anonymous packages; with the label, you can immediately sort and filter your data to isolate issues in specific parts of your infrastructure. ::: +Taken together, the Signal provides the evidence of what happened (the "what"), while the Resource Attributes provide the context of where it happened (the "where"). 
Without the label, the data is just a pile of anonymous packages; with the label, you can immediately sort and filter your data to isolate issues in specific parts of your infrastructure. + +::: ### Metrics (Runtime Behavior & Health) From 4caee5c649ab3c780a3c4d92a32f2c925aff9404 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 15:24:04 +0100 Subject: [PATCH 31/46] add links to uis and fix formatting Signed-off-by: Andrew Jandacek --- .../configuring-apiml-observability-via-opentelemetry.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index e58c8013e2..1db2a8b8d1 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -6,7 +6,7 @@ Enable observability of functionalities in the Zowe API Mediation Layer (API ML) Required role: System administrator ::: -API ML observability uses the OpenTelemetry (OTel) standard to enable system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like Prometheus, Grafana, or Jaeger. These anaysis tools make it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. +API ML observability uses the OpenTelemetry (OTel) standard to enable system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like [Prometheus](https://prometheus.io/), [Grafana](https://grafana.io/), or [Jaeger](https://www.jaegertracing.io/). 
These analysis tools make it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. :::note Observability features are available exclusively for the API ML single-service deployment. These features are not supported in the legacy microservice-based architecture of API ML. @@ -37,7 +37,7 @@ For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attribute ## Telemetry Signals and Observability -The API ML produces a range of telemetry data referred to as _signals_. A _signal_, defined as a discrete stream of telemetry data, is represented by any one of three types: metrics, traces, and logs, which are described in more detail in this section. By default, the OpenTelemetry integration captures performance, health, and interaction signals, which are enriched with the resource attributes configured in your zowe.yaml to provide environmental context. You can also specify where data is exported. Observability is achieved through the combination of telemetry signals, which quantify the real-time state and activity of the system, and resource attributes, which provide the structural labels necessary to organize and interpret those signals. +The API ML produces a range of telemetry data referred to as _signals_. A signal, defined as a discrete stream of telemetry data, is represented by any one of three types: metrics, traces, and logs, which are described in more detail in this section. By default, the OpenTelemetry integration captures performance, health, and interaction signals, which are enriched with the resource attributes configured in your zowe.yaml to provide environmental context. You can also specify where data is exported.
Observability is achieved through the combination of telemetry signals, which quantify the real-time state and activity of the system, and resource attributes, which provide the structural labels necessary to organize and interpret those signals. Signals can be any of the following signal types: @@ -54,7 +54,7 @@ While these signals are enriched with mainframe-aware context when running on z/ To better understand the relationship between signals and resources, it is useful to consider the analogy of a Shipping Package and its Label: * The **Signal** is the contents of the package. It contains the actual "goods"—the specific data about an event, such as a log message, a trace of a request, or a performance metric. -* The **Resource Attributes** are the shipping label fixed to the outside of the package. The label doesn't change the contents, but it tells you exactly where the package originated (e.g., the specific LPAR, Sysplex, or Service Name). +* The **Resource Attributes** are the shipping label fixed to the outside of the package. The label does not change the contents, but tells you exactly where the package originated (e.g., the specific LPAR, Sysplex, or Service Name). Taken together, the Signal provides the evidence of what happened (the "what"), while the Resource Attributes provide the context of where it happened (the "where"). Without the label, the data is just a pile of anonymous packages; with the label, you can immediately sort and filter your data to isolate issues in specific parts of your infrastructure. 
From daf14f12ba8341f379df5dcedeb694af902716e1 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 16:15:36 +0100 Subject: [PATCH 32/46] fix sidebar in using Signed-off-by: Andrew Jandacek --- sidebars.js | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sidebars.js b/sidebars.js index f79d11f64e..712d142792 100644 --- a/sidebars.js +++ b/sidebars.js @@ -541,7 +541,7 @@ module.exports = { { type: "category", label: "Using your API ML OpenTelemetry metrics", - link: { "type": "doc", "id": "user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry" }, + link: { "type": "doc", "id": "user-guide/api-mediation/observability/using-your-otel-metrics" }, items: [ "user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes", "user-guide/api-mediation/observability/sample-output-from-apiml-otel" From bfed466919d20d0e53aa76ad8ef08a7ef782e339 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 16:47:09 +0100 Subject: [PATCH 33/46] improve service attribute topic Signed-off-by: Andrew Jandacek --- .../configuring-otel-service-attributes.md | 132 ++++++++++-------- 1 file changed, 76 insertions(+), 56 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index ded40c2191..1756702389 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -1,76 +1,96 @@ # Configure OpenTelemetry Service Attributes -Services are identified via the `service.name` and `service.namespace` properties. These properties create a unique identity for API ML instances. +Services are identified via the service.name, service.namespace, and service.instance.id properties. 
Together, these attributes create a unique identity for API ML instances across your enterprise. -In complex enterprise environments, you likely have multiple API ML installations across different Sysplexes or data centers. To monitor these effectively, you must balance Logical Grouping (seeing all API ML traffic together) with Instance Differentiation (identifying exactly which installation is acting up). +In complex mainframe environments, you may have multiple API ML installations across different Sysplexes or data centers. To monitor these effectively, you must balance Logical Grouping (viewing all API ML traffic as one functional unit) with Instance Differentiation (identifying exactly which specific Address Space is experiencing an issue). ## The Hierarchy of Identification -To achieve this, OpenTelemetry uses a three-tier approach to service identity: +OpenTelemetry uses a three-tier approach to define service identity: -service.name (The Group): Identifies the overall function. Use this to group all instances that perform the same role (e.g., acme-apiml-production). +* **service.name** (The Service) +Identifies the logical name of the service. This property value should be identical for all instances across your entire organization that perform the same function (e.g., zowe-apiml). Expected to be globally unique if `namespace` is not defined. -service.namespace (The Installation): Identifies a specific deployment or site. Use this to separate different installations, such as us-east-1 vs. us-west-1, or sysplex-a vs. sysplex-b. +* **service.namespace** (The Environment/Site) +Groups services into logical sets. Use this property value to distinguish between different installations, such as sysplex-a vs. sysplex-b, or north-datacenter vs. south-datacenter. `service.name` is expected to be unique within the same `namespace`. -service.instance.id (The Individual): Identifies the specific process or Address Space. 
On z/OS, this is often mapped to the Job Name or ASID. +* **service.instance.id** (The Unique Instance) +Identifies the specific running process or Address Space. Must be unique for each instance of `service.name` and `service.namespace` pair. On z/OS, this property is typically mapped to the Job Name or a unique UUID. -**Example of Multi-Sysplex Deployment** -Imagine you have two API ML installations: one in Site 1 and one in Site 2. Each installation has two instances for high availability. + -| Attribute | Instance 1 (Site 1) | Instance 2 (Site 1) | Instance 3 (Site 2) | -| :--- | :--- | :--- | :--- | -| **service.name** | `zowe-apiml` | `zowe-apiml` | `zowe-apiml` | -| **service.namespace** | `site-1` | `site-1` | `site-2` | -| **service.instance.id** | `ZOWEAPIML1` | `ZOWEAPIML2` | `ZOWEAPIML3` | - -Example zowe.yaml: -``` -zowe: - observability: - enabled: true - resource: - attributes: - # The common logical group - service.name: "zowe-apiml" - # The specific installation or site - service.namespace: "site-2" - # The unique identifier for this specific job/process - service.instance.id: "ZOWEAPIML3" -``` +## Configuration Examples - -**Naming Conventions:** Provide guidance on naming services (e.g., zowe-apiml) to ensure consistency across HA (High Availability) deployments. +**Example 1: Single API ML Installation (High Availability)** -**Instance Tracking:** Describe the use of `service.instance.id` and how to ensure uniqueness across instances. +In this scenario, both instances share the same namespace because they belong to the same logical cluster on the same Sysplex. - +| Attribute | Instance 1 | Instance 2 | +| :--- | :--- | :--- | +| **service.name** | `zowe-apiml` | `zowe-apiml` | +| **service.namespace** | `production-plex` | `production-plex` | +| **service.instance.id** | `APIML01` | `APIML02` | -### Required Service Attributes - -The following attributes are required to define the logical identity of the API ML. 
These attributes, as with all attributes, are automatically appended to all telemetry signals (metrics, traces, and logs) produced by the resource: - - -* **service.name** (Required) -Logical name of the service. Must be the same for all instances within the same HA deployment. Expected to be globally unique if `namespace` is not defined. +**Instance 1 configuration** +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "production-plex" + service.instance.id: "APIML01" +``` +**Instance 2 configuration** +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "production-plex" + service.instance.id: "APIML02" +``` -* **service.instance.id** (Required) -Must be unique for each instance of `service.name` and `service.namespace` pair. Automatically generated UUID is generally recommended to ensure uniqueness. +## Example of Multi-Site Deployment -* **service.namespace** (Required) -The assigned value should help distinguish a group of services, such as the LPAR, or owner team. `service.name` is expected to be unique within the same `namespace`. +In this scenario, instances are separated by namespace to represent their physical data center locations. -* **service.version** (Required) -The exact version of the service artifact, typically a semantic version (e.g., 1.2.3) or a build hash, used to identify the specific software release. 
+| Attribute | Site 1 (Instance A) | Site 1 (Instance B) | Site 2 (Instance C) | +| :--- | :--- | :--- | :--- | +| **service.name** | `zowe-apiml` | `zowe-apiml` | `zowe-apiml` | +| **service.namespace** | `east-coast` | `east-coast` | `west-coast` | +| **service.instance.id** | `ZOWE-E1` | `ZOWE-E2` | `ZOWE-W1` | -#### Configuration Example (`zowe.yaml`) +**Site 1 (East Coast) Configuration:** -```yaml +``` zowe: - observability: - resource: - attributes: - service.name: "zowe-apiml" - service.namespace: "mainframe-production" - service.instance.id: "optional-custom-id" - service.namespace: "optional-namespace" - service.version: "optional-version-number" - + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "east-coast" + service.instance.id: "ZOWE-E1" +``` +**Site 2 (West Coast) Configuration:** +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "west-coast" + service.instance.id: "ZOWE-W1" +``` From 5f65bdee253e08675101ca318a632b9c24395403 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 17:03:41 +0100 Subject: [PATCH 34/46] improve zos attribute content Signed-off-by: Andrew Jandacek --- .../configuring-otel-zos-attributes.md | 22 ++++++++++++++----- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md index c93a2bff75..b4f253ff7c 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -1,10 +1,20 @@ # Configure OpenTelemetry z/OS Attributes -z/OS-specific resource attributes for API ML provide essential mainframe context to your telemetry data, allowing 
you to correlate metrics and traces with specific system identifiers such as SMF IDs, Sysplex names, and LPARs. By providing z/OS platform context, mainframe performance data can be integrated into distributed observability backends. +z/OS-specific resource attributes for API ML provide essential mainframe context to your telemetry data, allowing you to correlate metrics, traces, and logs with specific system identifiers such as SMF IDs, Sysplex names, and LPARs. By providing z/OS platform context, mainframe performance data can be integrated into distributed observability backends. -The z/OS attributes are primarily populated through an automated System Discovery process. Upon the initialization of the single-service deployment of API ML, the integrated OpenTelemetry SDK executes platform-specific calls to query z/OS Control Blocks and system variables. This process identifies the current execution environment by retrieving values such as the Address Space Identifier (ASID), which is mapped to `process.pid`, and the system release level via `D IPLINFO` for `os.version`. If these attributes are already defined in the zowe.yaml configuration file, the discovery engine treats the manual entries as overrides, ensuring that user-defined values take precedence over detected system defaults. +## How system discovery works -These attributes provide environmental context specific to the IBM z/OS platform. +The z/OS attributes are primarily populated through an automated System Discovery process that occurs during the initialization of the API ML service. The integrated OpenTelemetry SDK executes platform-specific calls to query z/OS Control Blocks (such as the CVTSNAME or ECVT) and system variables. + +This process identifies the current execution environment by retrieving values such as: + +* **Address Space Identifier (ASID):** Mapped to `process.pid`. +* **System Release Level:** Retrieved via `D IPLINFO` and mapped to `os.version`. 
+* **Job Name:** Mapped to `process.command`. + +:::note +If these attributes are manually defined in the `zowe.yaml` configuration file, the discovery engine treats the manual entries as overrides, ensuring that user-defined values take precedence over detected system defaults. +::: ## z/OS Attribute Reference @@ -19,15 +29,15 @@ The name of the SYSPLEX to which the z/OS system belongs. Configuration Source: System discovery * **mainframe.lpar.name** -Name of the logical partition (LPAR) that hosts the z/OS system. +Name of the LPAR that hosts the z/OS system. Configuration Source: System discovery * **os.type** -The operating system type, set to zos. +The operating system type, set to `zos`. Configuration Source: Static * **os.version** -The version string of the operating system (e.g., the release returned by D IPLINFO). +The version string of the operating system (e.g., the release returned by `D IPLINFO`). Configuration Source: System discovery * **process.command** From 21c6c5283177be411f00009d70eb90a9796a0f33 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Tue, 27 Jan 2026 17:04:38 +0100 Subject: [PATCH 35/46] updates to overview topic Signed-off-by: Andrew Jandacek --- .../configuring-apiml-observability-via-opentelemetry.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md index 1db2a8b8d1..6113221672 100644 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md @@ -14,14 +14,14 @@ Observability features are available exclusively for the API ML single-service d ## Resource Attributes -A **Resource** In OpenTelemetry represents the entity producing telemetry. 
For Zowe, this is the API ML single-service instance. Every _signal_ (metric/trace/log) produced carries a set of attributes that identify a specific instance. +A **Resource** In OpenTelemetry represents the entity producing telemetry. For Zowe, this is the API ML single-service instance. Every [signal](#telemetry-signals-and-observability) (metric/trace/log) produced carries a set of attributes that identify a specific instance. OpenTelemetry resource attributes for the Zowe API ML are organized into three logical groups of attributes: Service, Deployment, and z/OS. This categorization follows the [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/resource/) to ensure that the telemetry produced by Zowe is consistent with industry standards and easily consumable by monitoring backends. ### Attribute Categories * **Service Attributes** -These attributes define the logical identity of your application. The `service.name` allows you to group multiple instances into a single functional view (for example, "North-Region-APIML"). However, you can also use additional attributes like `service.instance.id` or `service.namespace` to distinguish between different installations or individual jobs. Configuring these sub-parameters ensures you can monitor the health of the entire API ecosystem while still being able to identify issues within a specific LPAR or geographic site. +These attributes define the logical identity of your application. The `service.name` allows you to group multiple instances into a single functional view (for example, "North-Region-APIML"). However, you can also use additional attributes like `service.instance.id` or `service.namespace` to distinguish between different installations or individual jobs. Configuring these sub-parameters allows you to monitor the health of the entire API ecosystem while still being able to identify issues within a specific LPAR or geographic site. 
For details about Service Attributes, see [Configuring OpenTelemetry Service Attributes](configuring-otel-service-attributes.md). From 3604057e5bbf9a07fba5d012d94fa60b479e6133 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 28 Jan 2026 11:49:08 +0100 Subject: [PATCH 36/46] remove config-apiml-observability-via-opentelemetry.md file Signed-off-by: Andrew Jandacek --- ...g-apiml-observability-via-opentelemetry.md | 123 ------------------ 1 file changed, 123 deletions(-) delete mode 100644 docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md diff --git a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md b/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md deleted file mode 100644 index 6113221672..0000000000 --- a/docs/user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry.md +++ /dev/null @@ -1,123 +0,0 @@ -# Configuring API ML Observability via OpenTelemetry - -Enable observability of functionalities in the Zowe API Mediation Layer (API ML) through integration with [OpenTelemetry (OTel)](https://opentelemetry.io/). This integration enables API ML to produce observability data that describe runtime behavior, request processing, and service interactions. - -:::info -Required role: System administrator -::: - -API ML observability uses the OpenTelemetry (OTel) standard to enable system administrators to monitor performance, diagnose latency issues, and understand resource utilization within a mainframe environment using industry-standard tools like [Prometheus](https://prometheus.io/), [Grafana](https://grafana.io/), or [Jaeger](https://www.jaegertracing.io/). These anaysis tools make it possible for API ML users to monitor system activity, diagnose issues, and understand service behavior without requiring a specific observability vendor. 
- -:::note -Observability features are available exclusively for the API ML single-service deployment. These features are not supported in the legacy microservice-based architecture of API ML. -::: - -## Resource Attributes - -A **Resource** In OpenTelemetry represents the entity producing telemetry. For Zowe, this is the API ML single-service instance. Every [signal](#telemetry-signals-and-observability) (metric/trace/log) produced carries a set of attributes that identify a specific instance. - -OpenTelemetry resource attributes for the Zowe API ML are organized into three logical groups of attributes: Service, Deployment, and z/OS. This categorization follows the [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/resource/) to ensure that the telemetry produced by Zowe is consistent with industry standards and easily consumable by monitoring backends. - -### Attribute Categories - -* **Service Attributes** -These attributes define the logical identity of your application. The `service.name` allows you to group multiple instances into a single functional view (for example, "North-Region-APIML"). However, you can also use additional attributes like `service.instance.id` or `service.namespace` to distinguish between different installations or individual jobs. Configuring these sub-parameters allows you to monitor the health of the entire API ecosystem while still being able to identify issues within a specific LPAR or geographic site. - -For details about Service Attributes, see [Configuring OpenTelemetry Service Attributes](configuring-otel-service-attributes.md). - -* **Deployment Attributes:** -These attributes describe the lifecycle stage of the service. They allow you to filter telemetry data by environment (e.g., distinguishing production issues from test environment noise). - -For details about Deployment Attributes, see [Configuring OpenTelemetry Deployment Attributes](configuring-otel-deployment-attributes.md). 
- -* **z/OS Attributes** -These attributes provide critical mainframe context. They identify the specific physical and logical environment (LPAR, Sysplex, and OS version) where the process is running, which is essential for mainframe-specific performance analysis. - -For details about z/OS Attributes, see [Configuring OpenTelemetry z/OS Attributes](configuring-otel-zos-attributes.md). - -## Telemetry Signals and Observability - -The API ML produces a range of telemetry data referred to as _signals_. A signal, defined as a discrete stream of telemetry data, is represented by any one of three types: metrics, traces, and logs, which are described in more detail in this section. By default, the OpenTelemetry integration captures performance, health, and interaction signals, which are enriched with the resource attributes configured in your zowe.yaml to provide environmental context. You can also specify where data is exported. Observability is achieved through the combination of telemetry signals, which quantify the real-time state and activity of the system, and resource attributes, which provide the structural labels necessary to organize and interpret those signals. - -Signals can be any of the following signal types: - -* [Metrics](https://opentelemetry.io/docs/concepts/signals/metrics/) (performance tracking) -* [Traces](https://opentelemetry.io/docs/concepts/signals/traces/) (request journeys) -* [Logs](https://opentelemetry.io/docs/concepts/signals/logs/) (event records). - -Each of these signal types represent a specific category of observation from a system. Every signal is automatically enriched based on resource attributes. These attributes act as a common identity, whereby data is categorized into Service (logical identity), Deployment (environment tier), and z/OS (system and hardware context). 
This categorization approach ensures that all telemetry is "mainframe-aware" allowing administrators to filter, group, and correlate data across the entire Sysplex using standard observability tools. - -While these signals are enriched with mainframe-aware context when running on z/OS, API ML can also have full observability when deployed on other platforms such as Linux or within containerized environments. In these non-z/OS scenarios, the discovery engine automatically applies standard OpenTelemetry semantic conventions, capturing metadata like host names. This flexibility ensures that regardless of the underlying infrastructure, the telemetry signals remain consistent and actionable across your monitoring stack. - -:::info How to understand Signals vs Resources - -To better understand the relationship between signals and resources, it is useful to consider the analogy of a Shipping Package and its Label: - -* The **Signal** is the contents of the package. It contains the actual "goods"—the specific data about an event, such as a log message, a trace of a request, or a performance metric. -* The **Resource Attributes** are the shipping label fixed to the outside of the package. The label does not change the contents, but tells you exactly where the package originated (e.g., the specific LPAR, Sysplex, or Service Name). - -Taken together, the Signal provides the evidence of what happened (the "what"), while the Resource Attributes provide the context of where it happened (the "where"). Without the label, the data is just a pile of anonymous packages; with the label, you can immediately sort and filter your data to isolate issues in specific parts of your infrastructure. - -::: - -### Metrics (Runtime Behavior & Health) - - -Metrics provide numerical data used to track trends and trigger alerts. - -* **JVM & System Metrics:** - * **process.runtime.jvm.memory.usage** - Current utilization of heap and non-heap memory. 
- - * **process.runtime.jvm.gc.duration** - Time spent in Garbage Collection, critical for identifying critical pauses. - - * **system.cpu.utilization** - CPU usage percentage for the process and the overall LPAR. - -* **Request Processing Metrics:** - * **apiml.request.count** - A counter of all incoming requests, categorized by `http.method` and `http. - status_code`. - - * **apiml.request.duration** - A histogram measuring the total time spent within the Modulith for each request. - - * **apiml.active.requests** - A gauge showing the current number of concurrent requests being processed. - -:::note -For examples of usability of OpenTelemetry metrics, see [Using your API ML OpenTelemetry metrics](using-your-otel-metrics.md). -::: - - -### Traces (Service Interactions) -Traces record the path of a request as it traverses the API ML. - - - -* **Gateway Spans** -Measures the entry point latency and the time taken to proxy the request to a backend service. - -* **Authentication Spans** -Tracks the duration of security checks (e.g., SAF, JWT validation, or ZSS calls). - -* **Discovery Spans** -Records the time taken to resolve a service ID to a specific physical URL. - -### Logs (System Events) -Logs provide the "why" behind errors or changes in state. - -* **Access Logs** -High-volume logs detailing every request, including the `traceId` for correlation with traces. - -* **Security Logs** -Records of failed authentication attempts or unauthorized access to protected routes. - -* **Lifecycle Logs** -Critical events such as service registration, heartbeat failures, or Modulith startup/shutdown. 
- - - - - From ab193e51a45bdb0412a86e998c0fe2851f7b4f7d Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 28 Jan 2026 15:05:59 +0100 Subject: [PATCH 37/46] fix service.instance.id definition Signed-off-by: Andrew Jandacek --- .../observability/configuring-otel-service-attributes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index 1756702389..5a11dac4f2 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -14,7 +14,7 @@ Identifies the logical name of the service. This property value should be identi Groups services into logical sets. Use this property value to distinguish between different installations, such as sysplex-a vs. sysplex-b, or north-datacenter vs. south-datacenter. `service.name` is expected to be unique within the same `namespace`. * **service.instance.id** (The Unique Instance) -Identifies the specific running process or Address Space. Must be unique for each instance of `service.name` and `service.namespace` pair. On z/OS, this property is typically mapped to the Job Name or a unique UUID. +Identifies a specific running process or Address Space. This value must be globally unique for every instance. Because multiple z/OS systems can run identical Job Names, combine the Job Name with a unique identifier (such as the LPAR name or a UUID) so that the instance can be isolated during troubleshooting.
From a23961adca8366ca65e0a0f2bd166d476e813262 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 28 Jan 2026 15:17:41 +0100 Subject: [PATCH 38/46] refactor from comments Signed-off-by: Andrew Jandacek --- .../observability/configuring-otel-zos-attributes.md | 12 ++---------- .../enabling-observability-in-zowe-yaml.md | 2 +- 2 files changed, 3 insertions(+), 11 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md index b4f253ff7c..87c1f1afef 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -1,21 +1,13 @@ # Configure OpenTelemetry z/OS Attributes + + z/OS-specific resource attributes for API ML provide essential mainframe context to your telemetry data, allowing you to correlate metrics, traces, and logs with specific system identifiers such as SMF IDs, Sysplex names, and LPARs. By providing z/OS platform context, mainframe performance data can be integrated into distributed observability backends. ## How system discovery works The z/OS attributes are primarily populated through an automated System Discovery process that occurs during the initialization of the API ML service. The integrated OpenTelemetry SDK executes platform-specific calls to query z/OS Control Blocks (such as the CVTSNAME or ECVT) and system variables. -This process identifies the current execution environment by retrieving values such as: - -* **Address Space Identifier (ASID):** Mapped to `process.pid`. -* **System Release Level:** Retrieved via `D IPLINFO` and mapped to `os.version`. -* **Job Name:** Mapped to `process.command`. 
- -:::note -If these attributes are manually defined in the `zowe.yaml` configuration file, the discovery engine treats the manual entries as overrides, ensuring that user-defined values take precedence over detected system defaults. -::: - ## z/OS Attribute Reference The following attributes are captured during system discovery to describe the mainframe environment: diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md index 457bc825df..15adb5acbf 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md @@ -54,7 +54,7 @@ zowe: ## How the Export Works -When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers all signals, including JVM heap or request latency, and bundles these signals with all Resource Attributes. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint. +When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers all signals and bundles these signals with all Resource Attributes. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint. :::note If the endpoint is unreachable, API ML logs a warning, but service traffic is not interrupted. It is recommended to use a local OTel collector to minimize network latency. For information about the OTel collector, see [Quick start](https://opentelemetry.io/docs/collector/quick-start/) in the OpenTelemetry documentation. 
From 4065e1d681924843f50c5a03719de86bbee646cf Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 28 Jan 2026 15:20:41 +0100 Subject: [PATCH 39/46] remove note Signed-off-by: Andrew Jandacek --- .../observability/enabling-observability-in-zowe-yaml.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md index 15adb5acbf..997b485e52 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md @@ -56,9 +56,3 @@ zowe: When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers all signals and bundles these signals with all Resource Attributes. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint. -:::note -If the endpoint is unreachable, API ML logs a warning, but service traffic is not interrupted. It is recommended to use a local OTel collector to minimize network latency. For information about the OTel collector, see [Quick start](https://opentelemetry.io/docs/collector/quick-start/) in the OpenTelemetry documentation. - -For the OTel official download, see [OpenTelemetry Collector Releases](https://github.com/open-telemetry/opentelemetry-collector-releases/releases) -For z/OS environments, you would typically look for the Linux on Z versions if running in a containerized environment, or check specific vendor distributions if running natively. 
-::: \ No newline at end of file From ab2dc2d719e810edcdb67d06e74e99feadc2e77c Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 28 Jan 2026 15:28:20 +0100 Subject: [PATCH 40/46] remove otel overview from sidebar to fix build Signed-off-by: Andrew Jandacek --- sidebars.js | 1 - 1 file changed, 1 deletion(-) diff --git a/sidebars.js b/sidebars.js index 712d142792..947f8f9cb0 100644 --- a/sidebars.js +++ b/sidebars.js @@ -338,7 +338,6 @@ module.exports = { { "type": "category", "label": "Configuring API ML Observability via OpenTelemetry", - "link": { "type": "doc", "id": "user-guide/api-mediation/observability/configuring-apiml-observability-via-opentelemetry" }, "items": [ "user-guide/api-mediation/observability/configuring-otel-service-attributes", "user-guide/api-mediation/observability/configuring-otel-deployment-attributes", From cee746c0c19ddc841a2e337573a05b40530908e6 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Wed, 28 Jan 2026 15:40:14 +0100 Subject: [PATCH 41/46] remove broken link to overview article no longer in this PR Signed-off-by: Andrew Jandacek --- .../api-mediation/observability/observability-outline.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md index 04a9407a56..78f136470c 100644 --- a/docs/user-guide/api-mediation/observability/observability-outline.md +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -2,7 +2,7 @@ The following files will be presented under Advanced server-side configuration under the **Install** tab: -* [Configuring API ML Observability via OpenTelemetry](configuring-apiml-observability-via-opentelemetry.md) +* Configuring API ML Observability via OpenTelemetry * [Configuring OpenTelemetry service attributes](configuring-otel-service-attributes.md) * [Configuring OpenTelemetry deployment 
attributes](configuring-otel-deployment-attributes.md) * [Configuring OpenTelemetry z/OS attributes](configuring-otel-zos-attributes.md) @@ -12,4 +12,4 @@ The following files will be presented under Using Zowe API Mediation Layer under * [Using your API ML OpenTelemetry metrics](using-your-otel-metrics.md) * [API ML Provided Observability Signals and Attributes](apiml-provided-observability-signals-and-attributes.md) - * [Sample Output from API ML OpenTelemetry](sample-output-from-apiml-otel.md) \ No newline at end of file + * [Sample Output from API ML OpenTelemetry](sample-output-from-apiml-otel.md) \ No newline at end of file From ce7dc2dc823559499c15f5c7df85e9d22a801552 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Thu, 29 Jan 2026 11:21:40 +0100 Subject: [PATCH 42/46] improve how discovery service works description Signed-off-by: Andrew Jandacek --- .../configuring-otel-service-attributes.md | 2 +- .../observability/configuring-otel-zos-attributes.md | 12 ++++++++++++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md index 5a11dac4f2..8cf2b441ac 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -1,6 +1,6 @@ # Configure OpenTelemetry Service Attributes -Services are identified via the service.name, service.namespace, and service.instance.id properties. Together, these attributes create a unique identity for API ML instances across your enterprise. +Services are identified via the `service.name`, `service.namespace`, and `service.instance.id` properties. Together, these attributes create a unique identity for API ML instances across your enterprise. 
In complex mainframe environments, you may have multiple API ML installations across different Sysplexes or data centers. To monitor these effectively, you must balance Logical Grouping (viewing all API ML traffic as one functional unit) with Instance Differentiation (identifying exactly which specific Address Space is experiencing an issue). diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md index 87c1f1afef..f4f400c880 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -6,6 +6,18 @@ z/OS-specific resource attributes for API ML provide essential mainframe context ## How system discovery works +System Discovery is the automated process by which API ML identifies its own physical and logical environment. Instead of requiring a system administrator to manually enter details for every instance, the software performs an internal "inventory" check at startup to populate its identity. + +The attributes are populated through a coordinated effort between the OpenTelemetry (OTel) SDK and the Zowe Discovery Service: + +* **The OTel SDK** (The "Gatherer") +As part of the API ML single-service instance, the SDK executes platform-specific calls at initialization. The SDK queries z/OS Control Blocks (memory structures used by the operating system, such as the CVT or ECVT) to retrieve the identity of the system, and also captures the Address Space ID (ASID) and Job Name. + +* **The Discovery Service** (The "Provider") +While the OTel SDK gathers low-level operating system data, the SDK queries the Zowe Discovery Service to retrieve and map specific service instance metadata, such as registered service IDs and status, directly into the OpenTelemetry resource attributes. 
This ensures that the identity reported in your telemetry matches the identity used for service registration and routing within API ML. + +By the time the API ML is ready to process its first request, the system discovery process has already enriched the service with its identity — the unique combination of service name, location, and z/OS system data that distinguishes this instance. This automation ensures every telemetry signal is accurately tagged with the following z/OS attributes without manual intervention: + The z/OS attributes are primarily populated through an automated System Discovery process that occurs during the initialization of the API ML service. The integrated OpenTelemetry SDK executes platform-specific calls to query z/OS Control Blocks (such as the CVTSNAME or ECVT) and system variables. ## z/OS Attribute Reference From 98ccea62a996c9234a435b067211b6fadf55a865 Mon Sep 17 00:00:00 2001 From: Andrew Jandacek Date: Mon, 2 Feb 2026 11:07:52 +0100 Subject: [PATCH 43/46] address Richard's comments Signed-off-by: Andrew Jandacek --- .../configuring-otel-deployment-attributes.md | 2 +- .../configuring-otel-zos-attributes.md | 2 +- .../enabling-observability-in-zowe-yaml.md | 10 ++++------ .../observability/using-your-otel-metrics.md | 17 +---------------- 4 files changed, 7 insertions(+), 24 deletions(-) diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md index 1f9bcaab9e..c482d38f12 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md @@ -2,7 +2,7 @@ To configure deployment-specific resource attributes for the Zowe API ML. 
These attributes allow you to categorize telemetry data based on the lifecycle stage of the application, such as distinguishing between production, staging, or development environments. -Unlike z/OS attributes which are often discovered automatically, deployment attributes are strictly informative and are typically defined manually. These attributes do not affect the unique identity of the service but are essential for filtering and grouping data within your observability backend. By explicitly labeling your environment, you ensure that performance anomalies in a test environment do not trigger false alerts in production monitoring views. +While platform-specific attributes (like those for z/OS) focus on the execution environment and are often discovered automatically, deployment attributes are strictly informative and describe the logical purpose of the instance. Deployment attributes are defined manually and are universal across all platforms where API ML runs (z/OS, Linux, or Containers). These attributes do not affect the unique identity of the service but are essential for filtering and grouping data within your observability backend. By explicitly labeling your environment, you ensure that performance anomalies in a test environment do not trigger false alerts in production monitoring views. ## Deployment Attribute Reference diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md index f4f400c880..5a4c870b4c 100644 --- a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md +++ b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -49,7 +49,7 @@ The command or JOB name used to launch the Zowe process. Configuration Source: System discovery * **process.pid** -The Process Identifier, which on z/OS is set to the Address Space Identifier (ASID). +The Process Identifier. 
For details about this property, see [Process Attributes](https://opentelemetry.io/docs/specs/semconv/registry/attributes/process/) in the OpenTelemetry documentation. Configuration Source: System discovery ## Overriding Discovered Attributes in zowe.yaml diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md index 997b485e52..e289a22e4c 100644 --- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md @@ -7,17 +7,15 @@ Review how to enable and configure the OpenTelemetry (OTel) integration within t The observability configuration is located under the API Mediation Layer `component` section of the zowe.yaml, under which there are three observability properties: * **enabled** - Activates the OTel SDK. Set to `true` to initialize the OpenTelemetry SDK. + Activates the OTel SDK. Set to `true` to initialize the OpenTelemetry SDK to enable observability. * **exporter** -Defines where the data is sent. Sub-properties of `exporter` include the following: +Defines where the data is sent. `exporter` has the following sub-property: * **exporter.otlp.protocol** - The URL of your OTLP-compatible collector (e.g., z-Iris or Jaeger) - - * **exporter.otlp.protocol** - The protocol is either `grpc` or `http/protobuf`. + The transport protocol used to transmit telemetry data. Options include `grpc` for high-performance streaming or `http/protobuf` for standard web compatibility. **Default:** `grpc` + * **resource** Defines the identity of the producer (Attributes).
diff --git a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md index a555bbbd3a..f8df6df94e 100644 --- a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md +++ b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md @@ -4,19 +4,4 @@ How a system administrator interacts with this data depends on the visualization tool used (e.g., Grafana, Jaeger, or Broadcom WatchTower). -### Example 1: High-Level Health Monitoring (Metrics) -A system administrator views a Grafana dashboard. The administrator notices a spike in **`apiml.request.errors`**. -* **The View**: A red line graph shows a sudden jump from 0% to 15% error rate. -* **The Insight**: By filtering the dashboard using the attribute **`zos.smf.id`**, the admin realizes the errors are only occurring on **LPAR1**, while **LPAR2** remains healthy. This suggests a local configuration or connectivity issue on a specific system rather than a global software bug. - - -### Example 2: Latency Troubleshooting (Traces) -A user reports that a specific API is "timing out." The admin finds the relevant **`traceId`** in the logs and opens it in a trace viewer. -* **The View**: A "Gantt chart" style visualization of the request. -* **The Insight**: - * `apiml.gateway.total`: 2005ms - * `apiml.auth.check`: 5ms - * `apiml.backend.proxy`: 2000ms -* **The Action**: The admin sees that the Modulith itself only spent 5ms on logic, but waited 2 seconds for the backend mainframe service to respond. The admin can now confidently contact the specific backend service team. 
-
-
+
\ No newline at end of file
From 6add9b6eb66afadd2026310c85c100fb0af31364 Mon Sep 17 00:00:00 2001
From: Andrew Jandacek
Date: Mon, 2 Feb 2026 11:31:48 +0100
Subject: [PATCH 44/46] add validation procedure

Signed-off-by: Andrew Jandacek
---
 .../enabling-observability-in-zowe-yaml.md | 30 +++++++++++++++----
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md
index e289a22e4c..de68629a56 100644
--- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md
+++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md
@@ -10,12 +10,7 @@ The observability configuration is located under the API Mediation Layer `compon
 Activates the OTel SDK. Set to `true` to initialize the OpenTelemetry SDK to enable observability.

 * **exporter**
-Defines where the data is sent. `exporter` has the following sub-property:
-
- * **exporter.otlp.protocol**
- The transport protocol used to transmit telemetry data. Options include `grpc` for high-performance streaming or `http/protobuf` for standard web compatibility.
- **Default:** `grcp`
-
+Defines where the data is sent.
 * **resource**
 Defines the identity of the producer (Attributes).

@@ -54,3 +49,26 @@ zowe:

 When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers all signals and bundles these signals with all Resource Attributes. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint.

+## Validating the Configuration
+
+After applying the changes to zowe.yaml and restarting the API Mediation Layer, verify that the OpenTelemetry integration is active and communicating with your collector.
+
+1. Check the API ML Startup Logs.
+Review the job logs for the API ML service. Upon successful initialization with observability enabled, look for messages indicating the OpenTelemetry SDK has started.
+
+To confirm successful initialization, review the log entries which confirm that the OTLP exporter has initialized and is attempting to connect to the specified endpoint. If the endpoint is unreachable or the protocol is mismatched, the logs will typically show Exporting failed or Connection refused messages from the OTel SDK.
+
+2. Verify Signal Reception in your Observability Tool.
+The most definitive validation is to confirm that data is appearing in your chosen observability backend:
+
+ a. Search by Service Name.
+ In your monitoring tool's UI, look for the value you defined in `service.name` (e.g., zowe-apiml).
+
+ b. Filter by Namespace.
+ If you have multiple installations, use the `service.namespace` filter to isolate data from this specific instance.
+
+3. Confirm Attributes.
+Select a trace or metric and verify that the Resource Attributes (such as `zos.smf.id` or `mainframe.lpar.name`) are correctly attached.
+
+4. Use the Collector's Logging (Optional).
+If data is not appearing in the backend, check the logs of your OpenTelemetry Collector. If the collector is configured with the logging or debug exporter, you will see raw incoming "Export" requests from the API ML's IP address.
\ No newline at end of file
From ce359dcf9d1240a46ae28b9dc0751e92ed2499fa Mon Sep 17 00:00:00 2001
From: Andrew Jandacek
Date: Mon, 2 Feb 2026 13:47:01 +0100
Subject: [PATCH 45/46] fix typo

Signed-off-by: Andrew Jandacek
---
 .../api-mediation/observability/using-your-otel-metrics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md
index f8df6df94e..416e712123 100644
--- a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md
+++ b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md
@@ -1,6 +1,6 @@
 # Using Your API ML OpenTelemetry Metrics

-## Examples of Useability of Telemetry data in API ML
+## Examples of Usability of Telemetry data in API ML

 How a system administrator interacts with this data depends on the visualization tool used (e.g., Grafana, Jaeger, or Broadcom WatchTower).

From d312a9b6d01acf27a30b088058038b9b533b6606 Mon Sep 17 00:00:00 2001
From: Andrew Jandacek
Date: Mon, 2 Feb 2026 14:54:55 +0100
Subject: [PATCH 46/46] fix formatting

Signed-off-by: Andrew Jandacek
---
 .../observability/enabling-observability-in-zowe-yaml.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md
index de68629a56..aa07e84f49 100644
--- a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md
+++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md
@@ -56,7 +56,7 @@ After applying the changes to zowe.yaml and restarting the API Mediation Layer,
 1. Check the API ML Startup Logs.
 Review the job logs for the API ML service. Upon successful initialization with observability enabled, look for messages indicating the OpenTelemetry SDK has started.

-To confirm successful initialization, review the log entries which confirm that the OTLP exporter has initialized and is attempting to connect to the specified endpoint. If the endpoint is unreachable or the protocol is mismatched, the logs will typically show Exporting failed or Connection refused messages from the OTel SDK.
+ To confirm successful initialization, review the log entries which confirm that the OTLP exporter has initialized and is attempting to connect to the specified endpoint. If the endpoint is unreachable or the protocol is mismatched, the logs will typically show Exporting failed or Connection refused messages from the OTel SDK.

 2. Verify Signal Reception in your Observability Tool.
 The most definitive validation is to confirm that data is appearing in your chosen observability backend: