diff --git a/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md new file mode 100644 index 0000000000..a734f23f1a --- /dev/null +++ b/docs/user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes.md @@ -0,0 +1,16 @@ +# API ML Provided Observability Signals and Attributes + +**TODO: Dev to provide Actual Signals and Attributes** + + + +## Custom Telemetry Template +Use this template when requesting or defining new custom telemetry signals for the API ML: + +* **Signal Type**: (Metric / Trace / Log) +* **Name**: `zowe.apiml.[component].[functional_area]` +* **Description**: What does this signal represent? +* **Required Attributes**: + * `route.id`: Identifier of the routed service. + * `client.id`: (Optional) The ID of the consuming application. + * `zos.smf.id`: Automatically inherited from Resource. \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md new file mode 100644 index 0000000000..c482d38f12 --- /dev/null +++ b/docs/user-guide/api-mediation/observability/configuring-otel-deployment-attributes.md @@ -0,0 +1,26 @@ +# Configuring OpenTelemetry Deployment Attributes + +Review how to configure deployment-specific resource attributes for the Zowe API ML. These attributes allow you to categorize telemetry data based on the lifecycle stage of the application, such as distinguishing between production, staging, or development environments. + +While platform-specific attributes (like those for z/OS) focus on the execution environment and are often discovered automatically, deployment attributes are strictly informative and describe the logical purpose of the instance. 
Deployment attributes are defined manually and are universal across all platforms where API ML runs (z/OS, Linux, or containers). These attributes do not affect the unique identity of the service but are essential for filtering and grouping data within your observability backend. By explicitly labeling your environment, you can keep performance anomalies in a test environment from triggering false alerts in production monitoring views. + +## Deployment Attribute Reference + +The following attribute describes the single-service deployment of API ML: + +* **deployment.environment.name** + Specifies the name of the deployment environment (Example: dev, test, staging, or production). Configuration Source: zowe.yaml + +## Configuration Example in zowe.yaml + +To set the deployment environment, add the `deployment.environment.name` key to the `resource.attributes` section of your zowe.yaml file. + +``` +zowe: + observability: + enabled: true + resource: + attributes: + # Deployment Attribute (Manual Entry) + deployment.environment.name: "production" +``` \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md new file mode 100644 index 0000000000..8cf2b441ac --- /dev/null +++ b/docs/user-guide/api-mediation/observability/configuring-otel-service-attributes.md @@ -0,0 +1,96 @@ +# Configuring OpenTelemetry Service Attributes + +Services are identified via the `service.name`, `service.namespace`, and `service.instance.id` properties. Together, these attributes create a unique identity for API ML instances across your enterprise. + +In complex mainframe environments, you may have multiple API ML installations across different Sysplexes or data centers. 
To monitor these effectively, you must balance Logical Grouping (viewing all API ML traffic as one functional unit) with Instance Differentiation (identifying exactly which specific Address Space is experiencing an issue). + +## The Hierarchy of Identification +OpenTelemetry uses a three-tier approach to define service identity: + +* **service.name** (The Service) +Identifies the logical name of the service. This property value should be identical for all instances across your entire organization that perform the same function (e.g., zowe-apiml). Expected to be globally unique if `service.namespace` is not defined. + +* **service.namespace** (The Environment/Site) +Groups services into logical sets. Use this property value to distinguish between different installations, such as sysplex-a vs. sysplex-b, or north-datacenter vs. south-datacenter. `service.name` is expected to be unique within the same `service.namespace`. + +* **service.instance.id** (The Unique Instance) +Identifies a specific running process or Address Space. This value must be globally unique for every instance. Because multiple z/OS systems can run identical Job Names, combine the Job Name with a unique identifier (such as the LPAR name or a UUID) so that the instance can be isolated during troubleshooting. + + + +## Configuration Examples + +**Example 1: Single API ML Installation (High Availability)** + +In this scenario, both instances share the same namespace because they belong to the same logical cluster on the same Sysplex. 
+ +| Attribute | Instance 1 | Instance 2 | +| :--- | :--- | :--- | +| **service.name** | `zowe-apiml` | `zowe-apiml` | +| **service.namespace** | `production-plex` | `production-plex` | +| **service.instance.id** | `APIML01` | `APIML02` | + +**Instance 1 configuration** +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "production-plex" + service.instance.id: "APIML01" +``` +**Instance 2 configuration** +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "production-plex" + service.instance.id: "APIML02" +``` + +**Example 2: Multi-Site Deployment** + +In this scenario, instances are separated by namespace to represent their physical data center locations. + +| Attribute | Site 1 (Instance A) | Site 1 (Instance B) | Site 2 (Instance C) | +| :--- | :--- | :--- | :--- | +| **service.name** | `zowe-apiml` | `zowe-apiml` | `zowe-apiml` | +| **service.namespace** | `east-coast` | `east-coast` | `west-coast` | +| **service.instance.id** | `ZOWE-E1` | `ZOWE-E2` | `ZOWE-W1` | + +**Site 1 (East Coast), Instance A configuration** + +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "east-coast" + service.instance.id: "ZOWE-E1" +``` +**Site 2 (West Coast), Instance C configuration** +``` +zowe: + components: + api-mediation-layer: + observability: + enabled: true + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "west-coast" + service.instance.id: "ZOWE-W1" +``` diff --git a/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md new file mode 100644 index 0000000000..5a4c870b4c --- /dev/null +++ 
b/docs/user-guide/api-mediation/observability/configuring-otel-zos-attributes.md @@ -0,0 +1,69 @@ +# Configuring OpenTelemetry z/OS Attributes + + + +z/OS-specific resource attributes for API ML provide essential mainframe context to your telemetry data, allowing you to correlate metrics, traces, and logs with specific system identifiers such as SMF IDs, Sysplex names, and LPARs. By providing z/OS platform context, mainframe performance data can be integrated into distributed observability backends. + +## How system discovery works + +System Discovery is the automated process by which API ML identifies its own physical and logical environment. Instead of requiring a system administrator to manually enter details for every instance, the software performs an internal "inventory" check at startup to populate its identity. + +The attributes are populated through a coordinated effort between the OpenTelemetry (OTel) SDK and the Zowe Discovery Service: + +* **The OTel SDK** (The "Gatherer") +As part of the API ML single-service instance, the SDK executes platform-specific calls at initialization. The SDK queries z/OS Control Blocks (memory structures used by the operating system, such as the CVT or ECVT) to retrieve the identity of the system, and also captures the Address Space ID (ASID) and Job Name. + +* **The Discovery Service** (The "Provider") +While the OTel SDK gathers low-level operating system data, the SDK queries the Zowe Discovery Service to retrieve and map specific service instance metadata, such as registered service IDs and status, directly into the OpenTelemetry resource attributes. This ensures that the identity reported in your telemetry matches the identity used for service registration and routing within API ML. 
+ +By the time the API ML is ready to process its first request, the system discovery process has already enriched the service with its identity: the unique combination of service name, location, and z/OS system data that distinguishes this instance. This automation ensures that every telemetry signal is tagged with the z/OS attributes listed in the following reference, without manual intervention. + +The z/OS attributes are primarily populated through an automated System Discovery process that occurs during the initialization of the API ML service. The integrated OpenTelemetry SDK executes platform-specific calls to query z/OS Control Blocks (such as the CVT or ECVT) and system variables. + +## z/OS Attribute Reference + +The following attributes are captured during system discovery to describe the mainframe environment: + +* **zos.smf.id** +The System Management Facility (SMF) Identifier that uniquely identifies a z/OS system within a SYSPLEX. +Configuration Source: System discovery + +* **zos.sysplex.name** +The name of the SYSPLEX to which the z/OS system belongs. +Configuration Source: System discovery + +* **mainframe.lpar.name** +Name of the LPAR that hosts the z/OS system. +Configuration Source: System discovery + +* **os.type** +The operating system type, set to `zos`. +Configuration Source: Static + +* **os.version** +The version string of the operating system (e.g., the release returned by `D IPLINFO`). +Configuration Source: System discovery + +* **process.command** +The command or JOB name used to launch the Zowe process. +Configuration Source: System discovery + +* **process.pid** +The Process Identifier. For details about this property, see [Process Attributes](https://opentelemetry.io/docs/specs/semconv/registry/attributes/process/) in the OpenTelemetry documentation. 
+Configuration Source: System discovery + +## Overriding Discovered Attributes in zowe.yaml + +While the discovery process handles most identifiers automatically, you may occasionally need to provide a manual override (for example, in shared environments where you wish to report a custom logical LPAR name). This is performed in the `resource.attributes` section of your zowe.yaml: + +``` +zowe: + observability: + enabled: true + resource: + attributes: + # Overriding discovered z/OS attributes + zos.smf.id: "MVS1" + zos.sysplex.name: "LOCALPLX" + mainframe.lpar.name: "PRODLPAR" +``` diff --git a/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md new file mode 100644 index 0000000000..aa07e84f49 --- /dev/null +++ b/docs/user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml.md @@ -0,0 +1,74 @@ +# Enabling API ML Observability in zowe.yaml + +Review how to enable and configure the OpenTelemetry (OTel) integration within the Zowe API Mediation Layer (API ML) single-service deployment. Configure these parameters in `zowe.yaml` to enable API ML to export metrics, traces, and logs to an OpenTelemetry Collector. + +## Configuration Overview + +The observability configuration is located under the API Mediation Layer `component` section of the zowe.yaml, under which there are three observability properties: + +* **enabled** + Activates the OTel SDK. Set to `true` to initialize the OpenTelemetry SDK to enable observability. + +* **exporter** +Defines where the data is sent. + +* **resource** +Defines the identity of the producer (Attributes). + + * **resource.attributes** + A collection of key-value pairs used to identify the telemetry source. See the following sub-properties of `resource.attributes`: + + * **service.name** + Logical name of the service. Must be the same for all instances within the same HA deployment. 
Expected to be globally unique if `namespace` is not defined. + + * **service.namespace** + The assigned value should help distinguish a group of services, such as the LPAR, or owner team. `service.name` is expected to be unique within the same `namespace`. + + * **deployment.environment.name** + Specifies the name of the deployment environment (Example: dev, test, staging, or production). Configuration Source: zowe.yaml + +To enable observability, configure the OpenTelemetry exporter and resource attributes within your `zowe.yaml` file with the following structure: + +``` +zowe: + observability: + enabled: true + exporter: + otlp: + endpoint: "http://otel-collector.your.domain:4317" + protocol: "grpc" + timeout: 10000 + resource: + attributes: + service.name: "zowe-apiml" + service.namespace: "finance-production" + deployment.environment.name: "production" +``` + +## How the Export Works + +When `enabled: true` is set, the API ML single-service starts a background telemetry engine. This engine gathers all signals and bundles these signals with all Resource Attributes. These bundles are then pushed by means of the OTLP Exporter to your specified endpoint. + +## Validating the Configuration + +After applying the changes to zowe.yaml and restarting the API Mediation Layer, verify that the OpenTelemetry integration is active and communicating with your collector. + +1. Check the API ML Startup Logs. +Review the job logs for the API ML service. Upon successful initialization with observability enabled, look for messages indicating the OpenTelemetry SDK has started. + + To confirm successful initialization, review the log entries which confirm that the OTLP exporter has initialized and is attempting to connect to the specified endpoint. If the endpoint is unreachable or the protocol is mismatched, the logs will typically show Exporting failed or Connection refused messages from the OTel SDK. + +2. Verify Signal Reception in your Observability Tool. 
+The most definitive validation is to confirm that data is appearing in your chosen observability backend: + + a. Search by Service Name. + In your monitoring tool's UI, look for the value you defined in `service.name` (e.g., zowe-apiml). + + b. Filter by Namespace. + If you have multiple installations, use the `service.namespace` filter to isolate data from this specific installation. + +3. Confirm Attributes. +Select a trace or metric and verify that the Resource Attributes (such as `zos.smf.id` or `mainframe.lpar.name`) are correctly attached. + +4. Use the Collector's Logging (Optional). +If data is not appearing in the backend, check the logs of your OpenTelemetry Collector. If the collector is configured with the logging or debug exporter, you will see raw incoming "Export" requests from the API ML's IP address. \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/observability-outline.md b/docs/user-guide/api-mediation/observability/observability-outline.md new file mode 100644 index 0000000000..78f136470c --- /dev/null +++ b/docs/user-guide/api-mediation/observability/observability-outline.md @@ -0,0 +1,15 @@ +# Outline of API ML Observability Topics + +The following files will be presented under Advanced server-side configuration under the **Install** tab: + +* Configuring API ML Observability via OpenTelemetry + * [Configuring OpenTelemetry service attributes](configuring-otel-service-attributes.md) + * [Configuring OpenTelemetry deployment attributes](configuring-otel-deployment-attributes.md) + * [Configuring OpenTelemetry z/OS attributes](configuring-otel-zos-attributes.md) + * [Enabling Observability in zowe.yaml](enabling-observability-in-zowe-yaml.md) + +The following files will be presented under Using Zowe API Mediation Layer under the **Use** tab: + +* [Using your API ML OpenTelemetry metrics](using-your-otel-metrics.md) + * [API ML Provided Observability Signals and 
Attributes](apiml-provided-observability-signals-and-attributes.md) + * [Sample Output from API ML OpenTelemetry](sample-output-from-apiml-otel.md) \ No newline at end of file diff --git a/docs/user-guide/api-mediation/observability/sample-output-from-apiml-otel.md b/docs/user-guide/api-mediation/observability/sample-output-from-apiml-otel.md new file mode 100644 index 0000000000..23372cdb9b --- /dev/null +++ b/docs/user-guide/api-mediation/observability/sample-output-from-apiml-otel.md @@ -0,0 +1,2 @@ +# Sample Output from API ML OpenTelemetry + diff --git a/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md new file mode 100644 index 0000000000..416e712123 --- /dev/null +++ b/docs/user-guide/api-mediation/observability/using-your-otel-metrics.md @@ -0,0 +1,7 @@ +# Using Your API ML OpenTelemetry Metrics + +## Examples of Using Telemetry Data in API ML + +How a system administrator interacts with this data depends on the visualization tool used (e.g., Grafana, Jaeger, or Broadcom WatchTower). 
+ + \ No newline at end of file diff --git a/sidebars.js b/sidebars.js index c77fe6a95c..947f8f9cb0 100644 --- a/sidebars.js +++ b/sidebars.js @@ -335,6 +335,16 @@ module.exports = { "extend/extend-apiml/api-mediation-redis" ] }, + { + "type": "category", + "label": "Configuring API ML Observability via OpenTelemetry", + "items": [ + "user-guide/api-mediation/observability/configuring-otel-service-attributes", + "user-guide/api-mediation/observability/configuring-otel-deployment-attributes", + "user-guide/api-mediation/observability/configuring-otel-zos-attributes", + "user-guide/api-mediation/observability/enabling-observability-in-zowe-yaml" + ] + }, "user-guide/api-mediation/configuration-customizing-the-api-catalog-ui", "user-guide/api-mediation/configuration-logging", "user-guide/api-mediation/wto-message-on-startup", @@ -527,6 +537,15 @@ module.exports = { "user-guide/api-mediation-change-password-via-catalog", ], }, + { + type: "category", + label: "Using your API ML OpenTelemetry metrics", + link: { "type": "doc", "id": "user-guide/api-mediation/observability/using-your-otel-metrics" }, + items: [ + "user-guide/api-mediation/observability/apiml-provided-observability-signals-and-attributes", + "user-guide/api-mediation/observability/sample-output-from-apiml-otel" + ], + }, "user-guide/api-mediation/api-mediation-update-password", "user-guide/api-mediation/api-mediation-smf", ],