RFC: OpenSearch SQL/PPL Telemetry Integration #5300

@penghuo


Problem Statement

The OpenSearch SQL/PPL plugin has no integration with OpenSearch's core telemetry framework. There is no distributed tracing for query execution, making it difficult to diagnose latency issues across the parse → optimize → compile → execute pipeline. The existing metrics implementation uses a custom Metrics singleton with BasicCounter/RollingCounter exposed via /_plugins/_ppl/stats and /_plugins/_sql/stats, which is disconnected from the standard OpenSearch telemetry export pipeline (OTel SDK → OTLP → observability backends).

Goals

  • P0: Add distributed tracing spans to SQL/PPL query execution pipeline
  • P1: Migrate existing custom metrics to OpenSearch's MetricsRegistry (OTel-backed)

Non-Goals

  • Changing the telemetry framework itself
  • Adding telemetry to the sandbox analytics-engine (Calcite-based prototype)

Background

OpenSearch Telemetry Framework

OpenSearch provides a backend-agnostic telemetry framework:

  • libs/telemetry/ — interfaces: Tracer, Span, SpanScope, MetricsRegistry, Counter, Histogram
  • server/ — wiring: TelemetryModule, TracerFactory, WrappedTracer, SpanBuilder
  • plugins/telemetry-otel/ — OTel SDK implementation that exports via BatchSpanProcessor (traces) and PeriodicMetricReader (metrics)

Plugins access the framework by implementing TelemetryAwarePlugin, which injects Tracer and MetricsRegistry via createComponents().

Current SQL/PPL Metrics

The SQL plugin (opensearch-project/sql) uses a custom metrics system:

| Component | Description |
|---|---|
| `Metrics` singleton | Global metrics registry accessed via `getInstance()` |
| `MetricName` enum | `REQ_TOTAL`, `REQ_COUNT_TOTAL`, `FAILED_REQ_COUNT_*`, `PPL_REQ_TOTAL`, `PPL_REQ_COUNT_TOTAL`, etc. |
| `BasicCounter` / `RollingCounter` | Custom counter implementations |
| `RestSQLStatsAction` / `RestPPLStatsAction` | Expose metrics via `/_plugins/_sql/stats` and `/_plugins/_ppl/stats` |

These metrics are not exported to OTel backends and cannot be correlated with OpenSearch server-level telemetry.

SQL/PPL Query Execution Pipeline

PPL text
  → Parse (PPLSyntaxParser → UnresolvedPlan AST)
  → Analyze (CalciteRelNodeVisitor → RelNode logical plan)
  → Push-down Optimize (PushDownPlanner → mixed plan with boundary nodes)
  → Compile (OpenSearchQueryCompiler → PreparedStatement)
  → Execute (PreparedStatement.executeQuery() → ResultSet)
  → Materialize (ResultSet → PPLResponse)

P0: Distributed Tracing

Interface Change

The SQL plugin class must implement TelemetryAwarePlugin:

```java
public class SQLPlugin extends Plugin implements ScriptPlugin,
    ActionPlugin, TelemetryAwarePlugin {

    private Tracer tracer;
    private MetricsRegistry metricsRegistry;

    @Override
    public Collection<Object> createComponents(
        ...,
        Tracer tracer,
        MetricsRegistry metricsRegistry
    ) {
        this.tracer = tracer;
        this.metricsRegistry = metricsRegistry;
        // pass to internal components
    }
}
```

Note: TelemetryAwarePlugin is annotated @ExperimentalApi — the interface may evolve.

Span Hierarchy

Root span + 6 child spans per query (7 total):

sql/ppl.query                          ← root span, SpanKind.SERVER (TransportAction.doExecute)
  ├── sql/ppl.parse                    ← INTERNAL: PPLSyntaxParser + AST build
  ├── sql/ppl.analyze                  ← INTERNAL: CalciteRelNodeVisitor → logical plan
  ├── sql/ppl.optimize                 ← INTERNAL: PushDownPlanner push-down rules
  ├── sql/ppl.compile                  ← INTERNAL: OpenSearchQueryCompiler → PreparedStatement
  ├── sql/ppl.execute                  ← INTERNAL: PreparedStatement.executeQuery()
  │     └── opensearch.search          ← child spans from OpenSearch search (automatic)
  └── sql/ppl.materialize              ← INTERNAL: ResultSet → response

Only the root span uses SpanKind.SERVER (it represents a received request). All phase spans use SpanKind.INTERNAL — they are in-process operations, not network boundaries.

Span Attributes

| Attribute | Example | Span |
|---|---|---|
| `db.query.text` | `source=logs \| where ... \| stats count() by host` (sanitized — literals/values stripped) | `sql/ppl.query` |
| `db.query.type` | `ppl` or `sql` | `sql/ppl.query` |
| `db.query.id` | `a1b2c3d4` | `sql/ppl.query` |
| `db.collection.name` | `logs` | `sql/ppl.query` |
| `db.query.datasource` | `prometheus`, `s3`, `opensearch` | `sql/ppl.query` |
| `ppl.plan.node_count` | `7` | `sql/ppl.optimize` |
| `ppl.plan.pushed_down` | `filter,aggregation` | `sql/ppl.optimize` |
| `ppl.execute.rows` | `1024` | `sql/ppl.materialize` |
| `ppl.cache.hit` | `true` / `false` | `sql/ppl.compile` |
| `error` | `true` | any span on failure |
| `error.type` | `SemanticCheckException` | any span on failure |

Attribute naming follows OTel semantic conventions for database spans where applicable (db.* prefix).
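The `db.query.text` attribute must never carry raw user values. A minimal sketch of what such a sanitizer could look like, as a plain regex pass over the query string — the class name, patterns, and approach are illustrative assumptions; the real plugin would more likely rebuild the text from the parsed AST with literals replaced:

```java
import java.util.regex.Pattern;

// Hypothetical sanitizer for the db.query.text attribute: strips string and
// numeric literals so the recorded query keeps its shape but not its data.
public class QuerySanitizer {
    private static final Pattern STRING_LITERAL = Pattern.compile("'[^']*'|\"[^\"]*\"");
    private static final Pattern NUMBER_LITERAL = Pattern.compile("\\b\\d+(\\.\\d+)?\\b");

    static String sanitize(String query) {
        // Replace quoted strings first so their digits are not matched below.
        String s = STRING_LITERAL.matcher(query).replaceAll("?");
        return NUMBER_LITERAL.matcher(s).replaceAll("?");
    }

    public static void main(String[] args) {
        System.out.println(sanitize(
            "source=logs | where status = 500 and host = 'web-1' | stats count() by host"));
        // prints: source=logs | where status = ? and host = ? | stats count() by host
    }
}
```

A regex pass like this is lossy (it cannot distinguish identifiers containing digits from literals), which is why AST-level sanitization is the safer long-term choice.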

Implementation Pattern

Follow TransportSearchAction pattern:

```java
// TransportPPLQueryAction.doExecute()
Span span = tracer.startSpan(
    SpanCreationContext.server().name("sql/ppl.query")
        .attributes(Attributes.create()
            .addAttribute("db.query.type", "ppl")
            .addAttribute("db.query.text", sanitize(request.getQuery())))
);
try (SpanScope scope = tracer.withSpanInScope(span)) {
    ActionListener<PPLQueryResponse> tracedListener =
        TraceableActionListener.create(listener, span, tracer);
    queryService.execute(request, tracedListener);
} catch (Exception e) {
    span.setError(e);
    span.endSpan();
    listener.onFailure(e);
}
```

Phase-level spans inside UnifiedQueryService.execute():

// Each phase follows this pattern
```java
// Each phase follows this pattern
Span parseSpan = tracer.startSpan(
    SpanCreationContext.internal().name("sql/ppl.parse"));
try (SpanScope s = tracer.withSpanInScope(parseSpan)) {
    ast = parser.parse(query);
} catch (Exception e) {
    parseSpan.setError(e);
    throw e;
} finally {
    parseSpan.endSpan();
}
```

Graceful Degradation

When telemetry is disabled (default), Tracer is NoopTracer — all span operations are no-ops with near-zero overhead. No conditional checks needed in application code.

Feature Flag Dependency

The telemetry framework is gated behind FeatureFlags.TELEMETRY, which defaults to false. When disabled, TelemetryAwarePlugin.createComponents() is not called — the plugin falls back to the base Plugin.createComponents() path which does not receive Tracer or MetricsRegistry.

The SQL plugin must handle both paths:

  • Implement both Plugin.createComponents() and TelemetryAwarePlugin.createComponents()
  • When telemetry flag is off, default to NoopTracer and NoopMetricsRegistry
  • This ensures the plugin works regardless of the feature flag state
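The fallback described above is a null-object default: the tracer field starts as a no-op and is only replaced when the telemetry-aware path runs. A minimal self-contained sketch of that pattern (stand-in types, not the real plugin interfaces):

```java
// Sketch of the dual createComponents path: when the telemetry feature flag
// is off, only the base overload runs and the tracer stays a no-op.
public class DualPathPlugin {
    interface Tracer { boolean isNoop(); }
    static final Tracer NOOP_TRACER = () -> true;

    private Tracer tracer = NOOP_TRACER; // safe default when the flag is off

    // Base Plugin.createComponents() path (telemetry flag off)
    void createComponents() { /* tracer remains NOOP_TRACER */ }

    // TelemetryAwarePlugin.createComponents() path (telemetry flag on)
    void createComponents(Tracer injected) { this.tracer = injected; }

    Tracer tracer() { return tracer; }

    public static void main(String[] args) {
        DualPathPlugin off = new DualPathPlugin();
        off.createComponents();
        DualPathPlugin on = new DualPathPlugin();
        on.createComponents(() -> false);
        System.out.println(off.tracer().isNoop() + " " + on.tracer().isNoop());
        // prints: true false
    }
}
```

Because downstream components only ever see the `Tracer` interface, no call site needs to check the feature flag.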

Prerequisite: the telemetry feature flag must be flipped to true by default in a target release for this work to be useful in production.

API Stability Risk

TelemetryAwarePlugin is annotated @ExperimentalApi — the interface may change without semver guarantees in any release. Since the SQL plugin lives in a separate repository (opensearch-project/sql) with its own release cycle, a core OpenSearch patch release could break the SQL plugin's telemetry integration. Track the @ExperimentalApi → @PublicApi promotion timeline; defer the P1 metrics migration until the interface stabilizes.

Async Span Propagation

Span context propagates via ThreadContextBasedTracerContextStorage on thread pool hops. Considerations:

  • Cursor-based pagination: Each page fetch is a separate transport roundtrip. The root sql/ppl.query span covers only the first execution. Subsequent cursor fetches create new root spans linked by a cursor.id attribute — not one long-lived span.
  • Thread pool hops: SQL/PPL execution must go through OpenSearch's ThreadContext-aware thread pools for automatic span propagation. Custom CompletableFuture chains or raw executors will silently lose span context.
  • Cross-node push-down: Operations pushed to data nodes are traced by OpenSearch's transport-layer instrumentation automatically. The coordinating node's sql/ppl.execute span becomes the parent of the downstream opensearch.search spans.

P1: Metrics Migration

Current → New Mapping

| Current (custom) | New (MetricsRegistry) | Type |
|---|---|---|
| `REQ_TOTAL` / `PPL_REQ_TOTAL` | `sql.query.total` / `ppl.query.total` | Counter |
| `REQ_COUNT_TOTAL` / `PPL_REQ_COUNT_TOTAL` | `sql.query.count` / `ppl.query.count` | Counter |
| `FAILED_REQ_COUNT_SYS` | `sql.query.error{type=system}` | Counter |
| `FAILED_REQ_COUNT_CUS` | `ppl.query.error{type=client}` | Counter |
| `FAILED_REQ_COUNT_CB` | `ppl.query.error{type=circuit_breaker}` | Counter |
| (new) | `ppl.query.latency` | Histogram |
| (new) | `ppl.query.parse.latency` | Histogram |
| (new) | `ppl.query.optimize.latency` | Histogram |
| (new) | `ppl.query.execute.latency` | Histogram |

Implementation

Create a dedicated metrics class following ClusterManagerMetrics pattern:

```java
public class PPLQueryMetrics {
    private final Counter queryTotal;
    private final Counter queryErrorTotal;
    private final Histogram queryLatency;
    private final Histogram parseLatency;
    private final Histogram executeLatency;

    public PPLQueryMetrics(MetricsRegistry metricsRegistry) {
        this.queryTotal = metricsRegistry.createCounter(
            "ppl.query.total", "Total PPL queries", "1");
        this.queryLatency = metricsRegistry.createHistogram(
            "ppl.query.latency", "PPL query latency", "ms");
        // ...
    }

    public void recordQuery(long latencyMs, boolean success, String errorType) {
        queryTotal.add(1);
        queryLatency.record(latencyMs);
        if (!success) {
            queryErrorTotal.add(1, Tags.create().addTag("type", errorType));
        }
    }
}
```

Migration Strategy

  1. Add new MetricsRegistry-based metrics alongside existing custom metrics (dual-write)
  2. Deprecate /_plugins/_ppl/stats and /_plugins/_sql/stats endpoints
  3. Remove custom Metrics singleton in a future major version
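The dual-write step in the strategy above can be sketched as follows, with stand-in counter types (an `AtomicLong` standing in for the plugin's `BasicCounter`, a functional interface standing in for the MetricsRegistry `Counter`); the real wiring would increment the existing `Metrics` singleton and the injected OTel counter from the same call site:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of dual-write: every query increments both the legacy counter
// (backing /_plugins/_ppl/stats) and the new OTel-backed counter.
public class DualWriteMetrics {
    static class LegacyCounter {
        final AtomicLong v = new AtomicLong();
        void increment() { v.incrementAndGet(); }
    }
    interface OtelCounter { void add(double value); }

    final LegacyCounter legacyPplReqTotal = new LegacyCounter();
    long otelTotal; // exposed so the demo can read what the OTel stub received
    final OtelCounter pplQueryTotal = value -> otelTotal += (long) value;

    void onQuery() {
        legacyPplReqTotal.increment(); // keeps /_plugins/_ppl/stats consistent
        pplQueryTotal.add(1);          // feeds the OTLP export pipeline
    }

    public static void main(String[] args) {
        DualWriteMetrics m = new DualWriteMetrics();
        for (int i = 0; i < 3; i++) m.onQuery();
        System.out.println(m.legacyPplReqTotal.v.get() + " " + m.otelTotal);
        // prints: 3 3
    }
}
```

Keeping both writes in one method is what makes Phase 3's gate (legacy stats matching the OTel counters) checkable in an integration test.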

Backward Compatibility

  • The /_plugins/_ppl/stats and /_plugins/_sql/stats endpoints continue to work during dual-write phase
  • New metrics are exported via the standard telemetry pipeline (OTel → OTLP) when telemetry.feature.metrics.enabled=true
  • No behavior change when telemetry is disabled — MetricsRegistry returns NoopCounter/NoopHistogram

Performance

Seven spans per query; at 10K QPS that is 70K spans created per second. Mitigations:

  • When telemetry disabled (default): NoopTracer — near-zero overhead, TraceableActionListener.create() short-circuits when tracer.isRecording() == false
  • When enabled: overhead is bounded by OTel SDK's BatchSpanProcessor (async, non-blocking). Sampling rate (telemetry.tracer.sampler.probability, default 1%) limits actual export volume
  • Benchmark requirement: measure p50/p99 query latency with telemetry enabled vs disabled before merging Phase 2 (phase-level spans). Acceptable overhead target: <2% p99 regression

Testing Strategy

  • Unit tests: verify spans are created with correct names, attributes, and parent-child relationships using MockTracer / MockMetricsRegistry
  • Integration tests: verify end-to-end span export with telemetry-otel plugin and LoggingSpanExporter
  • NoopTracer path: verify no NPEs or behavioral changes when telemetry feature flag is off
  • Performance: benchmark query latency with/without telemetry on http_logs workload before each rollout phase

Rollout Plan

| Phase | Scope | Gate to Next Phase |
|---|---|---|
| Phase 1 | `TelemetryAwarePlugin` interface + top-level `sql/ppl.query` span | Unit tests pass, NoopTracer path verified, no p99 regression |
| Phase 2 | Phase-level child spans (parse, analyze, optimize, compile, execute, materialize) | Benchmark: <2% p99 latency regression with telemetry enabled on http_logs workload |
| Phase 3 | Migrate counters to MetricsRegistry (dual-write with existing custom metrics) | Existing `/_plugins/_ppl/stats` values match new OTel counters in integration test |
| Phase 4 | Add latency histograms per phase | Dashboard prototype confirms histograms produce actionable percentiles |
| Phase 5 | Deprecate custom stats endpoints, remove `Metrics` singleton | Deprecation notice for one major version; removal in the next major version |
