OpenTelemetry-based tracing for the Claude Prompts MCP server. Provides production-grade observability with data safety guarantees.
Add to config.json:
{
"telemetry": {
"enabled": true,
"mode": "traces",
"exporterEndpoint": "http://localhost:4318",
"samplingRate": 1.0
}
}Requires an OTLP-compatible collector (e.g., OpenTelemetry Collector) running at the configured endpoint.
| Key | Type | Default | Description |
|---|---|---|---|
telemetry.enabled |
boolean | false |
Master switch. No SDK initialized when false. |
telemetry.mode |
"off" | "traces" | "full" |
"off" |
traces = spans only. full = spans + metrics (future). |
telemetry.exporterEndpoint |
string | http://localhost:4318 |
OTLP HTTP endpoint URL. |
telemetry.samplingRate |
number | 1.0 |
Head sampling rate (0.0-1.0). Use collector tail sampling for production. |
telemetry.attributePolicy.businessContext |
boolean | true |
Include safe business attributes (prompt ID, execution mode, etc.). |
telemetry.attributePolicy.rawCommands |
boolean | false |
Include raw command text. Warning: may contain sensitive data. |
telemetry.attributePolicy.rawResponses |
boolean | false |
Include raw user responses. Warning: may contain sensitive data. |
telemetry.attributePolicy.allowlist |
string[] | — | Custom attribute names to include. |
# Enable telemetry
system_control set telemetry.enabled true
system_control set telemetry.mode traces
# Set collector endpoint
system_control set telemetry.exporterEndpoint http://collector:4318
# Adjust sampling
system_control set telemetry.samplingRate 0.1Note: telemetry.mode and telemetry.exporterEndpoint changes require server restart.
| Span Name | Type | Description |
|---|---|---|
prompt_engine.request |
Root | Wraps entire pipeline execution |
pipeline.stage.<StageName> |
Child | Per-stage timing (19-23 stages depending on pipeline config) |
Events are attached to the active root span via the hook system:
| Event Name | Source | Status | Description |
|---|---|---|---|
gate.passed |
Gate evaluation | Active | Gate passed validation |
gate.failed |
Gate evaluation | Active | Gate failed validation |
gate.retry_exhausted |
Gate system | Active | All retry attempts consumed |
gate.response_blocked |
Gate system | Active | Response blocked due to gate failure |
chain.step_complete |
Chain execution | Planned | Chain step finished |
chain.complete |
Chain execution | Planned | Full chain completed |
chain.failed |
Chain execution | Planned | Chain execution failed |
Note
Chain events are defined in the hook registry and observer but not yet emitted by chain operator code. Gate and pipeline stage events are fully active.
Safe business-context attributes on trace spans (all prefixed cpm.*):
| Attribute | Type | Description |
|---|---|---|
cpm.command.type |
string | Command type or prompt ID |
cpm.execution.mode |
string | single or chain |
cpm.execution.id |
string | Unique execution identifier |
These attributes follow the wide-event pattern — one comprehensive event per request with full business context for incident queries.
| Attribute | Type | Description | Incident Query |
|---|---|---|---|
cpm.duration.total_ms |
number | Total pipeline duration (ms) | "Show slow requests" |
cpm.stages.executed_count |
number | Number of stages that ran | "Pipeline utilization" |
cpm.stages.skipped |
string | Comma-separated skipped stage names | "Why didn't X run?" |
cpm.stages.slowest |
string | Name of the slowest stage | "What's the bottleneck?" |
cpm.stages.slowest_ms |
number | Duration of slowest stage (ms) | "How slow was the bottleneck?" |
cpm.had_early_exit |
boolean | Whether all stages executed | "Incomplete executions" |
cpm.gates.names |
string | Comma-separated applied gate IDs | "Show failures by gate" |
cpm.gates.passed_count |
number | Gates that passed | "Gate pass rate" |
cpm.gates.failed_count |
number | Gates that failed | "Which gates fail most?" |
cpm.gates.blocked |
boolean | Response blocked by gate | "Show blocked requests" |
cpm.gates.retry_exhausted |
boolean | Retry attempts exhausted | "Retry exhaustion rate" |
cpm.gates.enforcement_mode |
string | Gate enforcement mode | "Enforcement mode distribution" |
cpm.chain.is_chain |
boolean | Whether this is a chain execution | "Chain vs single failure rate" |
cpm.chain.step_index |
number | Current chain step number | "Which step fails?" |
cpm.chain.id |
string | Chain session identifier | "Chain execution timeline" |
cpm.framework.id |
string | Active methodology ID | "Failures by methodology" |
cpm.framework.enabled |
boolean | Whether methodology is active | "Framework adoption" |
cpm.scope.source |
string | Identity scope source | "Scope distribution" |
cpm.error.type |
string | Error message (on failure only) | "Error grouping" |
| Attribute | Type | Description |
|---|---|---|
cpm.prompt.id |
string | Resolved prompt identifier |
cpm.operator.types |
string | Applied operator types |
cpm.chain.current_step |
number | Current chain step |
cpm.chain.total_steps |
number | Total chain steps |
cpm.gates.applied_count |
number | Number of applied gates |
These attributes are never emitted unless explicitly opted in via attributePolicy:
- Raw command text (
cpm.command.raw) - Raw user responses (
cpm.user_response.raw) - Rendered prompt/template bodies
- Raw model outputs
The AttributePolicyEnforcer applies 7 rules to every attribute before trace emission:
- Excluded attributes always blocked — raw payloads never leak by default
- Business-context gated by config —
attributePolicy.businessContextflag - Custom allowlist override — explicitly listed attributes always pass
- Raw commands opt-in — only if
attributePolicy.rawCommands: true - Raw responses opt-in — only if
attributePolicy.rawResponses: true - Unknown
cpm.*dropped — unrecognized custom attributes rejected - Non-
cpm.*pass through — OTel semantic conventions (http., rpc.) are unfiltered
Metric labels are restricted to low-cardinality values:
Allowed: stage_name, stage_type, status, execution_mode, is_chain
Disallowed: prompt_id, chain_id, continuity_scope_id, operator_types, execution_id, session_id, framework_id
High-cardinality identifiers are trace-only attributes, not metric dimensions.
When MCP resources are enabled, telemetry status is available at:
resource://telemetry/status
Returns: { enabled, mode, exporterEndpoint, samplingRate, uptimeMs }
| Symptom | Cause | Fix |
|---|---|---|
| No traces exported | Telemetry disabled | Set telemetry.enabled: true and telemetry.mode: "traces" |
| Traces don't reach collector | Wrong endpoint | Verify telemetry.exporterEndpoint matches collector address |
| Missing business attributes | Policy disabled | Check telemetry.attributePolicy.businessContext: true |
| Server startup failure | Collector unreachable | Telemetry degrades gracefully — check logs for [TelemetryRuntime] warnings |
| Raw data in traces | Opt-in enabled | Set attributePolicy.rawCommands: false and rawResponses: false |
Config (telemetry section)
|
v
TelemetryLifecycle (runtime/telemetry-lifecycle.ts)
|
+-- TelemetryRuntimeImpl (SDK init, tracer access)
+-- AttributePolicyEnforcer (data safety)
+-- TelemetryHookObserver (hook -> trace events)
|
v
HookRegistry (event fan-out)
^
|
Pipeline (emitBeforeStage/emitAfterStage/emitStageError)
|
+-- Root span: prompt_engine.request
| +-- 3 initial attrs (creation)
| +-- 19 wide-event attrs (enrichRootSpan at completion)
+-- Stage spans: pipeline.stage.<Name>
+-- Gate events: gate.passed/failed/blocked (via hooks)
The root span follows the wide-event pattern — enriched at pipeline completion with performance, gate, chain, framework, and error attributes. See Wide-Event Attributes for the full list.