AgentOS Observability (OpenTelemetry)

AgentOS provides opt-in OpenTelemetry (OTEL) spans, metrics, and log correlation/export hooks. AgentOS itself does not start an OTEL SDK. Your host application owns exporters, sampling, and context propagation.

Overview
Opt-In Policy
Enable via AgentOS Config (Recommended)
Enable via Environment Variables
Host Setup (Node.js Example)
What AgentOS Emits
Logging (Pino + OTEL Logs)
Privacy & Cardinality
Performance Notes
SOTA Techniques (TypeScript Agentic AI)

Overview

Defaults (all OFF):

Manual AgentOS spans
AgentOS metrics
Trace IDs in streamed responses
Log correlation (trace_id, span_id)
OTEL LogRecord export

When enabled, AgentOS emits:

Spans around turns and tool-result handling
Metrics for turn/tool counters + histograms
Optional: trace correlation in logs and streamed response metadata
Optional: OTEL LogRecords (exported by your host, via OTLP)

Opt-In Policy

There are two layers:

Host OTEL SDK (required for export)
- In Node: @opentelemetry/sdk-node + exporters/instrumentations.
- In browsers: the web OTEL SDK (if you choose to export from the client).
AgentOS instrumentation toggles (controls what AgentOS emits)
- Env flags (global defaults)
- AgentOSConfig.observability (per-agent control)

Precedence:

AgentOSConfig.observability.enabled = false hard-disables all AgentOS observability helpers (even if env is set).
Otherwise, config fields override env fields, and env provides defaults.

Enable via AgentOS Config (Recommended)

import { AgentOS } from '@framers/agentos';

const agentos = new AgentOS();
await agentos.initialize({
  // ...your normal config...
  observability: {
    // Master switch: when true, defaults to enabling tracing/metrics + log correlation.
    // Keep explicit per-signal toggles if you want a tighter blast radius.
    // enabled: true,

    tracing: {
      enabled: true,
      includeTraceInResponses: true, // adds metadata.trace to select streamed chunks
    },
    metrics: {
      enabled: true,
    },
    logging: {
      includeTraceIds: true, // adds trace_id/span_id to pino log meta
      exportToOtel: false,   // keep OFF unless you want OTLP log export
      // otelLoggerName: '@framers/agentos',
    },
  },
});

Enable via Environment Variables

# Master switch: enables tracing + metrics + log trace_id/span_id injection defaults
AGENTOS_OBSERVABILITY_ENABLED=true

# Optional fine-grained toggles
AGENTOS_TRACING_ENABLED=true
AGENTOS_METRICS_ENABLED=true
AGENTOS_TRACE_IDS_IN_RESPONSES=true
AGENTOS_LOG_TRACE_IDS=true

# Optional: emit OTEL LogRecords (still requires a host SDK + logs exporter)
AGENTOS_OTEL_LOGS_ENABLED=true

# Names (advanced; usually keep defaults)
AGENTOS_OTEL_TRACER_NAME=@framers/agentos
AGENTOS_OTEL_METER_NAME=@framers/agentos
AGENTOS_OTEL_LOGGER_NAME=@framers/agentos

Host Setup (Node.js Example)

AgentOS only uses OTEL APIs; your host must install/start an OTEL SDK to export anything.

Typical Node setup:

Install dependencies:

npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node

Configure env (OTLP/HTTP collector example):

OTEL_SERVICE_NAME=my-agent-host
OTEL_TRACES_EXPORTER=otlp
OTEL_METRICS_EXPORTER=otlp
# OTEL_LOGS_EXPORTER=otlp  # keep explicit opt-in for log export
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

# Optional sampling
OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1

Start the SDK early (before most imports) so auto-instrumentation can patch libraries:

// otel.ts
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

const sdk = new NodeSDK({
  instrumentations: [getNodeAutoInstrumentations()],
});

export async function startOtel(): Promise<void> {
  await sdk.start();
}

export async function shutdownOtel(): Promise<void> {
  await sdk.shutdown();
}

In this monorepo, the backend bootstrap lives at backend/src/observability/otel.ts.

What AgentOS Emits

Spans

When tracing is enabled, AgentOS emits spans such as:

agentos.turn
agentos.gmi.get_or_create
agentos.gmi.process_turn_stream
agentos.tool_result
agentos.gmi.handle_tool_result
agentos.conversation.save (stage-tagged)

Metrics

When metrics are enabled, AgentOS records:

agentos.turns (counter)
agentos.turn.duration_ms (histogram)
agentos.turn.tokens.total|prompt|completion (histograms; only when usage is available)
agentos.turn.cost.usd (histogram; only when cost is available)
agentos.tool_results (counter)
agentos.tool_result.duration_ms (histogram)

Trace IDs in Streamed Responses

When enabled, AgentOS attaches trace metadata to select streamed chunks:

{
  "metadata": {
    "trace": {
      "traceId": "...",
      "spanId": "...",
      "traceparent": "00-...-...-01"
    }
  }
}

Logging (Pino + OTEL Logs)

Stdout Logs (Default)

AgentOS uses pino for structured logs. When includeTraceIds is enabled and an active span exists, AgentOS adds:

trace_id
span_id

to log metadata to make correlation easy in any log backend.

OTEL LogRecord Export (Optional)

If you want logs to flow through the OTLP pipeline (instead of, or in addition to, stdout shipping):

Enable AgentOS OTEL log emission:
- AGENTOS_OTEL_LOGS_ENABLED=true, or
- observability.logging.exportToOtel = true
Enable a host logs exporter (Node example):

OTEL_LOGS_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

Recommendation:

Keep OTEL log export OFF by default.
Use stdout logs + trace correlation for most deployments.
Turn on OTEL log export when you explicitly want one unified OTLP pipeline for traces/metrics/logs.

Privacy & Cardinality

Defaults are conservative:

Prompts, model outputs, and tool arguments are not recorded by default.
Prefer safe metadata only (durations, status, tool names, model/provider ids, token usage, cost).

Cardinality guidance:

Avoid labeling metrics/spans with high-cardinality values (user ids, conversation ids, raw URLs, prompt text).
Keep attributes stable and low-cardinality (e.g. status, tool_name, persona_id).

Performance Notes

AgentOS observability helpers are a safe no-op when disabled.
Traces/metrics are typically low overhead when sampling is enabled.
OTEL log export can add noticeable CPU/network overhead at debug volume; use it intentionally.

SOTA Techniques (TypeScript Agentic AI)

Patterns that work well in practice:

Structured event stream: emit agent lifecycle events (turn started, tool called, tool returned, policy decision, final output) as strongly-typed records (AgentOS already streams chunks; persist them if you need audits).
W3C context propagation: propagate traceparent across inbound HTTP, SSE/WebSocket streaming, and tool calls; use OTEL context managers (AsyncLocalStorage) in Node.
GenAI semantic conventions: add gen_ai.* attributes/events to spans when instrumenting model calls, tool calls, and token usage; keep raw content behind explicit opt-in and redaction.
Redaction and data classification: treat prompt/tool args/output as sensitive by default; add allowlists + hashing for debugging without content exfiltration.

Common library choices:

Telemetry: @opentelemetry/sdk-node, @opentelemetry/auto-instrumentations-node
Logging: pino (+ @opentelemetry/instrumentation-pino if you want automatic injection everywhere)
Agent/LLM observability layers: Langfuse, Helicone, Sentry AI monitoring, OpenLIT, OpenLLMetry-js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AgentOS Observability (OpenTelemetry)

Table of Contents

Overview

Opt-In Policy

Enable via AgentOS Config (Recommended)

Enable via Environment Variables

Host Setup (Node.js Example)

What AgentOS Emits

Spans

Metrics

Trace IDs in Streamed Responses

Logging (Pino + OTEL Logs)

Stdout Logs (Default)

OTEL LogRecord Export (Optional)

Privacy & Cardinality

Performance Notes

SOTA Techniques (TypeScript Agentic AI)

FilesExpand file tree

OBSERVABILITY.md

Latest commit

History

OBSERVABILITY.md

File metadata and controls

AgentOS Observability (OpenTelemetry)

Table of Contents

Overview

Opt-In Policy

Enable via AgentOS Config (Recommended)

Enable via Environment Variables

Host Setup (Node.js Example)

What AgentOS Emits

Spans

Metrics

Trace IDs in Streamed Responses

Logging (Pino + OTEL Logs)

Stdout Logs (Default)

OTEL LogRecord Export (Optional)

Privacy & Cardinality

Performance Notes

SOTA Techniques (TypeScript Agentic AI)