Commit 777a5b4

docs(rfd): Draft: Agent Telemetry Export (#298)
* docs(rfd): Draft: Agent Telemetry Export
* Update website
* format
* add champion

---------

Co-authored-by: Ben Brandt <[email protected]>
1 parent 748853d commit 777a5b4

3 files changed: +176 -1 lines changed


docs/docs.json

Lines changed: 2 additions & 1 deletion
@@ -110,7 +110,8 @@
 "rfds/request-cancellation",
 "rfds/session-resume",
 "rfds/meta-propagation",
-"rfds/session-info-update"
+"rfds/session-info-update",
+"rfds/agent-telemetry-export"
 ]
 },
 { "group": "Preview", "pages": [] },
Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
---
title: "Agent Telemetry Export"
---

- Author(s): [@codefromthecrypt](https://github.com/codefromthecrypt)
- Champion: [@benbrandt](https://github.com/benbrandt)

## Elevator pitch

> What are you proposing to change?

Define how agents export telemetry (logs, metrics, traces) to clients without tunneling it over the ACP transport. Clients run a local telemetry receiver and pass standard OpenTelemetry environment variables when launching agents. This keeps telemetry out-of-band and enables editors to display agent activity, debug issues, and integrate with observability backends.

## Status quo

> How do things work today and what problems does this cause? Why would we change things?

ACP defines how clients launch agents as subprocesses and communicate over stdio. The [meta-propagation RFD](./meta-propagation) addresses trace context propagation via `params._meta`, enabling trace correlation. However, there is no convention for how agents should export the actual telemetry data (spans, metrics, logs).

Without a standard approach:

1. **No visibility into agent behavior** - Editors cannot display what agents are doing (token usage, tool calls, timing)
2. **Difficult debugging** - When agents fail, there's no structured way to capture diagnostics
3. **Fragmented solutions** - Each agent/client pair invents its own telemetry mechanism
4. **Credential exposure risk** - Agents that send telemetry directly to backends must be given backend credentials

Tunneling telemetry over the ACP stdio transport is problematic:

- **Head-of-line blocking** - Telemetry traffic could delay agent messages
- **Implementation burden** - ACP would need to define telemetry message formats
- **Coupling** - Agents would need ACP-specific telemetry code instead of standard SDKs

## What we propose to do about it

> What are you proposing to improve the situation?

Clients that want to receive agent telemetry run a local OTLP (OpenTelemetry Protocol) receiver and inject environment variables when launching agent subprocesses:

```
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_SERVICE_NAME=agent-name
```
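
On the client side, the injection described above could look roughly like the following. This is a minimal sketch and not part of the proposal: `otel_env`, `launch_agent`, and the `base_env` parameter are hypothetical names, the port is simply the one from the example values, and keeping a value the user has already set for a variable is one reading of the precedence rule in this RFD.

```python
import os
import subprocess
import sys
from typing import Dict, List, Mapping, Optional


def otel_env(service_name: str, port: int = 4318) -> Dict[str, str]:
    """The three variables from the block above, pointed at a local receiver."""
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://localhost:{port}",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_SERVICE_NAME": service_name,
    }


def launch_agent(
    command: List[str],
    service_name: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> subprocess.Popen:
    """Start the agent with ACP on stdio and telemetry settings in its environment."""
    env = dict(os.environ if base_env is None else base_env)
    for name, value in otel_env(service_name).items():
        # Assumption: keep any value the user already set for this variable.
        env.setdefault(name, value)
    return subprocess.Popen(
        command,
        env=env,
        stdin=subprocess.PIPE,   # ACP requests to the agent
        stdout=subprocess.PIPE,  # ACP responses and notifications from the agent
    )


if __name__ == "__main__":
    # Stand-in for a real agent: a child process that reports what it was given.
    child = [sys.executable, "-c",
             "import os; print(os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'])"]
    agent = launch_agent(child, "agent-name")
    output, _ = agent.communicate(timeout=30)
    print(output.decode().strip())
```

Because the variables are set before the process starts, nothing in the ACP handshake has to change; stdio stays reserved for protocol messages.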

Agents using OpenTelemetry SDKs auto-configure from these variables. The client's receiver can:

- Display telemetry in the editor UI (e.g., token counts, timing, errors)
- Forward telemetry to the client's configured observability backend
- Add client-side context before forwarding

This follows the [OpenTelemetry collector deployment pattern](https://opentelemetry.io/docs/collector/deployment/agent/) where a local receiver proxies telemetry to backends.

### Architecture

```
┌────────────────────────────────────────────────────────────┐
│                       Client/Editor                        │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │ ACP Handler  │    │OTLP Receiver │───▶│   Exporter   │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
└────────┬─────────────────────▲──────────────────┬──────────┘
         │ stdio               │ HTTP             │
         ▼                     │                  ▼
┌─────────────────────┐        │        ┌───────────────────┐
│   Agent Process     │        │        │  Observability    │
│  ┌──────────────┐   │        │        │     Backend       │
│  │  ACP Agent   │   │        │        └───────────────────┘
│  ├──────────────┤   │        │
│  │  OTEL SDK    │────────────┘
│  └──────────────┘   │
└─────────────────────┘
```
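
To make the "OTLP Receiver" box concrete, here is a minimal sketch of a loopback receiver for the `http/protobuf` protocol, using only the Python standard library. The class names and the in-memory `payloads` list are hypothetical. It relies on two OTLP/HTTP conventions: each signal is posted to its own path (`/v1/traces`, `/v1/metrics`, `/v1/logs`), and a zero-length body is a valid encoding of an all-default protobuf response. Decoding the payloads would need the OTLP protobuf definitions, and compressed or chunked requests are not handled here.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# OTLP/HTTP sends each signal to its own path under the base endpoint.
SIGNAL_PATHS = {"/v1/traces": "traces", "/v1/metrics": "metrics", "/v1/logs": "logs"}


class _OtlpHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        signal = SIGNAL_PATHS.get(self.path)
        if signal is None:
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.server.record(signal, self.headers.get("Content-Type", ""), body)
        # Empty body: a serialized response message with no fields set.
        self.send_response(200)
        self.send_header("Content-Type", "application/x-protobuf")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the client's own stderr quiet


class OtlpReceiver(ThreadingHTTPServer):
    """Loopback-only HTTP server that keeps raw OTLP payloads in memory."""

    def __init__(self, port: int = 4318):
        super().__init__(("127.0.0.1", port), _OtlpHandler)
        self.payloads = []  # (signal, content_type, raw_body) tuples
        self._lock = threading.Lock()

    def record(self, signal: str, content_type: str, body: bytes) -> None:
        with self._lock:
            self.payloads.append((signal, content_type, body))


if __name__ == "__main__":
    receiver = OtlpReceiver(port=0)  # port 0 asks the OS for any free port
    threading.Thread(target=receiver.serve_forever, daemon=True).start()
    print("receiver listening on port", receiver.server_address[1])
    receiver.shutdown()
    receiver.server_close()
```

A real client would hand the recorded payloads to its UI or exporter instead of holding them in a list. Binding to the loopback address keeps the receiver unreachable from other machines.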

### Discovery

Environment variables must be set before launching the subprocess, but ACP capability exchange happens after connection. Options for discovery:

1. **Optimistic injection** - Clients inject OTEL environment variables unconditionally. Agents without OpenTelemetry support simply ignore them. This is pragmatic since environment variables are low-cost and OTEL SDKs handle misconfiguration gracefully.

2. **Registry metadata** - Agent registries (like the one proposed in PR #289) could include telemetry support in agent manifests, letting clients know ahead of time.

3. **Manual configuration** - Users configure their client to enable telemetry collection for specific agents.

## Shiny future

> How will things play out once this feature exists?

1. **Editor integration** - Editors can show agent activity: token usage, tool call timing, model switches, errors
2. **Unified debugging** - When agents fail, structured telemetry is available for diagnosis
3. **End-to-end traces** - Combined with `params._meta` trace propagation, traces flow from client through agent to any downstream services
4. **No credential sharing** - Agents never see backend credentials; the client handles authentication
5. **Standard SDKs** - Agent authors use normal OpenTelemetry SDKs that work in any context, not ACP-specific code

## Implementation details

> Tell me more about your implementation. What is your detailed implementation plan?

### 1. Create `docs/protocol/observability.mdx`

Add a new protocol documentation page covering observability practices for ACP. This page will describe:

**For Clients/Editors:**

- Running an OTLP receiver to collect agent telemetry
- Injecting `OTEL_EXPORTER_*` environment variables when launching agent subprocesses
- Respecting user-configured `OTEL_*` variables (do not override if already set)
- Forwarding telemetry to configured backends with client credentials

**For Agent Authors:**

- Using OpenTelemetry SDKs with standard auto-configuration
- Recommended spans, metrics, and log patterns for agent operations
- How telemetry flows when `OTEL_*` variables are present vs absent
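
For agent authors who want to see what the standard auto-configuration resolves to, the sketch below derives the per-signal OTLP/HTTP URL from the environment. It follows the OpenTelemetry exporter conventions: a signal-specific variable such as `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` is used as-is, otherwise `/v1/<signal>` is appended to `OTEL_EXPORTER_OTLP_ENDPOINT`. The function name `otlp_http_url` is hypothetical, and returning `None` when nothing is set is this sketch's assumption, not SDK behavior; real SDKs apply their own defaults.

```python
from typing import Mapping, Optional


def otlp_http_url(env: Mapping[str, str], signal: str) -> Optional[str]:
    """Return the OTLP/HTTP URL for "traces", "metrics", or "logs", or None if unset."""
    specific = env.get(f"OTEL_EXPORTER_OTLP_{signal.upper()}_ENDPOINT")
    if specific:
        return specific  # per-signal URLs are used exactly as given
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if base:
        return base.rstrip("/") + f"/v1/{signal}"
    return None  # assumption of this sketch: treat absence as "no export"


if __name__ == "__main__":
    injected = {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4318"}
    for name in ("traces", "metrics", "logs"):
        print(name, otlp_http_url(injected, name))
```

An agent built on an OpenTelemetry SDK does not need this code; the SDK performs the same resolution internally.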

### 2. Update `docs/protocol/extensibility.mdx`

Add a section linking to the new observability doc, similar to how extensibility concepts relate to other protocol features. Add a brief mention that observability practices (telemetry export) are documented separately.

### 3. Update `docs/docs.json`

Add `protocol/observability` to the Protocol navigation group.

## Frequently asked questions

> What questions have arisen over the course of authoring this document or during subsequent discussions?

### How does this relate to trace propagation in `params._meta`?

They are complementary:

- **Trace propagation** (`params._meta` with `traceparent`, etc.) passes trace context so spans can be correlated
- **Telemetry export** (this RFD) defines where agents send the actual span/metric/log data

Both are needed for end-to-end observability.
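
As an illustration of the propagation half, the sketch below attaches a W3C Trace Context `traceparent` value to a JSON-RPC request under `params._meta`. The helper names are hypothetical, the request shown is a placeholder and not a complete ACP message, and a real client would normally take the IDs from its active span instead of generating random ones.

```python
import json
import secrets


def new_traceparent(sampled: bool = True) -> str:
    """Build a `traceparent` value: version, trace ID, parent span ID, flags."""
    trace_id = secrets.token_hex(16)  # 32 lowercase hex characters
    parent_id = secrets.token_hex(8)  # 16 lowercase hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


def with_trace_context(request: dict, traceparent: str) -> dict:
    """Return a copy of a JSON-RPC request with `traceparent` under params._meta."""
    params = dict(request.get("params", {}))
    meta = dict(params.get("_meta", {}))
    meta["traceparent"] = traceparent
    params["_meta"] = meta
    return {**request, "params": params}


if __name__ == "__main__":
    # Placeholder request; the method and params are illustrative only.
    request = {"jsonrpc": "2.0", "id": 1, "method": "session/prompt", "params": {}}
    print(json.dumps(with_trace_context(request, new_traceparent()), indent=2))
```

Spans the agent exports over OTLP can then share the same trace ID, which is what lets the client's receiver stitch both sides into one trace.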

### What if an agent doesn't use OpenTelemetry?

Agents without OTEL SDKs simply ignore the environment variables. No harm is done. Over time, as more agents adopt OpenTelemetry, the ecosystem benefits.

### What if the user already configured `OTEL_*` environment variables?

If `OTEL_*` variables are already set in the environment, clients should not override them. User-configured telemetry settings take precedence, allowing users to direct agent telemetry to their own backends when desired.
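
The rule above does not say how fine-grained the check should be. One conservative reading, sketched here as an assumption and not as the RFD's decision, is all-or-nothing: if the user has set any `OTEL_*` variable, the client injects nothing, so a user-chosen endpoint is never paired with a client-chosen protocol. The function name `client_injection` is hypothetical.

```python
from typing import Dict, Mapping


def client_injection(parent_env: Mapping[str, str], defaults: Mapping[str, str]) -> Dict[str, str]:
    """Return the variables the client should add; none if the user set any OTEL_* variable."""
    user_configured = any(name.startswith("OTEL_") for name in parent_env)
    return {} if user_configured else dict(defaults)


if __name__ == "__main__":
    defaults = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4318",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
    }
    print(client_injection({"PATH": "/usr/bin"}, defaults))
    print(client_injection({"OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector.example:4318"}, defaults))
```

The trade-off is that a user who sets only `OTEL_SERVICE_NAME` also opts out of the client's receiver under this reading; a per-variable merge avoids that at the cost of possibly mixing settings.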

### Why not define ACP-specific telemetry messages?

This would duplicate OTLP functionality, add implementation burden to ACP, and force agent authors to use non-standard APIs. Using OTLP means agents work with standard tooling and documentation.

### What about agents that aren't launched as subprocesses?

This RFD focuses on the stdio transport where clients launch agents. For other transports (HTTP, etc.), agents would need alternative configuration mechanisms, which could be addressed in future RFDs.

### What alternative approaches did you consider, and why did you settle on this one?

1. **Tunneling telemetry over ACP** - Rejected due to head-of-line blocking concerns and implementation complexity
2. **Agents export directly to backends** - Rejected because it requires sharing credentials with agents
3. **File-based telemetry** - Rejected because it doesn't support real-time display and adds complexity

The environment variable approach:

- Uses existing standards (OTLP, OpenTelemetry SDK conventions)
- Keeps telemetry out-of-band from ACP messages
- Lets clients control where telemetry goes without exposing credentials
- Requires no changes to ACP message formats

## Revision history

- 2025-12-04: Initial draft

docs/updates.mdx

Lines changed: 7 additions & 0 deletions
@@ -4,6 +4,13 @@ description: Updates and announcements about the Agent Client Protocol
 rss: true
 ---
 
+<Update label="December 11, 2025" tags={["RFD"]}>
+## Agent Telemetry Export RFD moves to Draft stage
+
+The RFD for providing more guidance on how agents should export telemetry has been moved to Draft stage. Please review the [RFD](./rfds/agent-telemetry-export) for more information on the current proposal and provide feedback as work on the implementation begins.
+
+</Update>
+
 <Update label="December 3, 2025" tags={["RFD"]}>
 ## session_info_update notification RFD moves to Draft stage