Commit ed5e470

docs: Add RFD for agent-guided user selection feature
Proposes `session/select` request allowing agents to present interactive menus with prompts and markdown-renderable options. Supports single/multi-select modes with optional free-text input for gathering structured user input and exposing agent capabilities in a discoverable way.
1 parent c534dc9 commit ed5e470

File tree

1 file changed

+259
-0
lines changed

@@ -0,0 +1,259 @@
1+
---
2+
title: "Agent-Guided User Selection"
3+
---
4+
5+
Author(s): [your-handle](https://github.com/your-handle)
6+
7+
## Elevator pitch
8+
9+
> What are you proposing to change?
10+
11+
Allow agents to dynamically present interactive menus to users during a session. These menus consist of a prompt (question) and a set of markdown-renderable options, enabling agents to guide users through workflows, gather structured input, and expose agent-specific actions in a discoverable way.
12+
13+
## Status quo
14+
15+
> How do things work today and what problems does this cause? Why would we change things?
16+
17+
Currently, agents have limited mechanisms for soliciting structured input from users:
18+
19+
1. **Free-form prompts only**: Agents must rely on natural language responses, which can be ambiguous and require additional parsing/validation.
20+
21+
2. **No discoverability**: Users don't know what options or capabilities an agent supports unless explicitly told through conversation. There's no standardized way to present available actions.
22+
23+
3. **Guided workflows are cumbersome**: Multi-step processes require agents to describe options in prose and hope users respond with recognizable input. This leads to friction and error-prone interactions.
24+
25+
4. **Agent-specific actions are hidden**: Agents with specialized capabilities (e.g., deployment options, code generation styles, environment configurations) have no structured way to expose these to users.
26+
27+
5. **Context-dependent options require explanation**: When available actions change based on project state, file type, or session context, agents must repeatedly explain what's possible.
28+
29+
## What we propose to do about it
30+
31+
> What are you proposing to improve the situation?
32+
33+
Introduce a new agent-to-client request that allows agents to present interactive menus to users. Key characteristics:
34+
35+
- **Agent-initiated**: The agent sends a request to the client with a prompt and options
36+
- **Markdown-renderable options**: Each option can include rich markdown content for clear presentation
37+
- **Configurable selection mode**: Agent specifies whether single-select or multi-select is allowed
38+
- **Optional free-text input**: Agent can enable an "other" option allowing users to provide custom input
39+
- **Callback-based response**: The client returns the user's selection(s) back to the agent via a dedicated response mechanism
40+
- **Dynamic timing**: Menus can be presented at session start, during conversations, or based on context changes
41+
42+
## Shiny future
43+
44+
> How will things play out once this feature exists?
45+
46+
Once implemented, agents can create rich, guided experiences:
47+
48+
- **Onboarding flows**: New users are presented with setup options rather than needing to know what to ask
49+
- **Workflow wizards**: Multi-step processes become intuitive click-through experiences
50+
- **Context-aware suggestions**: As users work, agents surface relevant actions ("I noticed you're in a test file - would you like to: Run tests / Generate test cases / View coverage")
51+
- **Configuration dialogs**: Complex agent settings can be presented as structured choices rather than requiring users to remember syntax
52+
- **Domain-specific actions**: Specialized agents (CI/CD, database, cloud deployment) can expose their unique capabilities in discoverable menus
53+
54+
Users get a more guided, less error-prone experience. Agents get structured, unambiguous input. Clients can render these menus in ways that fit their UI paradigm (dropdowns, modal dialogs, inline buttons, etc.).
55+
56+
## Implementation details and plan
57+
58+
> Tell me more about your implementation. What is your detailed implementation plan?
59+
60+
<!--
61+
Note: This section is OPTIONAL when RFDs are first opened.
62+
The following is a strawman proposal to seed discussion.
63+
-->
64+
65+
### Protocol Changes
66+
67+
This proposal follows the same pattern as `session/request_permission`, providing a consistent interaction model for agent-initiated user input.
68+
69+
#### New Request: `session/select`
70+
71+
The agent sends this request to the client to present a selection menu and await user response:
72+
```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "session/select",
  "params": {
    "sessionId": "sess_abc123def456",
    "prompt": "How would you like to proceed with the refactoring?",
    "options": [
      {
        "optionId": "inline",
        "name": "Inline refactor",
        "description": "Refactor in place, modifying existing files"
      },
      {
        "optionId": "new-files",
        "name": "Create new files",
        "description": "Generate refactored code in new files, preserving originals"
      },
      {
        "optionId": "dry-run",
        "name": "Dry run",
        "description": "Show what would change without making modifications"
      }
    ],
    "selectionMode": "single",
    "allowFreeText": true,
    "freeTextPlaceholder": "Or describe a different approach..."
  }
}
```
104+
105+
**Request Parameters:**
106+
107+
- `sessionId` *(SessionId, required)*: The session ID for this request.
108+
- `prompt` *(string, required)*: The question or instruction to display to the user. Supports markdown.
109+
- `options` *(SelectionOption[], required)*: Available [selection options](#selection-options) for the user to choose from.
110+
- `selectionMode` *(SelectionMode, required)*: Whether the user can select one option (`single`) or multiple options (`multiple`).
111+
- `allowFreeText` *(boolean, optional)*: If `true`, the client should provide a free-text input field in addition to the options.
112+
- `freeTextPlaceholder` *(string, optional)*: Placeholder text to display in the free-text input field (if enabled).
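
As a rough illustration of the parameter list above, an agent might assemble the request like this. This is a minimal sketch, not part of the proposal: the helper name `build_select_request` is hypothetical, and the uniqueness check simply enforces the "unique identifier" wording used for `optionId`.

```python
from typing import Any, Optional

VALID_SELECTION_MODES = {"single", "multiple"}


def build_select_request(
    request_id: int,
    session_id: str,
    prompt: str,
    options: list[dict[str, str]],
    selection_mode: str,
    allow_free_text: Optional[bool] = None,
    free_text_placeholder: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a `session/select` JSON-RPC request as a plain dict (hypothetical helper)."""
    if selection_mode not in VALID_SELECTION_MODES:
        raise ValueError(f"unknown selectionMode: {selection_mode!r}")

    # Each option needs an optionId that is unique within this request.
    option_ids = [opt["optionId"] for opt in options]
    if len(set(option_ids)) != len(option_ids):
        raise ValueError("optionId values must be unique")

    params: dict[str, Any] = {
        "sessionId": session_id,
        "prompt": prompt,
        "options": options,
        "selectionMode": selection_mode,
    }
    # Optional fields are omitted entirely when the caller does not set them.
    if allow_free_text is not None:
        params["allowFreeText"] = allow_free_text
    if free_text_placeholder is not None:
        params["freeTextPlaceholder"] = free_text_placeholder

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "session/select",
        "params": params,
    }
```

Serializing the returned dict with `json.dumps` yields a message of the same shape as the example request.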
113+
114+
#### Response
115+
116+
The client responds with the user's selection, following the same outcome pattern as `session/request_permission`:
117+
```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "result": {
    "outcome": {
      "outcome": "selected",
      "optionIds": ["inline"],
      "freeText": null
    }
  }
}
```
131+
132+
If the prompt turn is cancelled before the user responds, the client **MUST** respond with the `cancelled` outcome:
133+
```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "result": {
    "outcome": {
      "outcome": "cancelled"
    }
  }
}
```
145+
146+
**Response Fields:**
147+
148+
- `outcome` *(SelectionOutcome, required)*: The user's decision, either:
  - `cancelled` - The [prompt turn was cancelled](./prompt-turn#cancellation)
  - `selected` with `optionIds` - The IDs of the selected option(s)
  - `selected` with `freeText` - Custom text provided by the user (if `allowFreeText` was enabled)
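
On the agent side, the `result` object could be decoded along these lines. This is a sketch under assumptions: the `Selection` dataclass and `parse_select_result` function are hypothetical names, and a missing or `null` `optionIds` is treated as an empty list, which the proposal does not spell out.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Selection:
    """Decoded SelectionOutcome (hypothetical container, not defined by the RFD)."""
    cancelled: bool
    option_ids: list[str] = field(default_factory=list)
    free_text: Optional[str] = None


def parse_select_result(result: dict[str, Any]) -> Selection:
    """Turn the `result` of a `session/select` response into a Selection."""
    outcome = result["outcome"]
    kind = outcome["outcome"]
    if kind == "cancelled":
        return Selection(cancelled=True)
    if kind == "selected":
        return Selection(
            cancelled=False,
            # Assumption: absent or null optionIds means "no options picked".
            option_ids=list(outcome.get("optionIds") or []),
            free_text=outcome.get("freeText"),
        )
    raise ValueError(f"unknown outcome: {kind!r}")
```

Keeping `cancelled` as an explicit flag lets the agent branch on cancellation before it looks at any selection data.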
152+
153+
### Selection Options
154+
155+
Each selection option provided to the client contains:
156+
157+
- `optionId` *(string, required)*: Unique identifier for this option.
158+
- `name` *(string, required)*: Human-readable label to display to the user.
159+
- `description` *(string, optional)*: Extended description of this option. Supports markdown for rich formatting.
160+
161+
### Selection Mode
162+
163+
Controls how many options the user can select:
164+
165+
- `single` - User must select exactly one option
166+
- `multiple` - User can select one or more options (checkboxes)
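
An agent may want to check a returned selection against the mode it requested. The sketch below is one possible reading, not a rule from this RFD: the function name `validate_selection` is hypothetical, and counting a free-text answer as one choice is an assumption, since the proposal does not say how free text interacts with `selectionMode`.

```python
from typing import Optional


def validate_selection(
    option_ids: list[str],
    free_text: Optional[str],
    offered_ids: list[str],
    selection_mode: str,
    allow_free_text: bool = False,
) -> list[str]:
    """Return a list of problems with a `selected` outcome (empty if none found)."""
    problems: list[str] = []

    # Every returned ID should be one the agent actually offered.
    unknown = [oid for oid in option_ids if oid not in offered_ids]
    if unknown:
        problems.append(f"unknown optionIds: {unknown}")

    if free_text and not allow_free_text:
        problems.append("freeText returned but allowFreeText was not enabled")

    # ASSUMPTION: a non-empty free-text answer counts as one choice.
    choices = len(option_ids) + (1 if free_text else 0)
    if selection_mode == "single" and choices != 1:
        problems.append(f"single mode expects exactly one choice, got {choices}")
    elif selection_mode == "multiple" and choices < 1:
        problems.append("multiple mode expects at least one choice")

    return problems
```

Returning a list of problems rather than raising leaves the agent free to decide whether to re-prompt or fall back to a default.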
167+
168+
### Example: Multi-Select with Free Text
169+
```json
{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "session/select",
  "params": {
    "sessionId": "sess_abc123def456",
    "prompt": "Which files should I include in the review?",
    "options": [
      {
        "optionId": "modified",
        "name": "Modified files",
        "description": "Files changed in this branch"
      },
      {
        "optionId": "tests",
        "name": "Test files",
        "description": "Include related test files"
      },
      {
        "optionId": "deps",
        "name": "Dependencies",
        "description": "Include files that depend on modified files"
      }
    ],
    "selectionMode": "multiple",
    "allowFreeText": true,
    "freeTextPlaceholder": "Or specify file paths..."
  }
}
```
201+
202+
Response with multiple selections:
203+
```json
{
  "jsonrpc": "2.0",
  "id": 6,
  "result": {
    "outcome": {
      "outcome": "selected",
      "optionIds": ["modified", "tests"],
      "freeText": null
    }
  }
}
```
217+
218+
### Considerations
219+
220+
- **Default selection**: Should agents be able to pre-select an option? Could add an optional `defaultOptionIds` field.
221+
- **Validation**: For multi-select, should there be min/max selection constraints?
222+
- **Grouping**: Should options support grouping/categories for complex menus?
223+
224+
## Frequently asked questions
225+
226+
> What questions have arisen over the course of authoring this document or during subsequent discussions?
227+
228+
### What alternative approaches did you consider, and why did you settle on this one?
229+
230+
1. **Extending slash commands**: We considered making slash commands more dynamic, but this doesn't solve the "agent needs to ask a question" use case - slash commands are user-initiated.
231+
232+
2. **Structured content blocks**: We considered adding menu-like content to `session/update` messages, but this conflates display with interaction. A dedicated request/response pattern provides clearer semantics for "agent needs input."
233+
234+
3. **Form-based approach**: A full form system (text fields, checkboxes, etc.) was considered but adds significant complexity. Menus with optional free-text cover the 80% case while remaining simple.
235+
236+
### Why not extend `session/request_permission` for this?
237+
238+
While `session/select` follows a similar interaction pattern to `session/request_permission`, the permission system is tightly coupled to tool calls via the required `toolCallId` field. This makes it unsuitable for general-purpose user input gathering where no tool call is involved.
239+
240+
`session/select` provides a standalone mechanism for agents to gather structured input at any point during a session—whether for onboarding, workflow decisions, or configuration—without requiring a tool call context.
241+
242+
### What if the client doesn't support rich rendering?
243+
244+
Clients should gracefully degrade. At minimum, options can be rendered as a numbered list in plain text. The `description` field is optional, so basic implementations can show just labels.
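
A plain-text fallback of this kind could look roughly like the following. Both function names are hypothetical, and the reply format (numbers separated by commas or spaces) is an assumption made for illustration; the RFD only suggests a numbered list.

```python
def render_plain_text(prompt: str, options: list[dict[str, str]]) -> str:
    """Render the prompt and option labels as a numbered plain-text list."""
    lines = [prompt]
    for index, option in enumerate(options, start=1):
        # Basic clients may show only the label and skip `description`.
        lines.append(f"{index}. {option['name']}")
    return "\n".join(lines)


def parse_numbered_reply(reply: str, options: list[dict[str, str]]) -> list[str]:
    """Map a reply such as "1, 3" back to optionIds.

    Raises ValueError for non-numeric or out-of-range entries.
    """
    selected: list[str] = []
    for token in reply.replace(",", " ").split():
        number = int(token)  # int() raises ValueError on non-numeric input
        if not 1 <= number <= len(options):
            raise ValueError(f"choice out of range: {number}")
        selected.append(options[number - 1]["optionId"])
    return selected


options = [
    {"optionId": "inline", "name": "Inline refactor"},
    {"optionId": "dry-run", "name": "Dry run"},
]
print(render_plain_text("How would you like to proceed?", options))
# How would you like to proceed?
# 1. Inline refactor
# 2. Dry run
```

The client would then place the IDs returned by `parse_numbered_reply` into `optionIds` in its `selected` outcome.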
245+
246+
### How is cancellation handled?
247+
248+
Following the same pattern as `session/request_permission`:
249+
250+
- If the client sends a `session/cancel` notification to cancel an ongoing prompt turn, it **MUST** respond to all pending `session/select` requests with the `cancelled` outcome.
251+
- The agent should handle cancellation gracefully, typically by aborting the current workflow or falling back to a default behavior.
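
One way a client might satisfy the **MUST** above is to track outstanding `session/select` requests per session and answer all of them when it cancels the turn. This is a sketch under assumptions: the `PendingSelects` class and its `send` callback are hypothetical, and real clients would hook this into their own JSON-RPC transport.

```python
from typing import Any, Callable


class PendingSelects:
    """Tracks unanswered `session/select` requests per session (hypothetical helper)."""

    def __init__(self, send: Callable[[dict[str, Any]], None]) -> None:
        self._send = send  # callback that delivers a JSON-RPC response to the agent
        self._pending: dict[str, list[int]] = {}

    def register(self, session_id: str, request_id: int) -> None:
        """Record an incoming request that is waiting on the user."""
        self._pending.setdefault(session_id, []).append(request_id)

    def resolve(self, session_id: str, request_id: int, outcome: dict[str, Any]) -> None:
        """Answer one request with the user's outcome; raises if it is not pending."""
        self._pending[session_id].remove(request_id)
        self._respond(request_id, outcome)

    def cancel_session(self, session_id: str) -> None:
        """Answer every pending request in the session with the `cancelled` outcome."""
        for request_id in self._pending.pop(session_id, []):
            self._respond(request_id, {"outcome": "cancelled"})

    def _respond(self, request_id: int, outcome: dict[str, Any]) -> None:
        self._send({"jsonrpc": "2.0", "id": request_id, "result": {"outcome": outcome}})
```

A client would call `cancel_session` at the point where it sends `session/cancel`, so no request is left without a response.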
252+
253+
### Can the user dismiss the menu without selecting?
254+
255+
If the user dismisses the menu (e.g., clicks outside, presses Escape), the client **SHOULD** treat this as a cancellation and return the `cancelled` outcome. This provides consistent behavior and allows agents to handle the case explicitly.
256+
257+
## Revision history
258+
259+
<!-- If there have been major updates to this RFD, you can include the git revisions and a summary of the changes. -->
