Idea: Interactive UI Support for A2A #1294
lukaszkostrzewa started this conversation in Ideas
I'm exploring whether A2A should support a lightweight, standardized way for agents to return interactive UIs (HTML widgets, small apps, etc.) that another agent or client can render. Before going deeper, I’d like to understand whether this fits A2A’s philosophy and whether similar ideas have already been considered.
Why this matters
A2A is already multimodal, but there’s no shared way for agents to send visual or interactive components that can be rendered safely. Many human‑in‑the‑loop workflows benefit from richer interactions than plain text can offer—maps, forms, editors, inspectors, timelines, and other UI elements.
Standards and systems like the OpenAI Apps SDK and MCP Apps show that people increasingly expect chat environments to support interactive components that streamline decision-making and reduce ambiguous back-and-forth.
A2A could benefit even more: the subagent often holds the domain knowledge and knows which UI the human needs next. Unlike MCP Apps, where the host agent must understand the domain and decide which tool or UI to use, A2A keeps this logic within the subagent. But currently there is no portable way for a subagent to express an interactive step.
Why it deserves an extension
While A2A agents can technically return HTML today, there is no shared understanding of how it should be rendered, how it should communicate with the agent, or how clients should negotiate when to use text vs. an interactive UI. An extension could make these workflows consistent and predictable across implementations.
Rough idea
A potential extension would define a new optional feature. If an agent declares support in its Agent Card, and the client requests it via Service Parameters, the agent MAY return HTML during an “input required” state.
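As a concrete (if hypothetical) sketch, the opt-in could ride on the existing extension mechanism in the Agent Card; the URI and params below are placeholders, not a published extension:

```typescript
// Hypothetical Agent Card fragment: the agent opts in to an
// interactive-UI extension. The URI and params are placeholders,
// not a published extension.
const agentCard = {
  name: "hotel-booking-agent",
  capabilities: {
    extensions: [
      {
        uri: "https://example.org/a2a/ext/interactive-ui/v1", // placeholder
        description: "May return HTML widgets during input-required",
        required: false, // clients that ignore it still get plain text
        params: { formats: ["text/html"] }, // placeholder negotiation hint
      },
    ],
  },
};
```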
This could work in two ways: for example, the agent could return the HTML inline as a message part, or reference it by URI (as sketched below). The agent could also include metadata alongside the UI, such as a preferred display size or a title.
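As an illustration of the URI option, assuming the HTML travels as a standard A2A file part, an input-required turn might carry something like the following; the metadata keys are invented for the sketch:

```typescript
// Hypothetical input-required response carrying an HTML widget.
// Part kinds follow A2A's message structure; the metadata keys
// (preferredSize, title) are invented rendering hints.
const statusUpdate = {
  status: { state: "input-required" },
  message: {
    role: "agent",
    parts: [
      // Plain-text fallback for clients without UI support.
      { kind: "text", text: "Please choose a room:" },
      {
        kind: "file",
        file: {
          mimeType: "text/html",
          uri: "https://agent.example/widgets/room-picker.html", // or inline bytes
        },
        metadata: {
          preferredSize: { width: 480, height: 320 }, // invented hint
          title: "Room picker", // invented hint
        },
      },
    ],
  },
};
```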
Clients would render the UI inside a sandboxed iframe, passing host context such as theme, display mode, viewport size, and locale.
The extension would also define a lightweight communication protocol, likely postMessage, and specify how UI events (clicks, form submissions, etc.) map into typed messages sent back to the agent; a minimal sketch of such a client-side host follows below.
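A rough sketch of the client side, assuming the hypothetical shapes above: the host renders the widget in a sandboxed iframe, pushes host context via postMessage, and forwards widget events to the agent as typed messages. Names like sendToAgent and host-context are illustrative.

```typescript
// Minimal client-side host sketch (names and message shapes are
// illustrative, not part of any published A2A extension).
function renderAgentWidget(
  html: string,
  sendToAgent: (event: { type: string; payload: unknown }) => void,
): void {
  const frame = document.createElement("iframe");
  // allow-scripts only: the widget runs JS but gets an opaque origin,
  // with no same-origin access to the host page.
  frame.sandbox.add("allow-scripts");
  frame.srcdoc = html;
  document.body.appendChild(frame);

  frame.addEventListener("load", () => {
    // Push host context into the widget. targetOrigin must be "*"
    // because a sandboxed srcdoc frame has an opaque origin.
    frame.contentWindow?.postMessage(
      {
        type: "host-context", // illustrative message type
        theme: "dark",
        displayMode: "inline",
        viewport: { width: 480, height: 320 },
        locale: navigator.language,
      },
      "*",
    );
  });

  window.addEventListener("message", (event) => {
    if (event.source !== frame.contentWindow) return; // only trust our frame
    // Map widget events (clicks, form submissions, ...) into typed
    // messages for the agent, e.g. a DataPart on the next A2A turn.
    sendToAgent({ type: String(event.data?.type ?? "ui-event"), payload: event.data });
  });
}
```

Omitting allow-same-origin is the point of the sandbox: the widget can only reach the host through this message channel, never the host page's DOM or cookies.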
This is just a rough idea, but I'd love to shape it together with the A2A community. If interactive UI support aligns with A2A's philosophy, it could open up more human-friendly workflows and be worth exploring as a community-driven extension.

Curious whether contributors see this as in-scope, out-of-scope, or something worth experimenting with! 😃
Replies

Check out https://a2ui.org, which should be coming soon. Some of the A2A TSC members gave a talk on this here: https://youtu.be/plFvoMjZR6g?t=1138. They offer a good discussion of UI structure support, why NOT iframes (which MCP UI asks for today), and the direction in which this could lead adoption.