Idea: Interactive UI Support for A2A #1294
lukaszkostrzewa started this conversation in Ideas
I'm exploring whether A2A should support a lightweight, standardized way for agents to return interactive UIs (HTML widgets, small apps, etc.) that another agent or client can render. Before going deeper, I’d like to understand whether this fits A2A’s philosophy and whether similar ideas have already been considered.
Why this matters
A2A is already multimodal, but there’s no shared way for agents to send visual or interactive components that can be rendered safely. Many human‑in‑the‑loop workflows benefit from richer interactions than plain text can offer—maps, forms, editors, inspectors, timelines, and other UI elements.
Standards and systems like the OpenAI Apps SDK and MCP Apps show that people increasingly expect chat environments to support interactive components that streamline decision-making and reduce ambiguous back-and-forth.
A2A could benefit even more: the subagent often holds the domain knowledge and knows which UI the human needs next. Unlike MCP Apps, where the host agent must understand the domain and decide which tool or UI to use, A2A keeps this logic within the subagent. But currently there is no portable way for a subagent to express an interactive step.
Why it deserves an extension
While A2A agents can technically return HTML today, there is no shared understanding of how it should be rendered, how it should communicate with the agent, or how clients should negotiate when to use text vs. an interactive UI. An extension could make these workflows consistent and predictable across implementations.
Rough idea
A potential extension would define a new optional feature. If an agent declares support in its Agent Card, and the client requests it via Service Parameters, the agent MAY return HTML during an “input required” state.
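As a concrete (if hypothetical) sketch, the opt-in could ride on the existing extension mechanism in the Agent Card; the URI and params below are placeholders, not a published extension:

```typescript
// Hypothetical Agent Card fragment: the agent opts in to an
// interactive-UI extension. The URI and params are placeholders,
// not a published extension.
const agentCard = {
  name: "hotel-booking-agent",
  capabilities: {
    extensions: [
      {
        uri: "https://example.org/a2a/ext/interactive-ui/v1", // placeholder
        description: "May return HTML widgets during input-required",
        required: false, // clients that ignore it still get plain text
        params: { formats: ["text/html"] }, // placeholder negotiation hint
      },
    ],
  },
};
```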
This could work in two ways: for example, the agent could return the HTML inline as a message part, or reference it by URI (as sketched below). The agent could also include metadata alongside the UI, such as a preferred display size or a title.
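As an illustration of the URI option, assuming the HTML travels as a standard A2A file part, an input-required turn might carry something like the following; the metadata keys are invented for the sketch:

```typescript
// Hypothetical input-required response carrying an HTML widget.
// Part kinds follow A2A's message structure; the metadata keys
// (preferredSize, title) are invented rendering hints.
const statusUpdate = {
  status: { state: "input-required" },
  message: {
    role: "agent",
    parts: [
      // Plain-text fallback for clients without UI support.
      { kind: "text", text: "Please choose a room:" },
      {
        kind: "file",
        file: {
          mimeType: "text/html",
          uri: "https://agent.example/widgets/room-picker.html", // or inline bytes
        },
        metadata: {
          preferredSize: { width: 480, height: 320 }, // invented hint
          title: "Room picker", // invented hint
        },
      },
    ],
  },
};
```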
Clients would render the UI inside a sandboxed iframe, passing host context such as theme, display mode, viewport size, and locale.
The extension would also define a lightweight communication protocol, likely postMessage, and specify how UI events (clicks, form submissions, etc.) map into typed messages sent back to the agent; a minimal sketch of such a client-side host follows below.
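A rough sketch of the client side, assuming the hypothetical shapes above: the host renders the widget in a sandboxed iframe, pushes host context via postMessage, and forwards widget events to the agent as typed messages. Names like sendToAgent and host-context are illustrative.

```typescript
// Minimal client-side host sketch (names and message shapes are
// illustrative, not part of any published A2A extension).
function renderAgentWidget(
  html: string,
  sendToAgent: (event: { type: string; payload: unknown }) => void,
): void {
  const frame = document.createElement("iframe");
  // allow-scripts only: the widget runs JS but gets an opaque origin,
  // with no same-origin access to the host page.
  frame.sandbox.add("allow-scripts");
  frame.srcdoc = html;
  document.body.appendChild(frame);

  frame.addEventListener("load", () => {
    // Push host context into the widget. targetOrigin must be "*"
    // because a sandboxed srcdoc frame has an opaque origin.
    frame.contentWindow?.postMessage(
      {
        type: "host-context", // illustrative message type
        theme: "dark",
        displayMode: "inline",
        viewport: { width: 480, height: 320 },
        locale: navigator.language,
      },
      "*",
    );
  });

  window.addEventListener("message", (event) => {
    if (event.source !== frame.contentWindow) return; // only trust our frame
    // Map widget events (clicks, form submissions, ...) into typed
    // messages for the agent, e.g. a DataPart on the next A2A turn.
    sendToAgent({ type: String(event.data?.type ?? "ui-event"), payload: event.data });
  });
}
```

Omitting allow-same-origin is the point of the sandbox: the widget can only reach the host through this message channel, never the host page's DOM or cookies.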
This is just a rough idea, but I'd love to shape it together with the A2A community. If interactive UI support aligns with A2A's philosophy, it could open up more human-friendly workflows and be worth exploring as a community-driven extension.

Curious whether contributors see this as in-scope, out-of-scope, or something worth experimenting with! 😃
Replies

Check out https://a2ui.org, which should be coming soon. Some of the A2A TSC members gave a talk on this here: https://youtu.be/plFvoMjZR6g?t=1138. They offer a good discussion of UI structure support, why NOT iframes (which MCP UI asks for today), and the direction in which this could lead adoption.