Open
Labels
Roadmap: This feature or functionality should be added to the roadmap.
proposal: If you'd like to propose adding something to the roadmap
Description
Currently, AG-UI only supports text-based messages. In practical agent applications, however, handling multimodal input and output messages, including general files, has become increasingly important. Consider an agent designed for creating presentations: users supply text, images, and video materials, and the agent retrieves relevant information to generate a complete PowerPoint file. Custom extensions can support this as a stopgap, but they fragment the protocol.
I propose standardizing multimodal message support at the protocol layer, so that implementations stay unified and confusion is avoided.
Supported Content:
User Input:
- Text
- Audio
- Images
- General Files
Agent Output (AssistantMessage, UserMessage, ToolMessage):
- Text
- Audio
- Images
- General Files
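To make the lists above concrete, here is a rough sketch of what a unified message shape could look like. Everything in it is an assumption for discussion, not part of AG-UI: the names `ContentPart`, `TextPart`, `BinaryPart`, `MultimodalMessage`, `validatePart`, and `validateMessage`, the `url`/`data` fields, and the rule that exactly one of them is set are all hypothetical.

```typescript
// Hypothetical content-part types; none of these names come from the AG-UI spec.
type TextPart = { type: "text"; text: string };

type BinaryPart = {
  type: "audio" | "image" | "file";
  mimeType: string; // e.g. "image/png", "audio/wav", "application/pdf"
  url?: string; // reference to externally hosted content (assumed field)
  data?: string; // base64-encoded inline content (assumed field)
  filename?: string;
};

type ContentPart = TextPart | BinaryPart;

// Hypothetical message envelope: content becomes a list of parts
// instead of a single string.
interface MultimodalMessage {
  id: string;
  role: "user" | "assistant" | "tool";
  content: ContentPart[];
}

// Returns a list of problems found in one part (empty if the part looks fine).
function validatePart(part: ContentPart): string[] {
  const errors: string[] = [];
  if (part.type === "text") {
    if (part.text.length === 0) errors.push("text part is empty");
    return errors;
  }
  if (part.mimeType.indexOf("/") === -1) {
    errors.push(`invalid mimeType: ${part.mimeType}`);
  }
  // Assumed rule: a binary part carries either a URL or inline data, not both.
  const hasUrl = part.url !== undefined;
  const hasData = part.data !== undefined;
  if (hasUrl === hasData) {
    errors.push("exactly one of url or data must be set");
  }
  return errors;
}

// Collects the problems across every part of a message.
function validateMessage(message: MultimodalMessage): string[] {
  const errors: string[] = [];
  for (const part of message.content) {
    for (const error of validatePart(part)) errors.push(error);
  }
  return errors;
}

// Example: the presentation use case, mixing text, an image, and a file.
const example: MultimodalMessage = {
  id: "msg-1",
  role: "user",
  content: [
    { type: "text", text: "Build a deck from these materials." },
    { type: "image", mimeType: "image/png", url: "https://example.com/chart.png" },
    { type: "file", mimeType: "application/pdf", data: "JVBERi0=", filename: "notes.pdf" },
  ],
};
```

A discriminated union on `type` would let clients switch over parts exhaustively, and a single `BinaryPart` shape keeps audio, images, and general files on one code path. Whether content should be carried inline, by reference, or both is an open design question.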
Status
In progress