Commit e04d5bf

feat: Add support for multi-modal messages
1 parent 0a3019c commit e04d5bf

22 files changed (+791, -470 lines)

CHANGELOG.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+# Changelog
+
+## v0.0.40 (Unreleased)
+
+### Highlights
+- Added end-to-end multimodal user message support across TypeScript and Python SDKs, LangGraph integrations, and docs, including new `InputContent` schemas and example agents.
+- Introduced runtime improvements such as `connectAgent`, safer cloning, and run lifecycle fixes, plus expanded event metadata (`parentRunId`, embedded inputs).
+- Delivered a new `compactEvents` utility for consolidating streaming deltas and hardened backwards compatibility layers for legacy clients.
+- Updated integrations (LangGraph, Vercel AI SDK, Mastra, Google ADK) to translate multimodal content and align with the latest protocol expectations.
+- Bumped package versions to the `0.0.40` pre-release series (TypeScript) and `0.2.0a0` (Python) to ship these capabilities.
+
+### TypeScript SDK
+- Extended `UserMessage` to accept multimodal `InputContent[]`, added `TextInputContent`/`BinaryInputContent` schemas, and exported the associated types for consumers.
+- Added optional `parentRunId` and embedded `input` payloads to `RunAgentInput` and `RunStartedEvent` schemas, plus surfaced a dedicated `AGUIConnectNotImplementedError`.
+- Refined `AbstractAgent` by making `run` protected, introducing `connectAgent`, tracking `isRunning`, and ensuring `clone` copies agent state; adjusted `HttpAgent`, Mastra, and Vercel agents to clone safely.
+- Updated `defaultApplyEvents` to handle non-string message content, merge `runStarted.input.messages` into local state, and improved event processing tests (`run-started-input`, cloning, multimodal, backwards compatibility).
+- Added and exported a `compactEvents` helper (with comprehensive tests) for consolidating streaming text/tool call deltas before replaying them to subscribers.
+- Improved legacy converters and integrations to flatten multimodal content when talking to text-only surfaces, guarding the behavior with new backwards compatibility tests.
+- Updated the React SDK chat surface to render multimodal user messages, including inline attachment previews and safer clipboard handling when no text is present.
+- Incremented package versions for `@ag-ui/core`, `@ag-ui/client`, CLI, encoder, and proto packages to `0.0.40-alpha.6`.
+
+### Python SDK
+- Introduced `TextInputContent` and `BinaryInputContent` models, allowing `UserMessage` instances to carry ordered multimodal content alongside traditional strings.
+- Added optional `parent_run_id` and `input` fields to `RunStartedEvent` plus `parent_run_id` to `RunAgentInput`, mirroring TypeScript schema changes.
+- Relaxed the base model configuration to allow extra fields for backwards compatibility and added validation to ensure binary payloads provide an ID, URL, or data source.
+- Expanded the test suite to cover multimodal serialization, binary payload validation, run input parsing, and extra-field tolerance.
+- Documented multimodal usage in the Python README and bumped the `ag-ui-protocol` package to `0.2.0a0`.
+
+### Integrations
+- **LangGraph (Python & TypeScript):** Added bidirectional converters for multimodal content, new vision-friendly example agents, updated tests, and bumped the package to `0.0.18a0` with matching dependency pins.
+- **Google ADK Middleware:** Flattened multimodal message content into text parts when translating to ADK events and updated helper utilities accordingly.
+- **Mastra:** Preserved constructor config when cloning agents and flattened AG-UI messages into the formats expected by Mastra clients.
+- **Vercel AI SDK:** Added safe cloning, converted user content into SDK-compatible parts, and ensured multimodal inputs degrade gracefully to text.
+- **General:** Updated converters and utilities across integrations to handle `InputContent` arrays without breaking existing text-only flows.
+
+### Documentation
+- Updated core concepts and SDK reference docs to describe multimodal user messages, the new input content schemas, and extended `RunStartedEvent` properties.
+- Documented `connectAgent`, `connect`, and the replayable `events$` stream on `AbstractAgent`, clarifying how persistent connections are implemented.
+- Marked the multimodal messages specification as implemented (October 16, 2025) and added README snippets showing multimodal message creation in both SDKs.
+
+### Miscellaneous
+- Added an explicit `@ts-expect-error` in the A2A middleware noting the intentional call to a protected method until a public API exists.
+- Updated Poetry lockfiles and dependency pins to align with the new Python package versions.
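
The changelog describes ordered multimodal content and a rule that binary payloads must provide an ID, URL, or data source (stated there for the Python SDK). The following is a minimal sketch of that shape and rule in TypeScript. The type definitions are local stand-ins that reuse the field names visible elsewhere in this commit; `hasBinarySource` and `validateContent` are hypothetical helper names, not SDK exports.

```typescript
// Local stand-in types (assumption): field names follow those used in this
// commit's tests; the real SDK schemas may carry more fields or constraints.
type TextInputContent = { type: "text"; text: string };
type BinaryInputContent = {
  type: "binary";
  mimeType: string;
  id?: string;
  url?: string;
  data?: string; // assumed to be base64-encoded inline data
  filename?: string;
};
type InputContent = TextInputContent | BinaryInputContent;

// Hypothetical helper: a binary part is usable only if it points at something.
function hasBinarySource(part: BinaryInputContent): boolean {
  return Boolean(part.id || part.url || part.data);
}

// Hypothetical helper: collect one error message per binary part with no source.
function validateContent(content: InputContent[]): string[] {
  const errors: string[] = [];
  content.forEach((part, index) => {
    if (part.type === "binary" && !hasBinarySource(part)) {
      errors.push(`content[${index}]: binary part needs an id, url, or data`);
    }
  });
  return errors;
}

// An ordered multimodal user message body: text first, then an image by URL.
const userContent: InputContent[] = [
  { type: "text", text: "Here is the design" },
  { type: "binary", mimeType: "image/png", url: "https://example.com/image.png" },
];

// A binary part with only a MIME type fails the rule.
const invalidContent: InputContent[] = [{ type: "binary", mimeType: "image/png" }];

console.log(validateContent(userContent));
console.log(validateContent(invalidContent));
```

Returning a list of messages rather than throwing is a choice made for the sketch only; it keeps the example easy to run and inspect.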

apps/angular/demo-server/package.json

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
     "start": "node --env-file=.env --loader tsx src/index.ts"
   },
   "dependencies": {
-    "@ag-ui/client": "0.0.40-alpha.6",
+    "@ag-ui/client": "0.0.40-alpha.7",
     "@ag-ui/langgraph": "^0.0.11",
     "@copilotkitnext/demo-agents": "workspace:^",
     "@copilotkitnext/runtime": "workspace:^",

apps/angular/storybook/package.json

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
     "storybook:build": "ng run storybook-angular:build-storybook"
   },
   "dependencies": {
-    "@ag-ui/client": "0.0.40-alpha.6",
+    "@ag-ui/client": "0.0.40-alpha.7",
     "@angular/animations": "^18.2.0",
     "@angular/common": "^18.2.0",
     "@angular/compiler": "^18.2.0",

apps/react/demo/package.json

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
     "lint": "next lint"
   },
   "dependencies": {
-    "@ag-ui/client": "0.0.40-alpha.6",
+    "@ag-ui/client": "0.0.40-alpha.7",
     "@copilotkitnext/agent": "workspace:*",
     "@copilotkitnext/core": "workspace:*",
     "@copilotkitnext/react": "workspace:*",

packages/agent/package.json

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@
     "vitest": "^3.0.5"
   },
   "dependencies": {
-    "@ag-ui/client": "0.0.40-alpha.6",
+    "@ag-ui/client": "0.0.40-alpha.7",
     "@ai-sdk/anthropic": "^2.0.22",
     "@ai-sdk/google": "^2.0.17",
     "@ai-sdk/openai": "^2.0.42",

packages/agent/src/__tests__/utils.test.ts

Lines changed: 52 additions & 0 deletions
@@ -90,6 +90,58 @@ describe("convertMessagesToVercelAISDKMessages", () => {
     ]);
   });
 
+  it("should convert user messages with binary content", () => {
+    const messages: Message[] = [
+      {
+        id: "1",
+        role: "user",
+        content: [
+          { type: "text", text: "Here is the design" },
+          {
+            type: "binary",
+            mimeType: "image/png",
+            url: "https://example.com/image.png",
+            filename: "image.png",
+          },
+        ],
+      },
+    ];
+
+    const result = convertMessagesToVercelAISDKMessages(messages);
+    const content = result[0].content;
+
+    expect(Array.isArray(content)).toBe(true);
+    if (Array.isArray(content)) {
+      expect(content[0]).toEqual({ type: "text", text: "Here is the design" });
+      expect(content[1]).toMatchObject({
+        type: "file",
+        mediaType: "image/png",
+        filename: "image.png",
+      });
+    }
+  });
+
+  it("should fall back to placeholders when binary content has no data", () => {
+    const messages: Message[] = [
+      {
+        id: "1",
+        role: "user",
+        content: [
+          {
+            type: "binary",
+            mimeType: "application/octet-stream",
+            id: "file-1",
+          },
+        ],
+      },
+    ];
+
+    const result = convertMessagesToVercelAISDKMessages(messages);
+    expect(result[0].content).toEqual([
+      { type: "text", text: "[Attachment: file-1]" },
+    ]);
+  });
+
   it("should convert assistant messages with text content", () => {
     const messages: Message[] = [
       {

packages/agent/src/index.ts

Lines changed: 77 additions & 1 deletion
@@ -1,6 +1,7 @@
 import {
   AbstractAgent,
   BaseEvent,
+  BinaryInputContent,
   RunAgentInput,
   EventType,
   Message,
@@ -22,9 +23,11 @@ import {
   AssistantModelMessage,
   UserModelMessage,
   ToolModelMessage,
+  FilePart,
   ToolCallPart,
   ToolResultPart,
   TextPart,
+  UserContent,
   tool as createVercelAISDKTool,
   ToolChoice,
   ToolSet,
@@ -232,6 +235,79 @@ export function defineTool<TParameters extends z.ZodTypeAny>(config: {
   };
 }
 
+function convertBinaryInputContentToFilePart(content: BinaryInputContent): FilePart | null {
+  if (content.url) {
+    try {
+      return {
+        type: "file",
+        data: new URL(content.url),
+        mediaType: content.mimeType,
+        filename: content.filename,
+      } satisfies FilePart;
+    } catch {
+      return {
+        type: "file",
+        data: content.url,
+        mediaType: content.mimeType,
+        filename: content.filename,
+      } satisfies FilePart;
+    }
+  }
+
+  if (content.data) {
+    return {
+      type: "file",
+      data: content.data,
+      mediaType: content.mimeType,
+      filename: content.filename,
+    } satisfies FilePart;
+  }
+
+  return null;
+}
+
+function convertUserMessageContent(content: Message["content"]): UserContent {
+  if (!content) {
+    return "";
+  }
+
+  if (typeof content === "string") {
+    return content;
+  }
+
+  if (content.every((part) => part.type === "text")) {
+    return content.map((part) => part.text).join("\n\n");
+  }
+
+  const parts: Array<TextPart | FilePart> = [];
+
+  for (const part of content) {
+    if (part.type === "text") {
+      if (part.text.length > 0) {
+        parts.push({ type: "text", text: part.text });
+      }
+      continue;
+    }
+
+    const filePart = convertBinaryInputContentToFilePart(part);
+    if (filePart) {
+      parts.push(filePart);
+    } else {
+      const label = part.filename ?? part.id ?? part.mimeType;
+      parts.push({
+        type: "text",
+        text: `[Attachment: ${label}]`,
+      });
+    }
+  }
+
+  if (parts.length === 0) {
+    return "";
+  }
+
+  return parts;
+}
+
 /**
  * Converts AG-UI messages to Vercel AI SDK ModelMessage format
  */
@@ -260,7 +336,7 @@ export function convertMessagesToVercelAISDKMessages(messages: Message[]): Model
     } else if (message.role === "user") {
       const userMsg: UserModelMessage = {
         role: "user",
-        content: convertUserMessageContent(message.content),
+        content: message.content || "",
       };
       result.push(userMsg);
     } else if (message.role === "tool") {
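
To make the three outcomes of the user-content conversion in this file easier to see (plain string passthrough, all-text parts joined with a blank line, and mixed parts with a placeholder fallback), here is a condensed, dependency-free sketch. It is not the code from the commit: the types are local stand-ins for the `@ag-ui/client` and `ai` types, `toUserContent` is a hypothetical name, and URLs are kept as strings instead of being wrapped in `new URL(...)`.

```typescript
// Local stand-in types (assumption); the commit imports the real ones.
type TextInput = { type: "text"; text: string };
type BinaryInput = {
  type: "binary";
  mimeType: string;
  id?: string;
  url?: string;
  data?: string;
  filename?: string;
};
type Input = TextInput | BinaryInput;
type TextPart = { type: "text"; text: string };
type FilePart = { type: "file"; data: string; mediaType: string; filename?: string };

// Hypothetical condensed version of the conversion shown in the diff.
function toUserContent(
  content: string | Input[] | undefined,
): string | Array<TextPart | FilePart> {
  if (!content) return "";
  if (typeof content === "string") return content;

  // All-text content collapses to a single string joined by blank lines.
  const texts = content.filter((p): p is TextInput => p.type === "text");
  if (texts.length === content.length) {
    return texts.map((p) => p.text).join("\n\n");
  }

  const parts: Array<TextPart | FilePart> = [];
  for (const part of content) {
    if (part.type === "text") {
      if (part.text.length > 0) parts.push({ type: "text", text: part.text });
      continue;
    }
    // Prefer a URL, then inline data; otherwise emit a readable placeholder.
    const source = part.url || part.data;
    if (source) {
      parts.push({
        type: "file",
        data: source,
        mediaType: part.mimeType,
        filename: part.filename,
      });
    } else {
      const label = part.filename ?? part.id ?? part.mimeType;
      parts.push({ type: "text", text: `[Attachment: ${label}]` });
    }
  }
  return parts.length === 0 ? "" : parts;
}

const joined = toUserContent([
  { type: "text", text: "a" },
  { type: "text", text: "b" },
]);
const mixed = toUserContent([
  { type: "text", text: "Here is the design" },
  { type: "binary", mimeType: "image/png", url: "https://example.com/image.png" },
]);
const placeholder = toUserContent([
  { type: "binary", mimeType: "application/octet-stream", id: "file-1" },
]);

console.log(joined, mixed, placeholder);
```

The placeholder case matches the expectation in the accompanying test (`[Attachment: file-1]`): a binary part that carries only an `id` cannot be sent as a file, so it degrades to text instead of being dropped.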

packages/angular/package.json

Lines changed: 2 additions & 2 deletions
@@ -31,8 +31,8 @@
     "test:watch": "vitest --watch"
   },
   "dependencies": {
-    "@ag-ui/client": "0.0.40-alpha.6",
-    "@ag-ui/core": "0.0.40-alpha.6",
+    "@ag-ui/client": "0.0.40-alpha.7",
+    "@ag-ui/core": "0.0.40-alpha.7",
     "@copilotkitnext/core": "workspace:*",
     "@copilotkitnext/shared": "workspace:*",
     "clsx": "^2.1.1",

packages/angular/src/lib/components/chat/copilot-chat-user-message-renderer.ts

Lines changed: 90 additions & 1 deletion
@@ -7,6 +7,11 @@ import {
 } from "@angular/core";
 import { CommonModule } from "@angular/common";
 import { cn } from "../../utils";
+import type { BinaryInputContent, InputContent } from "@ag-ui/client";
+import {
+  getUserMessageBinaryContents,
+  getUserMessageTextContent,
+} from "@copilotkitnext/shared";
 
 @Component({
   selector: "copilot-chat-user-message-renderer",
@@ -17,10 +22,54 @@ import { cn } from "../../utils";
   host: {
     "[class]": "computedClass()",
   },
-  template: `{{ content() }}`,
+  template: `
+    @if (textContent()) {
+      <span>{{ textContent() }}</span>
+    }
+    @if (attachments().length) {
+      <div [class]="attachmentsClass()">
+        @for (attachment of attachments(); track trackAttachment(attachment, $index)) {
+          <ng-container *ngIf="isImage(attachment); else fileTemplate">
+            <figure class="flex flex-col gap-1">
+              <img
+                [src]="resolveSource(attachment)"
+                [alt]="attachment.filename || attachment.id || attachment.mimeType"
+                class="max-h-64 rounded-lg border border-border object-contain"
+              />
+              @if (attachment.filename || attachment.id) {
+                <figcaption class="text-xs text-muted-foreground">
+                  {{ attachment.filename || attachment.id }}
+                </figcaption>
+              }
+            </figure>
+          </ng-container>
+          <ng-template #fileTemplate>
+            <div class="rounded-md border border-dashed border-border bg-muted/70 px-3 py-2 text-xs text-muted-foreground">
+              {{ attachment.filename || attachment.id || 'Attachment' }}
+              <span class="block text-[10px] uppercase tracking-wide text-muted-foreground/70">
+                {{ attachment.mimeType }}
+              </span>
+              @if (resolveSource(attachment) && !isImage(attachment)) {
+                <a
+                  [href]="resolveSource(attachment)"
+                  target="_blank"
+                  rel="noreferrer"
+                  class="mt-1 block text-xs text-primary underline"
+                >
+                  Open
+                </a>
+              }
+            </div>
+          </ng-template>
+        }
+      </div>
+    }
+  `,
 })
 export class CopilotChatUserMessageRenderer {
   readonly content = input<string>("");
+  readonly contents = input<InputContent[]>([]);
+  readonly attachmentsInput = input<BinaryInputContent[] | undefined>(undefined, { alias: "attachments" });
   readonly inputClass = input<string | undefined>();
 
   readonly computedClass = computed(() => {
@@ -29,4 +78,44 @@ export class CopilotChatUserMessageRenderer {
       this.inputClass()
     );
   });
+
+  readonly textContent = computed(() => {
+    const explicit = this.content();
+    if (explicit && explicit.length > 0) {
+      return explicit;
+    }
+    return getUserMessageTextContent(this.contents());
+  });
+
+  readonly attachments = computed(() => {
+    const provided = this.attachmentsInput() ?? [];
+    if (provided.length > 0) {
+      return provided;
+    }
+    return getUserMessageBinaryContents(this.contents());
+  });
+
+  readonly attachmentsClass = computed(() =>
+    this.textContent().trim().length > 0
+      ? "mt-3 flex flex-col gap-2"
+      : "flex flex-col gap-2",
+  );
+
+  resolveSource(attachment: BinaryInputContent): string | null {
+    if (attachment.url) {
+      return attachment.url;
+    }
+    if (attachment.data) {
+      return `data:${attachment.mimeType};base64,${attachment.data}`;
+    }
+    return null;
+  }
+
+  isImage(attachment: BinaryInputContent): boolean {
+    return attachment.mimeType.startsWith("image/") && !!this.resolveSource(attachment);
+  }
+
+  trackAttachment(attachment: BinaryInputContent, index: number): string {
+    return attachment.id ?? attachment.url ?? attachment.filename ?? index.toString();
+  }
 }
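
The renderer relies on two helpers imported from `@copilotkitnext/shared` whose bodies are not part of the hunks shown here. The sketch below illustrates one plausible way such helpers could split an `InputContent[]` into text and attachments, together with a standalone version of the data-URL fallback used by `resolveSource`. Everything in it is an assumption made for illustration: `extractText`, `extractBinary`, and `resolveAttachmentSource` are hypothetical names, and joining text parts with a newline is a guess, not the shared package's documented behavior.

```typescript
// Local stand-in types (assumption) mirroring the fields used by the renderer.
type TextInputContent = { type: "text"; text: string };
type BinaryInputContent = {
  type: "binary";
  mimeType: string;
  id?: string;
  url?: string;
  data?: string; // assumed base64, as implied by the data: URL in the renderer
  filename?: string;
};
type InputContent = TextInputContent | BinaryInputContent;

// Hypothetical: concatenate all text parts, preserving their order.
function extractText(contents: InputContent[]): string {
  return contents
    .filter((part): part is TextInputContent => part.type === "text")
    .map((part) => part.text)
    .join("\n");
}

// Hypothetical: keep only the binary parts, preserving their order.
function extractBinary(contents: InputContent[]): BinaryInputContent[] {
  return contents.filter(
    (part): part is BinaryInputContent => part.type === "binary",
  );
}

// Same precedence as resolveSource in the diff: URL first, then inline data.
function resolveAttachmentSource(attachment: BinaryInputContent): string | null {
  if (attachment.url) return attachment.url;
  if (attachment.data) {
    return `data:${attachment.mimeType};base64,${attachment.data}`;
  }
  return null;
}

const contents: InputContent[] = [
  { type: "text", text: "Look at this" },
  { type: "binary", mimeType: "image/png", data: "AAAA", filename: "pixel.png" },
  { type: "binary", mimeType: "application/pdf", id: "file-1" },
];

console.log(extractText(contents));
console.log(extractBinary(contents).map(resolveAttachmentSource));
```

An attachment that has only an `id` resolves to `null`, which is why the template renders it as a labeled file box with no image preview and no "Open" link.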
