Commit 87552a9

Authored by jkomoros, claude, and bfollington
fix(llm-dialog): Filter empty text content blocks to prevent API errors (commontoolsinc#2034)
* fix(llm-dialog): Filter empty text content blocks to prevent API errors

## Problem

After enabling tool call caching (d394ec8, Nov 6), users encountered "text content blocks must be non-empty" errors when:

1. Running the demo-setup Execute button multiple times
2. Using omnibot after tool calls
3. Any cached conversation with tool calls

Error from the Anthropic API:

```
AI_APICallError: messages: text content blocks must be non-empty
```

## Root Cause

When Claude responds with tool calls, it can include empty text parts:

```json
{
  "role": "assistant",
  "content": [
    {"type": "text", "text": ""},  // Empty!
    {"type": "tool_use", ...}
  ]
}
```

Before tool call caching was enabled, these messages weren't cached, so the issue didn't surface. After caching was enabled:

1. First request: LLM returns empty text + tool call
2. Tool results added to conversation
3. Response cached with empty text parts
4. Next request: cached messages (with empty text) sent to API
5. API rejects: "text content blocks must be non-empty"

The Anthropic API recently became stricter about rejecting empty text blocks.

## The Fix

**Two-point defense:**

1. **buildAssistantMessage (lines 611-613)**: Filter out empty text parts when constructing assistant messages from LLM responses
2. **Request construction (lines 1069-1082)**: Filter all messages before sending to the API, removing any empty text content that may have been cached

Both filters check:

- Part is type "text"
- Part has a non-null text field
- Text is non-empty after trimming

## Why Both Filters Are Needed

- Filter #1 prevents storing empty text parts initially
- Filter #2 is defense in depth for cached messages that already have empty text

## Testing

Verified the fix resolves:

- ✅ demo-setup Execute button works on first and subsequent runs
- ✅ demo-setup Reset + Execute works without errors
- ✅ Tool calls execute successfully
- ✅ Conversation continues after tool calls
- ✅ No API errors in logs

## Related

- Introduced by: d394ec8 "Allow caching of tool calls"
- Workaround used in patterns: cache-busting with Date.now() timestamps
- Those workarounds can now be removed (though they're harmless)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

* fix(llm-dialog): Handle invalid responses at source instead of just filtering

## Problem with Previous Fix

The previous fix filtered empty text blocks, but could create invalid states:

- Message with only empty text: `content: [{type:"text", text:""}]`
- After filtering: `content: []` ← invalid (except for the final message)

## Root Cause

Vercel AI SDK v5.x bug: it returns empty text blocks before tool calls. The critical scenario is a stream that aborts after the empty text but BEFORE the tool calls:

1. Stream sends: `{type:"text", text:""}`
2. Stream crashes before tool-call events
3. We store: `content: [{type:"text", text:""}]` (no tool calls!)
4. Gets cached
5. Next request: filter creates `content: []`
6. API rejects: "messages must have non-empty content"

## New Solution

### 1. Validate Fresh Responses

Check whether a response has valid content before storing it:

- If invalid (empty or only whitespace), insert a proper error message
- Provides user feedback: "I encountered an error generating a response..."
- Maintains a valid message history

### 2. Enhanced Cached Message Filtering

- Filter empty text blocks from cached messages
- Remove messages that become empty after filtering
- Respect the API rule: the final assistant message CAN be empty

### 3. Remove Unnecessary Filtering

Removed filtering from buildAssistantMessage since we now validate upstream.

## Benefits

- Handles the root cause (aborted/incomplete responses)
- User gets a clear error message instead of a silent failure
- Logs warnings for debugging
- Never sends invalid payloads to the Anthropic API
- Backward compatible with already-cached invalid messages

* fix(llm-dialog): Handle invalid responses and tool result edge cases

## Problems Fixed

1. **Empty/invalid LLM responses**: stream aborts after empty text but before tool calls
2. **Tool results with no output**: tools returning undefined/null broke API validation
3. **Tool result filtering**: cached tool result messages could be removed incorrectly
4. **Mismatched tool calls/results**: executeToolCalls can fail partially

## Root Causes

**Vercel AI SDK bug**: returns empty text blocks before tool calls:

```json
{
  "content": [
    { "type": "text", "text": "" },  // ← SDK bug
    { "type": "tool-call", ... }
  ]
}
```

**Critical scenario**: the stream crashes after the empty text but BEFORE the tool calls:

- Receive: `content: [{type:"text", text:""}]` (no tool calls!)
- Store as-is; gets cached
- Next request: filter creates `content: []` ← invalid!
- API rejects: "messages must have non-empty content"

**Tool result validation**: the Anthropic API **requires** a tool_result for every tool_use:

- If a tool returns undefined/null, the output field was undefined
- Error: "tool_use ids were found without tool_result blocks"

## Solution

### 1. Validate Fresh Responses

Check whether the response has valid content before storing it:

```typescript
function hasValidContent(content): boolean {
  // Returns true only if content has non-empty text OR tool calls OR tool results
}

if (!hasValidContent(llmResult.content)) {
  // Insert a proper error message instead of invalid content
  content: "I encountered an error generating a response. Please try again."
}
```

### 2. Ensure Tool Results Are Always Valid

Convert undefined/null results to an explicit null value:

```typescript
if (toolResult.result === undefined || toolResult.result === null) {
  output = { type: "json", value: null };
}
```

### 3. Never Filter Tool Messages

Tool result messages must always be preserved:

```typescript
if (msg.role === "tool") return true; // Never filter
```

### 4. Validate Tool Call/Result Match

Ensure every tool call has a corresponding result:

```typescript
if (toolResults.length !== toolCallParts.length) {
  // Log detailed error, insert error message
}
```

### 5. Enhanced Cached Message Filtering

- Filter empty text blocks from cached messages (backward compat)
- Remove messages that become empty after filtering (except the final one)
- Respect the Anthropic API rule: the final assistant message CAN be empty

## Benefits

- ✅ Handles the root cause (aborted/incomplete responses)
- ✅ User gets clear error messages instead of silent failure
- ✅ Logs warnings for debugging
- ✅ Never sends invalid payloads to the Anthropic API
- ✅ Backward compatible with already-cached invalid messages
- ✅ Test coverage for all edge cases

## Test Coverage

Added 7 new test cases:

- hasValidContent validates empty/non-empty text
- hasValidContent validates tool calls and results
- createToolResultMessages handles undefined/null results

* chore: Apply deno fmt formatting

* chore: Fix lint warning - remove unused index parameter

* refactor: Use validation instead of filtering for message history

Instead of filtering empty text blocks from within messages and then removing messages that become empty, use the same hasValidContent() validation as for fresh responses. This is cleaner and more consistent.

Before:

- Map over messages to filter empty text blocks from content arrays
- Then filter out messages that became empty after text filtering
- Complex multi-step transformation

After:

- Single validation pass using hasValidContent()
- Remove messages that fail validation entirely
- Same validation logic as fresh responses

Benefits:

- Simpler code (one filter instead of map + filter)
- Consistent validation across fresh and cached messages
- Don't try to "fix" invalid data; just remove invalid messages
- Clearer intent: validate and remove, not transform and patch

* refactor: Remove cached message validation - let legacy data fail

Instead of validating and filtering cached messages, let the API reject invalid legacy data. This significantly simplifies the code.

Before:

- Validate all cached messages with hasValidContent()
- Filter out invalid messages with logging
- Special handling for tool messages and final assistant messages

After:

- Send messages as-is to the API
- If legacy invalid data exists, the API will reject it with a clear error
- The user can start a fresh conversation

This is acceptable because:

1. Going forward, we never write invalid data (validated at source)
2. Legacy conversations with invalid data are rare edge cases
3. A clear API error is better than silent data manipulation
4. Much simpler code

The prevention happens entirely at write time:

- Fresh responses validated with hasValidContent()
- Tool results always have valid output
- Tool call/result counts validated

---------

Co-authored-by: Claude <[email protected]>
Co-authored-by: Ben Follington <[email protected]>
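The validation rule the commit settles on can be sketched in isolation. This is a minimal standalone version, not the project's actual module: the `Part` and `Content` types are simplified stand-ins for the real `BuiltInLLMMessage` shapes, and only the decision logic matches the description above (non-empty trimmed text, a tool call, or a tool result counts as valid).

```typescript
// Simplified stand-ins for the real BuiltInLLMMessage content types.
type Part =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "tool-result"; toolCallId: string; toolName: string; output: unknown };

type Content = string | Part[];

// A message is valid if it has non-empty text, a tool call, or a tool result.
function hasValidContent(content: Content): boolean {
  if (typeof content === "string") return content.trim().length > 0;
  return content.some((part) =>
    part.type === "tool-call" ||
    part.type === "tool-result" ||
    (part.type === "text" && part.text.trim().length > 0)
  );
}

// The failure mode from the commit: a stream that aborts after an empty
// text block leaves content that must not be stored or sent to the API.
const abortedResponse: Content = [{ type: "text", text: "" }];
console.log(hasValidContent(abortedResponse)); // false
console.log(hasValidContent("Hello")); // true
```

Validating at write time like this is what lets the later refactor drop read-side filtering entirely: invalid content is replaced with an explicit error message before it can ever be cached.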
Parent: 310b072

File tree

2 files changed: +181 −11 lines


packages/runner/src/builtins/llm-dialog.ts

Lines changed: 111 additions & 11 deletions

```diff
@@ -593,6 +593,30 @@ function extractToolCallParts(
   );
 }
 
+/**
+ * Validates whether message content is non-empty and valid for the Anthropic API.
+ * Returns true if the content contains at least one non-empty text block or tool call.
+ */
+function hasValidContent(content: BuiltInLLMMessage["content"]): boolean {
+  if (typeof content === "string") {
+    return content.trim().length > 0;
+  }
+
+  if (Array.isArray(content)) {
+    return content.some((part) => {
+      if (part.type === "tool-call" || part.type === "tool-result") {
+        return true;
+      }
+      if (part.type === "text") {
+        return (part as BuiltInLLMTextPart).text?.trim().length > 0;
+      }
+      return false;
+    });
+  }
+
+  return false;
+}
+
 function buildAssistantMessage(
   content: BuiltInLLMMessage["content"],
   toolCallParts: BuiltInLLMToolCallPart[],
@@ -664,17 +688,29 @@ async function executeToolCalls(
 function createToolResultMessages(
   results: ToolCallExecutionResult[],
 ): BuiltInLLMMessage[] {
-  return results.map((toolResult) => ({
-    role: "tool",
-    content: [{
-      type: "tool-result",
-      toolCallId: toolResult.id,
-      toolName: toolResult.toolName || "unknown",
-      output: toolResult.error
-        ? { type: "error-text", value: toolResult.error }
-        : toolResult.result,
-    }],
-  }));
+  return results.map((toolResult) => {
+    // Ensure output is never undefined/null - Anthropic API requires valid tool_result
+    // for every tool_use, even if the tool returns nothing
+    let output: any;
+    if (toolResult.error) {
+      output = { type: "error-text", value: toolResult.error };
+    } else if (toolResult.result === undefined || toolResult.result === null) {
+      // Tool returned nothing - use explicit null value
+      output = { type: "json", value: null };
+    } else {
+      output = toolResult.result;
+    }
+
+    return {
+      role: "tool",
+      content: [{
+        type: "tool-result",
+        toolCallId: toolResult.id,
+        toolName: toolResult.toolName || "unknown",
+        output,
+      }],
+    };
+  });
 }
 
 export const llmDialogTestHelpers = {
@@ -684,6 +720,7 @@ export const llmDialogTestHelpers = {
   extractToolCallParts,
   buildAssistantMessage,
   createToolResultMessages,
+  hasValidContent,
 };
 
 /**
@@ -1080,6 +1117,34 @@ async function startRequest(
 
   resultPromise
     .then(async (llmResult) => {
+      // Validate that the response has valid content
+      if (!hasValidContent(llmResult.content)) {
+        // LLM returned empty or invalid content (e.g., stream aborted mid-flight,
+        // or AI SDK bug with empty text blocks). Insert a proper error message
+        // instead of storing invalid content.
+        logger.warn("LLM returned invalid/empty content, adding error message");
+        const errorMessage = {
+          [ID]: { llmDialog: { message: cause, id: crypto.randomUUID() } },
+          role: "assistant",
+          content:
+            "I encountered an error generating a response. Please try again.",
+        } satisfies BuiltInLLMMessage & { [ID]: unknown };
+
+        await safelyPerformUpdate(
+          runtime,
+          pending,
+          internal,
+          requestId,
+          (tx) => {
+            messagesCell.withTx(tx).push(
+              errorMessage as Schema<typeof LLMMessageSchema>,
+            );
+            pending.withTx(tx).set(false);
+          },
+        );
+        return;
+      }
+
       // Extract tool calls from content if it's an array
       const hasToolCalls = Array.isArray(llmResult.content) &&
         llmResult.content.some((part) => part.type === "tool-call");
@@ -1098,6 +1163,41 @@ async function startRequest(
         toolCatalog,
         toolCallParts,
       );
+
+      // Validate that we have a result for every tool call with matching IDs
+      const toolCallIds = new Set(toolCallParts.map((p) => p.toolCallId));
+      const resultIds = new Set(toolResults.map((r) => r.id));
+      const mismatch = toolResults.length !== toolCallParts.length ||
+        !toolCallParts.every((p) => resultIds.has(p.toolCallId));
+
+      if (mismatch) {
+        logger.error(
+          `Tool execution mismatch: ${toolCallParts.length} calls [${
+            Array.from(toolCallIds)
+          }] but ${toolResults.length} results [${Array.from(resultIds)}]`,
+        );
+        // Add error message instead of invalid partial results
+        const errorMessage = {
+          [ID]: { llmDialog: { message: cause, id: crypto.randomUUID() } },
+          role: "assistant",
+          content: "Some tool calls failed to execute. Please try again.",
+        } satisfies BuiltInLLMMessage & { [ID]: unknown };
+
+        await safelyPerformUpdate(
+          runtime,
+          pending,
+          internal,
+          requestId,
+          (tx) => {
+            messagesCell.withTx(tx).push(
+              errorMessage as Schema<typeof LLMMessageSchema>,
+            );
+            pending.withTx(tx).set(false);
+          },
+        );
+        return;
+      }
+
       const newMessages: BuiltInLLMMessage[] = [
         assistantMessage,
         ...createToolResultMessages(toolResults),
```
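The output normalization inside createToolResultMessages can also be sketched standalone. This is a simplified version for illustration only: `normalizeOutput` is a hypothetical helper name, and `ExecResult`/`ToolOutput` are reduced stand-ins for the real `ToolCallExecutionResult` and output types.

```typescript
// Simplified stand-ins for the real execution-result and output types.
type ToolOutput =
  | { type: "error-text"; value: string }
  | { type: "json"; value: unknown };

interface ExecResult {
  id: string;
  toolName?: string;
  result?: unknown;
  error?: string;
}

// The Anthropic API requires a tool_result for every tool_use, so a tool
// that returns nothing must still produce a concrete output value.
function normalizeOutput(r: ExecResult): ToolOutput {
  if (r.error) return { type: "error-text", value: r.error };
  if (r.result === undefined || r.result === null) {
    // Tool returned nothing: send an explicit JSON null, never undefined.
    return { type: "json", value: null };
  }
  return r.result as ToolOutput;
}

// A tool with no return value still yields a valid output block.
console.log(normalizeOutput({ id: "call-1", toolName: "empty" }));
```

The design point is that absence is made explicit: `undefined` would be dropped during serialization and trigger "tool_use ids were found without tool_result blocks", while `{ type: "json", value: null }` survives the round trip.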

packages/runner/test/llm-dialog-helpers.test.ts

Lines changed: 70 additions & 0 deletions

```diff
@@ -9,6 +9,7 @@ const {
   extractToolCallParts,
   buildAssistantMessage,
   createToolResultMessages,
+  hasValidContent,
 } = llmDialogTestHelpers;
 
 Deno.test("createCharmToolDefinitions slugifies charm names and returns tool metadata", () => {
@@ -112,3 +113,72 @@ Deno.test("createToolResultMessages converts execution results into tool message
   assertEquals(failurePart.toolName, "failing");
   assertEquals(failurePart.output, { type: "error-text", value: "boom" });
 });
+
+Deno.test("hasValidContent returns true for non-empty text", () => {
+  assert(hasValidContent("Hello"));
+  assert(hasValidContent([{ type: "text", text: "Hello" }]));
+});
+
+Deno.test("hasValidContent returns false for empty text", () => {
+  assert(!hasValidContent(""));
+  assert(!hasValidContent(" \n "));
+  assert(!hasValidContent([{ type: "text", text: "" }]));
+  assert(!hasValidContent([{ type: "text", text: " " }]));
+});
+
+Deno.test("hasValidContent returns true for tool calls", () => {
+  const content: BuiltInLLMMessage["content"] = [
+    { type: "text", text: "" },
+    { type: "tool-call", toolCallId: "1", toolName: "test", input: {} },
+  ];
+  assert(hasValidContent(content));
+});
+
+Deno.test("hasValidContent returns true for tool results", () => {
+  const content: BuiltInLLMMessage["content"] = [
+    {
+      type: "tool-result",
+      toolCallId: "1",
+      toolName: "test",
+      output: { type: "json", value: null },
+    },
+  ];
+  assert(hasValidContent(content));
+});
+
+Deno.test("hasValidContent returns false for only empty text parts", () => {
+  const content: BuiltInLLMMessage["content"] = [
+    { type: "text", text: "" },
+    { type: "text", text: " " },
+  ];
+  assert(!hasValidContent(content));
+});
+
+Deno.test("createToolResultMessages handles undefined result with explicit null", () => {
+  const messages = createToolResultMessages([{
+    id: "call-1",
+    toolName: "empty",
+    result: undefined,
+  }]);
+
+  assertEquals(messages.length, 1);
+  assertEquals(messages[0].role, "tool");
+  assertEquals(messages[0].content?.[0], {
+    type: "tool-result",
+    toolCallId: "call-1",
+    toolName: "empty",
+    output: { type: "json", value: null },
+  });
+});
+
+Deno.test("createToolResultMessages handles null result with explicit null", () => {
+  const messages = createToolResultMessages([{
+    id: "call-1",
+    toolName: "empty",
+    result: null,
+  }]);
+
+  assertEquals(messages.length, 1);
+  const outputPart = messages[0].content?.[0] as any;
+  assertEquals(outputPart.output, { type: "json", value: null });
+});
```

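The tool call/result matching check in the runner diff reduces to a standalone predicate. This sketch uses simplified hypothetical types (`Call`, `Result`) rather than the project's real `BuiltInLLMToolCallPart` and `ToolCallExecutionResult` shapes.

```typescript
// Simplified stand-ins for tool-call parts and execution results.
interface Call { toolCallId: string }
interface Result { id: string }

// True only when counts agree and every call id has a matching result id,
// mirroring the mismatch guard added before createToolResultMessages runs.
function resultsMatchCalls(calls: Call[], results: Result[]): boolean {
  const resultIds = new Set(results.map((r) => r.id));
  return results.length === calls.length &&
    calls.every((c) => resultIds.has(c.toolCallId));
}

console.log(resultsMatchCalls([{ toolCallId: "a" }], [{ id: "a" }])); // true
console.log(resultsMatchCalls([{ toolCallId: "a" }], [])); // false
```

Checking both the count and the id mapping catches partial failures of executeToolCalls in either direction: dropped results and results attributed to the wrong call.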