Commit 87552a9

Authored by jkomoros, claude, and bfollington
fix(llm-dialog): Filter empty text content blocks to prevent API errors (commontoolsinc#2034)
* fix(llm-dialog): Filter empty text content blocks to prevent API errors

## Problem

After enabling tool call caching (d394ec8, Nov 6), users encountered "text content blocks must be non-empty" errors when:

1. Running the demo-setup Execute button multiple times
2. Using omnibot after tool calls
3. Any cached conversation with tool calls

Error from the Anthropic API:

```
AI_APICallError: messages: text content blocks must be non-empty
```

## Root Cause

When Claude responds with tool calls, it can include empty text parts:

```json
{
  "role": "assistant",
  "content": [
    {"type": "text", "text": ""},  // Empty!
    {"type": "tool_use", ...}
  ]
}
```

Before tool call caching was enabled, these messages weren't cached, so the issue didn't surface. After caching was enabled:

1. First request: LLM returns empty text + tool call
2. Tool results added to conversation
3. Response cached with empty text parts
4. Next request: cached messages (with empty text) sent to API
5. API rejects: "text content blocks must be non-empty"

The Anthropic API recently became stricter about rejecting empty text blocks.

## The Fix

**Two-point defense:**

1. **buildAssistantMessage (lines 611-613)**: Filter out empty text parts when constructing assistant messages from LLM responses
2. **Request construction (lines 1069-1082)**: Filter all messages before sending to the API, removing any empty text content that may have been cached

Both filters check:

- Part is type "text"
- Part has a non-null text field
- Text is non-empty after trimming

## Why Both Filters Are Needed

- Filter #1 prevents storing empty text parts initially
- Filter #2 is defense in depth for cached messages that already have empty text

## Testing

Verified the fix resolves:

- ✅ demo-setup Execute button works on first and subsequent runs
- ✅ demo-setup Reset + Execute works without errors
- ✅ Tool calls execute successfully
- ✅ Conversation continues after tool calls
- ✅ No API errors in logs

## Related

- Introduced by: d394ec8 "Allow caching of tool calls"
- Workaround used in patterns: cache-busting with Date.now() timestamps
- Those workarounds can now be removed (though they're harmless)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

* fix(llm-dialog): Handle invalid responses at source instead of just filtering

## Problem with Previous Fix

The previous fix filtered empty text blocks, but could create invalid states:

- Message with only empty text: `content: [{type:"text", text:""}]`
- After filtering: `content: []` ← invalid (except for the final message)

## Root Cause

Vercel AI SDK v5.x bug: it returns empty text blocks before tool calls. The critical scenario is a stream that aborts after the empty text but BEFORE the tool calls:

1. Stream sends: `{type:"text", text:""}`
2. Stream crashes before tool-call events
3. We store: `content: [{type:"text", text:""}]` (no tool calls!)
4. Gets cached
5. Next request: filter creates `content: []`
6. API rejects: "messages must have non-empty content"

## New Solution

### 1. Validate Fresh Responses

Check whether a response has valid content before storing it:

- If invalid (empty or only whitespace), insert a proper error message
- Provides user feedback: "I encountered an error generating a response..."
- Maintains a valid message history

### 2. Enhanced Cached Message Filtering

- Filter empty text blocks from cached messages
- Remove messages that become empty after filtering
- Respect the API rule: the final assistant message CAN be empty

### 3. Remove Unnecessary Filtering

Removed filtering from buildAssistantMessage since we now validate upstream.

## Benefits

- Handles the root cause (aborted/incomplete responses)
- User gets a clear error message instead of a silent failure
- Logs warnings for debugging
- Never sends invalid payloads to the Anthropic API
- Backward compatible with already-cached invalid messages

* fix(llm-dialog): Handle invalid responses and tool result edge cases

## Problems Fixed

1. **Empty/invalid LLM responses**: stream aborts after empty text but before tool calls
2. **Tool results with no output**: tools returning undefined/null broke API validation
3. **Tool result filtering**: cached tool result messages could be removed incorrectly
4. **Mismatched tool calls/results**: executeToolCalls can fail partially

## Root Causes

**Vercel AI SDK bug**: returns empty text blocks before tool calls:

```json
{
  "content": [
    { "type": "text", "text": "" },  // ← SDK bug
    { "type": "tool-call", ... }
  ]
}
```

**Critical scenario**: the stream crashes after the empty text but BEFORE the tool calls:

- Receive: `content: [{type:"text", text:""}]` (no tool calls!)
- Store as-is; gets cached
- Next request: filter creates `content: []` ← invalid!
- API rejects: "messages must have non-empty content"

**Tool result validation**: the Anthropic API **requires** a tool_result for every tool_use:

- If a tool returns undefined/null, the output field was undefined
- Error: "tool_use ids were found without tool_result blocks"

## Solution

### 1. Validate Fresh Responses

Check whether the response has valid content before storing it:

```typescript
function hasValidContent(content): boolean {
  // Returns true only if content has non-empty text OR tool calls OR tool results
}

if (!hasValidContent(llmResult.content)) {
  // Insert a proper error message instead of invalid content
  content: "I encountered an error generating a response. Please try again."
}
```

### 2. Ensure Tool Results Are Always Valid

Convert undefined/null results to an explicit null value:

```typescript
if (toolResult.result === undefined || toolResult.result === null) {
  output = { type: "json", value: null };
}
```

### 3. Never Filter Tool Messages

Tool result messages must always be preserved:

```typescript
if (msg.role === "tool") return true; // Never filter
```

### 4. Validate Tool Call/Result Match

Ensure every tool call has a corresponding result:

```typescript
if (toolResults.length !== toolCallParts.length) {
  // Log detailed error, insert error message
}
```

### 5. Enhanced Cached Message Filtering

- Filter empty text blocks from cached messages (backward compat)
- Remove messages that become empty after filtering (except the final one)
- Respect the Anthropic API rule: the final assistant message CAN be empty

## Benefits

- ✅ Handles the root cause (aborted/incomplete responses)
- ✅ User gets clear error messages instead of silent failure
- ✅ Logs warnings for debugging
- ✅ Never sends invalid payloads to the Anthropic API
- ✅ Backward compatible with already-cached invalid messages
- ✅ Test coverage for all edge cases

## Test Coverage

Added 7 new test cases:

- hasValidContent validates empty/non-empty text
- hasValidContent validates tool calls and results
- createToolResultMessages handles undefined/null results

* chore: Apply deno fmt formatting

* chore: Fix lint warning - remove unused index parameter

* refactor: Use validation instead of filtering for message history

Instead of filtering empty text blocks from within messages and then removing messages that become empty, use the same hasValidContent() validation as for fresh responses. This is cleaner and more consistent.

Before:

- Map over messages to filter empty text blocks from content arrays
- Then filter out messages that became empty after text filtering
- Complex multi-step transformation

After:

- Single validation pass using hasValidContent()
- Remove messages that fail validation entirely
- Same validation logic as fresh responses

Benefits:

- Simpler code (one filter instead of map + filter)
- Consistent validation across fresh and cached messages
- Don't try to "fix" invalid data; just remove invalid messages
- Clearer intent: validate and remove, not transform and patch

* refactor: Remove cached message validation - let legacy data fail

Instead of validating and filtering cached messages, let the API reject invalid legacy data. This significantly simplifies the code.

Before:

- Validate all cached messages with hasValidContent()
- Filter out invalid messages with logging
- Special handling for tool messages and final assistant messages

After:

- Send messages as-is to the API
- If legacy invalid data exists, the API will reject it with a clear error
- The user can start a fresh conversation

This is acceptable because:

1. Going forward, we never write invalid data (validated at source)
2. Legacy conversations with invalid data are rare edge cases
3. A clear API error is better than silent data manipulation
4. Much simpler code

The prevention happens entirely at write time:

- Fresh responses validated with hasValidContent()
- Tool results always have valid output
- Tool call/result counts validated

---------

Co-authored-by: Claude <[email protected]>
Co-authored-by: Ben Follington <[email protected]>
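The validation rule the commit settles on can be sketched in isolation. This is a minimal standalone version, not the project's actual module: the `Part` and `Content` types are simplified stand-ins for the real `BuiltInLLMMessage` shapes, and only the decision logic matches the description above (non-empty trimmed text, a tool call, or a tool result counts as valid).

```typescript
// Simplified stand-ins for the real BuiltInLLMMessage content types.
type Part =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "tool-result"; toolCallId: string; toolName: string; output: unknown };

type Content = string | Part[];

// A message is valid if it has non-empty text, a tool call, or a tool result.
function hasValidContent(content: Content): boolean {
  if (typeof content === "string") return content.trim().length > 0;
  return content.some((part) =>
    part.type === "tool-call" ||
    part.type === "tool-result" ||
    (part.type === "text" && part.text.trim().length > 0)
  );
}

// The failure mode from the commit: a stream that aborts after an empty
// text block leaves content that must not be stored or sent to the API.
const abortedResponse: Content = [{ type: "text", text: "" }];
console.log(hasValidContent(abortedResponse)); // false
console.log(hasValidContent("Hello")); // true
```

Validating at write time like this is what lets the later refactor drop read-side filtering entirely: invalid content is replaced with an explicit error message before it can ever be cached.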
Parent: 310b072

File tree

2 files changed: +181 −11 lines


packages/runner/src/builtins/llm-dialog.ts

Lines changed: 111 additions & 11 deletions

```diff
@@ -593,6 +593,30 @@ function extractToolCallParts(
   );
 }
 
+/**
+ * Validates whether message content is non-empty and valid for the Anthropic API.
+ * Returns true if the content contains at least one non-empty text block or tool call.
+ */
+function hasValidContent(content: BuiltInLLMMessage["content"]): boolean {
+  if (typeof content === "string") {
+    return content.trim().length > 0;
+  }
+
+  if (Array.isArray(content)) {
+    return content.some((part) => {
+      if (part.type === "tool-call" || part.type === "tool-result") {
+        return true;
+      }
+      if (part.type === "text") {
+        return (part as BuiltInLLMTextPart).text?.trim().length > 0;
+      }
+      return false;
+    });
+  }
+
+  return false;
+}
+
 function buildAssistantMessage(
   content: BuiltInLLMMessage["content"],
   toolCallParts: BuiltInLLMToolCallPart[],
@@ -664,17 +688,29 @@ async function executeToolCalls(
 function createToolResultMessages(
   results: ToolCallExecutionResult[],
 ): BuiltInLLMMessage[] {
-  return results.map((toolResult) => ({
-    role: "tool",
-    content: [{
-      type: "tool-result",
-      toolCallId: toolResult.id,
-      toolName: toolResult.toolName || "unknown",
-      output: toolResult.error
-        ? { type: "error-text", value: toolResult.error }
-        : toolResult.result,
-    }],
-  }));
+  return results.map((toolResult) => {
+    // Ensure output is never undefined/null - Anthropic API requires valid tool_result
+    // for every tool_use, even if the tool returns nothing
+    let output: any;
+    if (toolResult.error) {
+      output = { type: "error-text", value: toolResult.error };
+    } else if (toolResult.result === undefined || toolResult.result === null) {
+      // Tool returned nothing - use explicit null value
+      output = { type: "json", value: null };
+    } else {
+      output = toolResult.result;
+    }
+
+    return {
+      role: "tool",
+      content: [{
+        type: "tool-result",
+        toolCallId: toolResult.id,
+        toolName: toolResult.toolName || "unknown",
+        output,
+      }],
+    };
+  });
 }
 
 export const llmDialogTestHelpers = {
@@ -684,6 +720,7 @@ export const llmDialogTestHelpers = {
   extractToolCallParts,
   buildAssistantMessage,
   createToolResultMessages,
+  hasValidContent,
 };
 
 /**
@@ -1080,6 +1117,34 @@ async function startRequest(
 
   resultPromise
     .then(async (llmResult) => {
+      // Validate that the response has valid content
+      if (!hasValidContent(llmResult.content)) {
+        // LLM returned empty or invalid content (e.g., stream aborted mid-flight,
+        // or AI SDK bug with empty text blocks). Insert a proper error message
+        // instead of storing invalid content.
+        logger.warn("LLM returned invalid/empty content, adding error message");
+        const errorMessage = {
+          [ID]: { llmDialog: { message: cause, id: crypto.randomUUID() } },
+          role: "assistant",
+          content:
+            "I encountered an error generating a response. Please try again.",
+        } satisfies BuiltInLLMMessage & { [ID]: unknown };
+
+        await safelyPerformUpdate(
+          runtime,
+          pending,
+          internal,
+          requestId,
+          (tx) => {
+            messagesCell.withTx(tx).push(
+              errorMessage as Schema<typeof LLMMessageSchema>,
+            );
+            pending.withTx(tx).set(false);
+          },
+        );
+        return;
+      }
+
       // Extract tool calls from content if it's an array
       const hasToolCalls = Array.isArray(llmResult.content) &&
         llmResult.content.some((part) => part.type === "tool-call");
@@ -1098,6 +1163,41 @@ async function startRequest(
         toolCatalog,
         toolCallParts,
       );
+
+      // Validate that we have a result for every tool call with matching IDs
+      const toolCallIds = new Set(toolCallParts.map((p) => p.toolCallId));
+      const resultIds = new Set(toolResults.map((r) => r.id));
+      const mismatch = toolResults.length !== toolCallParts.length ||
+        !toolCallParts.every((p) => resultIds.has(p.toolCallId));
+
+      if (mismatch) {
+        logger.error(
+          `Tool execution mismatch: ${toolCallParts.length} calls [${
+            Array.from(toolCallIds)
+          }] but ${toolResults.length} results [${Array.from(resultIds)}]`,
+        );
+        // Add error message instead of invalid partial results
+        const errorMessage = {
+          [ID]: { llmDialog: { message: cause, id: crypto.randomUUID() } },
+          role: "assistant",
+          content: "Some tool calls failed to execute. Please try again.",
+        } satisfies BuiltInLLMMessage & { [ID]: unknown };
+
+        await safelyPerformUpdate(
+          runtime,
+          pending,
+          internal,
+          requestId,
+          (tx) => {
+            messagesCell.withTx(tx).push(
+              errorMessage as Schema<typeof LLMMessageSchema>,
+            );
+            pending.withTx(tx).set(false);
+          },
+        );
+        return;
+      }
+
       const newMessages: BuiltInLLMMessage[] = [
         assistantMessage,
         ...createToolResultMessages(toolResults),
```
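The output normalization inside createToolResultMessages can also be sketched standalone. This is a simplified version for illustration only: `normalizeOutput` is a hypothetical helper name, and `ExecResult`/`ToolOutput` are reduced stand-ins for the real `ToolCallExecutionResult` and output types.

```typescript
// Simplified stand-ins for the real execution-result and output types.
type ToolOutput =
  | { type: "error-text"; value: string }
  | { type: "json"; value: unknown };

interface ExecResult {
  id: string;
  toolName?: string;
  result?: unknown;
  error?: string;
}

// The Anthropic API requires a tool_result for every tool_use, so a tool
// that returns nothing must still produce a concrete output value.
function normalizeOutput(r: ExecResult): ToolOutput {
  if (r.error) return { type: "error-text", value: r.error };
  if (r.result === undefined || r.result === null) {
    // Tool returned nothing: send an explicit JSON null, never undefined.
    return { type: "json", value: null };
  }
  return r.result as ToolOutput;
}

// A tool with no return value still yields a valid output block.
console.log(normalizeOutput({ id: "call-1", toolName: "empty" }));
```

The design point is that absence is made explicit: `undefined` would be dropped during serialization and trigger "tool_use ids were found without tool_result blocks", while `{ type: "json", value: null }` survives the round trip.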

packages/runner/test/llm-dialog-helpers.test.ts

Lines changed: 70 additions & 0 deletions

```diff
@@ -9,6 +9,7 @@ const {
   extractToolCallParts,
   buildAssistantMessage,
   createToolResultMessages,
+  hasValidContent,
 } = llmDialogTestHelpers;
 
 Deno.test("createCharmToolDefinitions slugifies charm names and returns tool metadata", () => {
@@ -112,3 +113,72 @@ Deno.test("createToolResultMessages converts execution results into tool message
   assertEquals(failurePart.toolName, "failing");
   assertEquals(failurePart.output, { type: "error-text", value: "boom" });
 });
+
+Deno.test("hasValidContent returns true for non-empty text", () => {
+  assert(hasValidContent("Hello"));
+  assert(hasValidContent([{ type: "text", text: "Hello" }]));
+});
+
+Deno.test("hasValidContent returns false for empty text", () => {
+  assert(!hasValidContent(""));
+  assert(!hasValidContent(" \n "));
+  assert(!hasValidContent([{ type: "text", text: "" }]));
+  assert(!hasValidContent([{ type: "text", text: " " }]));
+});
+
+Deno.test("hasValidContent returns true for tool calls", () => {
+  const content: BuiltInLLMMessage["content"] = [
+    { type: "text", text: "" },
+    { type: "tool-call", toolCallId: "1", toolName: "test", input: {} },
+  ];
+  assert(hasValidContent(content));
+});
+
+Deno.test("hasValidContent returns true for tool results", () => {
+  const content: BuiltInLLMMessage["content"] = [
+    {
+      type: "tool-result",
+      toolCallId: "1",
+      toolName: "test",
+      output: { type: "json", value: null },
+    },
+  ];
+  assert(hasValidContent(content));
+});
+
+Deno.test("hasValidContent returns false for only empty text parts", () => {
+  const content: BuiltInLLMMessage["content"] = [
+    { type: "text", text: "" },
+    { type: "text", text: " " },
+  ];
+  assert(!hasValidContent(content));
+});
+
+Deno.test("createToolResultMessages handles undefined result with explicit null", () => {
+  const messages = createToolResultMessages([{
+    id: "call-1",
+    toolName: "empty",
+    result: undefined,
+  }]);
+
+  assertEquals(messages.length, 1);
+  assertEquals(messages[0].role, "tool");
+  assertEquals(messages[0].content?.[0], {
+    type: "tool-result",
+    toolCallId: "call-1",
+    toolName: "empty",
+    output: { type: "json", value: null },
+  });
+});
+
+Deno.test("createToolResultMessages handles null result with explicit null", () => {
+  const messages = createToolResultMessages([{
+    id: "call-1",
+    toolName: "empty",
+    result: null,
+  }]);
+
+  assertEquals(messages.length, 1);
+  const outputPart = messages[0].content?.[0] as any;
+  assertEquals(outputPart.output, { type: "json", value: null });
+});
```

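The tool call/result matching check in the runner diff reduces to a standalone predicate. This sketch uses simplified hypothetical types (`Call`, `Result`) rather than the project's real `BuiltInLLMToolCallPart` and `ToolCallExecutionResult` shapes.

```typescript
// Simplified stand-ins for tool-call parts and execution results.
interface Call { toolCallId: string }
interface Result { id: string }

// True only when counts agree and every call id has a matching result id,
// mirroring the mismatch guard added before createToolResultMessages runs.
function resultsMatchCalls(calls: Call[], results: Result[]): boolean {
  const resultIds = new Set(results.map((r) => r.id));
  return results.length === calls.length &&
    calls.every((c) => resultIds.has(c.toolCallId));
}

console.log(resultsMatchCalls([{ toolCallId: "a" }], [{ id: "a" }])); // true
console.log(resultsMatchCalls([{ toolCallId: "a" }], [])); // false
```

Checking both the count and the id mapping catches partial failures of executeToolCalls in either direction: dropped results and results attributed to the wrong call.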