fix(sdk): Use concise error message for tool validation errors#2748

Merged
VascoSch92 merged 3 commits into main from fix/concise-validation-error-message on Apr 7, 2026

@VascoSch92 VascoSch92 commented Apr 7, 2026

Summary

Fixes #2741

When tool validation fails, the error message now includes only parameter names (not values) to avoid wasting LLM context window on large payloads like file_editor's old_str/new_str.

Changes

Before

Error validating args {"command": "view", "path": "...", "old_str": "<very long string>", "new_str": "<very long string>"} for tool 'file_editor': Failed to provide security_risk field...

After

Error validating tool 'file_editor': Failed to provide security_risk field... Parameters provided: ['command', 'path', 'old_str', 'new_str']

For unparseable JSON:

Error validating tool 'file_editor': Expecting value... Arguments: unparseable JSON
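The message formats above could be produced along these lines. This is a minimal sketch, not the SDK's actual code: the helper name `format_validation_error`, its signature, and the `". "` separator between the reason and the parameter list are assumptions made for illustration.

```python
import json


def format_validation_error(tool_name: str, raw_arguments: str, error: Exception) -> str:
    """Hypothetical helper: build a concise, LLM-facing validation error."""
    try:
        parsed = json.loads(raw_arguments)
    except json.JSONDecodeError:
        # The arguments could not be parsed, so there are no keys to report.
        detail = "Arguments: unparseable JSON"
    else:
        # Report parameter names only; the values may be very large.
        # Non-object JSON yields an empty list (an assumption of this sketch).
        keys = list(parsed) if isinstance(parsed, dict) else []
        detail = f"Parameters provided: {keys}"
    return f"Error validating tool '{tool_name}': {error}. {detail}"


raw = json.dumps({"command": "view", "path": "/tmp/f.py", "old_str": "x" * 10_000})
print(format_validation_error("file_editor", raw, ValueError("Failed to provide security_risk field")))
print(format_validation_error("file_editor", "{not json", ValueError("Expecting value")))
```

Listing keys rather than truncating values keeps the message size bounded by the number of parameters, not by the size of any one argument.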

Benefits

  1. Saves LLM context window - Error messages no longer include potentially huge argument values
  2. Improves LLM self-correction - The actual failure reason is no longer buried under large argument values
  3. Still informative - The LLM can see which parameters were provided
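To make the first benefit concrete, the sketch below compares the size of a message in the old format with one in the new format for a large `file_editor` payload. The payload sizes, the path, and the `". "` separator are illustrative assumptions, not measurements from the SDK.

```python
import json

# Illustrative payload: two 20,000-character strings (assumed sizes).
args = {
    "command": "str_replace",
    "path": "/tmp/f.py",
    "old_str": "a" * 20_000,
    "new_str": "b" * 20_000,
}
reason = "Failed to provide security_risk field"

# Old format: the full serialized arguments are embedded in the message.
old_msg = f"Error validating args {json.dumps(args)} for tool 'file_editor': {reason}"

# New format: only the parameter names are listed.
new_msg = f"Error validating tool 'file_editor': {reason}. Parameters provided: {list(args)}"

print(f"old: {len(old_msg)} chars, new: {len(new_msg)} chars")
```

With these inputs the old message grows with the payload (over 40,000 characters), while the new one stays under 200 characters regardless of argument size.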

Testing

Added two tests:

  • test_validation_error_shows_keys_not_values - Verifies error shows keys but not large values
  • test_unparseable_json_error_message - Verifies unparseable JSON is handled gracefully
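A rough idea of what these two tests check is sketched below. The real tests in the PR exercise the agent itself; here `build_error` is a hypothetical stand-in for the SDK's formatting logic, and the test bodies only mirror the stated intent.

```python
import json


def build_error(tool_name: str, raw_arguments: str, reason: str) -> str:
    """Hypothetical stand-in for the SDK's error formatting (not the real code)."""
    try:
        keys = list(json.loads(raw_arguments).keys())
        detail = f"Parameters provided: {keys}"
    except (json.JSONDecodeError, AttributeError):
        # AttributeError covers valid JSON that is not an object (assumed handling).
        detail = "Arguments: unparseable JSON"
    return f"Error validating tool '{tool_name}': {reason}. {detail}"


def test_validation_error_shows_keys_not_values():
    big = "x" * 50_000
    raw = json.dumps({"command": "view", "path": "/tmp/f.py", "old_str": big})
    msg = build_error("file_editor", raw, "Failed to provide security_risk field")
    assert "old_str" in msg      # parameter names are reported
    assert big not in msg        # large values are not echoed back
    assert len(msg) < 500        # message stays small regardless of payload


def test_unparseable_json_error_message():
    msg = build_error("file_editor", "{not valid json", "Expecting value")
    assert "unparseable JSON" in msg
    assert "{not valid json" not in msg  # raw input is not echoed either
```

Both functions can be collected by pytest or called directly.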

This PR was created by an AI agent (OpenHands) on behalf of the user.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
| --- | --- | --- | --- |
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:20379d9-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-20379d9-python \
  ghcr.io/openhands/agent-server:20379d9-python

All tags pushed for this build

ghcr.io/openhands/agent-server:20379d9-golang-amd64
ghcr.io/openhands/agent-server:20379d9-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:20379d9-golang-arm64
ghcr.io/openhands/agent-server:20379d9-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:20379d9-java-amd64
ghcr.io/openhands/agent-server:20379d9-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:20379d9-java-arm64
ghcr.io/openhands/agent-server:20379d9-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:20379d9-python-amd64
ghcr.io/openhands/agent-server:20379d9-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:20379d9-python-arm64
ghcr.io/openhands/agent-server:20379d9-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:20379d9-golang
ghcr.io/openhands/agent-server:20379d9-java
ghcr.io/openhands/agent-server:20379d9-python

About Multi-Architecture Support

  • Each variant tag (e.g., 20379d9-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 20379d9-python-amd64) are also available if needed

@VascoSch92 VascoSch92 added the bug Something isn't working label Apr 7, 2026 — with OpenHands AI

github-actions bot commented Apr 7, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

github-actions bot commented Apr 7, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

When tool validation fails, the error message now includes only
parameter names (not values) to avoid wasting LLM context window
on large payloads like file_editor's old_str/new_str.

Before:
  Error validating args {"command": "view", "old_str": "<huge value>"...} for tool 'file_editor': ...

After:
  Error validating tool 'file_editor': ... Parameters provided: ['command', 'path', 'old_str']

For unparseable JSON, the message indicates:
  Error validating tool 'file_editor': ... Arguments: unparseable JSON

Fixes #2741

Co-authored-by: openhands <openhands@all-hands.dev>
@VascoSch92 VascoSch92 force-pushed the fix/concise-validation-error-message branch from 915a95a to f4a8219 Compare April 7, 2026 16:24

github-actions bot commented Apr 7, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| openhands-sdk/openhands/sdk/agent/agent.py | 321 | 24 | 92% | 92, 273, 277, 476–478, 480, 510–511, 518–519, 575, 628–629, 631, 698, 926–927, 929, 958, 966–967, 1001, 1008 |
| TOTAL | 21968 | 6325 | 71% | |

@VascoSch92 VascoSch92 requested a review from all-hands-bot April 7, 2026 16:24

@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟢 Good taste - Clean solution to a real problem

This is solid engineering. You're solving a real problem (LLM context waste on large tool arguments like file_editor's old_str/new_str) with a straightforward solution: show parameter names, not values.

What's Good:

  • Pragmatic: Addresses actual context waste we see in production
  • Simple: No unnecessary complexity - just extract keys instead of showing full values
  • Well-tested: Tests verify actual error message content, not just mocks
  • Clean implementation: Clear variable names, appropriate comments, follows repo patterns

Eval-Risk Flag (Required by Repo Policy):

Per the repo's review guidelines, I cannot approve this PR because it changes tool execution error messages visible to the LLM, which falls under the eval-risk category:

"Do NOT submit an APPROVE review when the PR changes agent behavior or anything that could plausibly affect benchmark/evaluation performance. Examples include: prompt templates, tool calling/execution, ..."

While this change should improve LLM self-correction by reducing noise (as you correctly note), changing LLM-facing error messages can affect agent behavior in subtle ways that need empirical validation.

Required Next Step:

Run lightweight evals (e.g., subset of SWE-bench or GAIA) to verify this maintains or improves agent performance. The new error format is cleaner, but we need data to confirm LLMs self-correct as well as or better than before.

VERDICT: Worth merging after eval validation

KEY INSIGHT: Clean implementation solving a real problem - just needs empirical validation that the concise error format helps LLMs self-correct effectively.

@VascoSch92 VascoSch92 marked this pull request as ready for review April 7, 2026 16:29
@VascoSch92 VascoSch92 requested a review from enyst April 7, 2026 16:29

@all-hands-bot all-hands-bot left a comment

🟢 Good taste - Clean solution to a real problem. Minimal code, clear intent, solves actual LLM context waste. One minor edge case to fix with empty dict handling. 👍
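The review does not spell out the empty dict edge case. One plausible form of it, shown here purely as an assumption, is a truthiness check that treats successfully parsed but empty arguments (`{}`) the same as a parse failure. The function names and the use of `None` to mean "could not be parsed" are hypothetical.

```python
from typing import Optional


def describe_arguments_buggy(parsed_args: Optional[dict]) -> str:
    """Hypothetical buggy variant: an empty dict is falsy, so {} is misreported."""
    if parsed_args:
        return f"Parameters provided: {list(parsed_args.keys())}"
    return "Arguments: unparseable JSON"


def describe_arguments(parsed_args: Optional[dict]) -> str:
    """Hypothetical fixed variant: only None means the JSON could not be parsed."""
    if parsed_args is not None:
        return f"Parameters provided: {list(parsed_args.keys())}"
    return "Arguments: unparseable JSON"


print(describe_arguments_buggy({}))  # misleading: valid but empty arguments
print(describe_arguments({}))        # reports an empty parameter list
```

Comparing against `None` explicitly distinguishes "no parameters were provided" from "the arguments were not valid JSON", which are different mistakes for the LLM to correct.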

VascoSch92 and others added 2 commits April 7, 2026 18:34
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
The previous commit introduced broken indentation in the except block,
causing a SyntaxError (unmatched ')') that broke all CI checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@enyst enyst left a comment

🧁

@VascoSch92 VascoSch92 merged commit 07bd007 into main Apr 7, 2026
30 checks passed
@VascoSch92 VascoSch92 deleted the fix/concise-validation-error-message branch April 7, 2026 18:57
Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

bug(sdk): Verbose error message when security_risk field is missing includes full tool arguments

4 participants