fix(sdk): add security_risk to prompt-based tool calling example #2746

Open
VascoSch92 wants to merge 2 commits into main from fix/2740-security-risk-in-prompt-examples

VascoSch92 (Contributor) commented Apr 7, 2026

Summary

Add security_risk and summary parameters to the generic tool call example in system_message_suffix_TEMPLATE. This ensures that smaller models using prompt-based tool calling (native_tool_calling=False) see these parameters in the format instructions, helping them learn to include them in their tool calls.

Problem

When using native_tool_calling=False for weaker/smaller models (e.g., qwen2.5-coder:7b), the generic tool call example in the system prompt showed:

<function=example_function_name>
<parameter=example_parameter_1>value_1</parameter>
<parameter=example_parameter_2>...</parameter>
</function>

Without security_risk in this example, smaller models consistently omit it from their tool calls, causing validation errors when LLMSecurityAnalyzer is active.

Solution

Updated the template to include both security_risk and summary parameters:

<function=example_function_name>
<parameter=example_parameter_1>value_1</parameter>
<parameter=example_parameter_2>...</parameter>
<parameter=security_risk>LOW</parameter>
<parameter=summary>Brief description of action</parameter>
</function>
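
To make the failure mode concrete, here is a minimal sketch of how a prompt-based tool call in this format could be parsed and checked for the missing parameter. The helper names (`parse_tool_call`, `missing_params`) and the `REQUIRED_PARAMS` tuple are hypothetical and are not taken from the SDK; the real converter and validator may work differently.

```python
import re

# Hypothetical: parameters a validator might require on every tool call.
# The PR only states that omitting security_risk triggers validation errors.
REQUIRED_PARAMS = ("security_risk",)

FUNCTION_RE = re.compile(r"<function=([^>]+)>(.*?)</function>", re.DOTALL)
PARAM_RE = re.compile(r"<parameter=([^>]+)>(.*?)</parameter>", re.DOTALL)


def parse_tool_call(text: str) -> tuple[str, dict[str, str]]:
    """Extract the function name and parameters from one tool call block."""
    match = FUNCTION_RE.search(text)
    if match is None:
        raise ValueError("no <function=...> block found")
    name, body = match.group(1), match.group(2)
    params = {key: value.strip() for key, value in PARAM_RE.findall(body)}
    return name, params


def missing_params(params: dict[str, str]) -> list[str]:
    """Return the required parameter names absent from a parsed call."""
    return [p for p in REQUIRED_PARAMS if p not in params]


# A call shaped like the old example (no security_risk).
old_style = """<function=example_function_name>
<parameter=example_parameter_1>value_1</parameter>
</function>"""

# A call shaped like the updated example.
new_style = """<function=example_function_name>
<parameter=example_parameter_1>value_1</parameter>
<parameter=security_risk>LOW</parameter>
<parameter=summary>Brief description of action</parameter>
</function>"""

print(missing_params(parse_tool_call(old_style)[1]))  # ['security_risk']
print(missing_params(parse_tool_call(new_style)[1]))  # []
```

A model that copies the old example verbatim produces the first shape, which is why showing the parameter in the example itself is the smallest possible fix.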

Testing

  • Added regression test test_system_message_suffix_template_includes_security_risk to prevent future regressions
  • All 28 existing tests continue to pass
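
The body of the regression test is not shown in this description. A test of this kind might look roughly like the sketch below. `EXAMPLE_TEMPLATE` is a stand-in string copied from the example above; the real test would presumably import the actual template constant from the SDK instead.

```python
# Stand-in for the real template (assumption): only the example block
# quoted in this PR description is reproduced here.
EXAMPLE_TEMPLATE = """<function=example_function_name>
<parameter=example_parameter_1>value_1</parameter>
<parameter=example_parameter_2>...</parameter>
<parameter=security_risk>LOW</parameter>
<parameter=summary>Brief description of action</parameter>
</function>"""


def test_system_message_suffix_template_includes_security_risk() -> None:
    # The example must show both parameters so that models copying the
    # format include them in their own tool calls.
    assert "<parameter=security_risk>" in EXAMPLE_TEMPLATE
    assert "<parameter=summary>" in EXAMPLE_TEMPLATE


# Run directly when not using a test runner such as pytest.
test_system_message_suffix_template_includes_security_risk()
```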

Fixes #2740


This PR was created by an AI assistant (OpenHands) on behalf of the user.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:58e4b8e-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-58e4b8e-python \
  ghcr.io/openhands/agent-server:58e4b8e-python

All tags pushed for this build

ghcr.io/openhands/agent-server:58e4b8e-golang-amd64
ghcr.io/openhands/agent-server:58e4b8e-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:58e4b8e-golang-arm64
ghcr.io/openhands/agent-server:58e4b8e-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:58e4b8e-java-amd64
ghcr.io/openhands/agent-server:58e4b8e-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:58e4b8e-java-arm64
ghcr.io/openhands/agent-server:58e4b8e-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:58e4b8e-python-amd64
ghcr.io/openhands/agent-server:58e4b8e-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:58e4b8e-python-arm64
ghcr.io/openhands/agent-server:58e4b8e-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:58e4b8e-golang
ghcr.io/openhands/agent-server:58e4b8e-java
ghcr.io/openhands/agent-server:58e4b8e-python

About Multi-Architecture Support

  • Each variant tag (e.g., 58e4b8e-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 58e4b8e-python-amd64) are also available if needed
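
The tag list above appears to follow a regular naming scheme. The sketch below reproduces it for one variant. The scheme is inferred from the listed tags only (in the base image name, `/` becomes `_s_` and `:` becomes `_tag_`); the function names are hypothetical and this is not the project's actual build script.

```python
REPO = "ghcr.io/openhands/agent-server"


def slugify_base_image(base_image: str) -> str:
    """Turn a base image reference into the tag-safe form seen in the list."""
    return base_image.replace("/", "_s_").replace(":", "_tag_")


def build_tags(
    short_sha: str,
    variant: str,
    base_image: str,
    arches: tuple[str, ...] = ("amd64", "arm64"),
) -> list[str]:
    """List per-architecture tags followed by the multi-arch manifest tag."""
    slug = slugify_base_image(base_image)
    tags: list[str] = []
    for arch in arches:
        tags.append(f"{REPO}:{short_sha}-{variant}-{arch}")
        tags.append(f"{REPO}:{short_sha}-{slug}-{arch}")
    # The variant tag without an architecture suffix is the multi-arch manifest.
    tags.append(f"{REPO}:{short_sha}-{variant}")
    return tags


for tag in build_tags("58e4b8e", "golang", "golang:1.21-bookworm"):
    print(tag)
```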

Co-authored-by: openhands <openhands@all-hands.dev>
github-actions bot commented Apr 7, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

github-actions bot commented Apr 7, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

VascoSch92 requested a review from all-hands-bot on April 7, 2026, 16:17
github-actions bot commented Apr 7, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| openhands-sdk/openhands/sdk/llm | | | | |
| llm.py | 516 | 78 | 84% | 466, 485, 541, 800, 909, 911–912, 940, 986, 997–999, 1003–1007, 1015–1017, 1027–1029, 1032–1033, 1037, 1039–1040, 1042, 1266–1267, 1464–1465, 1474, 1487, 1489–1494, 1496–1513, 1516–1520, 1522–1523, 1529–1538, 1593, 1595 |
| openhands-sdk/openhands/sdk/llm/mixins | | | | |
| fn_call_converter.py | 422 | 115 | 72% | 95, 97, 99, 101, 106–107, 132, 140, 144–145, 148, 151–153, 169, 198–200, 203, 210, 226, 228, 256, 264–265, 348–350, 352, 354, 375–377, 383, 405, 431–432, 440–443, 445, 447, 469, 478, 486, 534–537, 541–544, 556, 560, 575, 604, 606, 640, 654, 656–658, 660, 663, 689, 693, 719, 726–727, 730–731, 735–736, 742–743, 746, 765–768, 772–773, 778–779, 784, 817–818, 850–851, 857, 871, 883, 885–886, 889–892, 894–895, 901–903, 905–906, 908, 910, 914, 916, 921, 923–924, 927 |
| non_native_fc.py | 41 | 3 | 92% | 73, 84, 102 |
| TOTAL | 21984 | 6319 | 71% | |

all-hands-bot (Collaborator) left a comment

🟢 Good taste - Simple, elegant fix that addresses the root cause.

⚠️ Eval Risk: This PR changes a prompt template, which affects agent behavior. Per repo guidelines, I cannot approve prompt template changes without lightweight eval validation. Flagging for human maintainer to run evals before merge.

Verdict: ✅ Worth merging after eval confirmation - core logic is sound, fix is minimal and well-tested.

VascoSch92 requested a review from enyst on April 7, 2026, 16:19
VascoSch92 marked this pull request as ready for review on April 7, 2026, 16:19
VascoSch92 requested a review from enyst on April 8, 2026, 06:14
Development

Successfully merging this pull request may close these issues.

bug(sdk): Prompt-based tool calling examples missing security_risk parameter

4 participants