fix: add security_risk and summary to tool examples for non-native function calling #2251
Conversation
Note on Failing API Breakage Checks

The failing API breakage checks are not related to this PR's changes; the failure is caused by PR #2243. My changes only modify:

- `openhands-sdk/openhands/sdk/llm/mixins/fn_call_converter.py`
- `openhands-sdk/openhands/sdk/agent/prompts/in_context_learning_example.j2`

Neither of these affects the public SDK API surface.
fix: add security_risk and summary to tool examples for non-native function calling

This fixes issue OpenHands#2175 where models using prompt-based function calling (native_tool_calling=False) would fail to provide the security_risk parameter because the examples in the system prompt didn't show it.

Changes:
- Add security_risk parameter to all tool examples in TOOL_EXAMPLES dict
- Add summary parameter to all tool examples for transparency
- Update in_context_learning_example.j2 template with the same parameters

The security_risk parameter is required when LLMSecurityAnalyzer is active, and the summary parameter is always recommended for action transparency.

Fixes: OpenHands#2175

Co-authored-by: openhands <openhands@all-hands.dev>
Force-pushed from 2b9eb62 to 975265e
@all-hands-bot roast it
enyst left a comment
Hey, thank you for the contribution! Yes… yes I think missing security risk from prompts was causing models to miss it in responses.
I’m curious though: is security risk always supposed to be present? It’s worth checking what happens when the security analyzer is disabled.

On another note, about the summary: I wonder if including it affects performance; it might be worth running a mini-evaluation. Maybe it’s too small and obvious 🤷 but I don’t know.
@OpenHands Do a /codereview-roasted on this PR, after you read all comments, and try to understand deeply how the code works. Post your review directly on github as a comment.
I'm on it! enyst can track my progress at all-hands.dev
Taste rating: 🟡 Acceptable (but sloppy around the exact bug you’re claiming to fix)

You’re fixing a real problem: prompt-mocked tool-calling models missing `security_risk`.

[CRITICAL ISSUES]
[IMPROVEMENT OPPORTUNITIES]
[TESTING GAPS]
Verdict: ✅ likely worth merging after the missing `security_risk` example is fixed.

Key insight: If you’re going to “teach the model” via examples, leaving one default tool with the old broken example defeats the entire point.
Merge branch '…ples' into fix/pr-2251-review-feedback
… example

Addresses review feedback:
- Add security_risk parameter to task_tracker 'view' and 'plan' examples (task_tracker has readOnlyHint=False, so it requires security_risk)
- Add required 'message' parameter to finish example in in_context_learning_example.j2 (FinishAction.message is a required field)

Co-authored-by: openhands <openhands@all-hands.dev>
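The readOnlyHint rule this commit describes can be sketched as follows. This is an illustrative stand-in, not the SDK's actual implementation; the function name is hypothetical.

```python
# Illustrative sketch only; the real SDK logic lives elsewhere.
def requires_security_risk(read_only_hint: bool) -> bool:
    """Tools that can mutate state (readOnlyHint=False) must declare a risk."""
    return not read_only_hint

# task_tracker has readOnlyHint=False, so its 'view' and 'plan' examples
# need the security_risk parameter:
assert requires_security_risk(False) is True
# A purely read-only tool could omit it:
assert requires_security_risk(True) is False
```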
Addressed the review feedback.
Please don't worry about updating, nor about the checks (fixing those separately; they're only unable to comment at the moment).
Summary
This PR fixes issue #2175 where models using prompt-based function calling (`native_tool_calling=False`) would fail to provide the `security_risk` parameter because the examples in the system prompt didn't demonstrate it.

Problem

When using weaker models (like qwen2.5-coder:7b) that don't support native function calling, the SDK falls back to prompt-based function calling. This injects tool definitions and examples directly into the system prompt. However, these examples were missing the `security_risk` and `summary` parameters, causing validation errors. The model didn't know to include these fields because the examples didn't show them.
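To make the failure mode concrete, here is a minimal stdlib-only sketch. The field names mirror the discussion above; the real SDK uses pydantic action schemas, so this is an analogy, not the actual validation code.

```python
# Required fields for a hypothetical execute_bash action when the
# security analyzer is active (illustrative, not the SDK's real schema).
REQUIRED_FIELDS = {"command", "security_risk"}

def missing_fields(tool_args: dict) -> list[str]:
    """Return the required fields the model forgot to supply."""
    return sorted(REQUIRED_FIELDS - tool_args.keys())

# A model that never saw security_risk in its in-context examples omits it,
# which is exactly the validation error described above:
print(missing_fields({"command": "ls -la"}))  # ['security_risk']
```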
Solution
Added `security_risk` and `summary` parameters to all tool examples in:

- `openhands-sdk/openhands/sdk/llm/mixins/fn_call_converter.py` (TOOL_EXAMPLES dict)
- `openhands-sdk/openhands/sdk/agent/prompts/in_context_learning_example.j2` (template)

Now the examples teach models to include these parameters:
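For illustration, an updated example might look like the following. The exact tag syntax and the risk value are assumptions based on OpenHands' prompt-based tool-call style, not copied from the diff.

```python
# Hypothetical updated example string; the tag format and values are assumed.
UPDATED_EXAMPLE = """\
<function=execute_bash>
<parameter=command>ls -la</parameter>
<parameter=security_risk>LOW</parameter>
<parameter=summary>List files in the current directory</parameter>
</function>
"""

# Both previously-missing parameters now appear in the example, so even
# prompt-only models see them demonstrated:
assert "security_risk" in UPDATED_EXAMPLE
assert "summary" in UPDATED_EXAMPLE
```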
Details

- `security_risk` is required when `LLMSecurityAnalyzer` is configured
- `summary` is optional but recommended for action transparency
- `add_security_risk_prediction=True`

Testing
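The PR's actual test changes are not shown here; a hedged sketch of the kind of regression check implied (the dict contents and helper name are hypothetical, not the real suite) might be:

```python
# Hypothetical stand-in for the SDK's TOOL_EXAMPLES dict.
TOOL_EXAMPLES = {
    "execute_bash": "<parameter=security_risk>LOW</parameter>",
    "str_replace_editor": "<parameter=security_risk>LOW</parameter>",
    "task_tracker": "<parameter=security_risk>LOW</parameter>",
}

def examples_missing_security_risk(examples: dict) -> list[str]:
    """Return names of tools whose example omits security_risk."""
    return [name for name, ex in examples.items() if "security_risk" not in ex]

# After this PR, no default tool example should be missing the parameter:
assert examples_missing_security_risk(TOOL_EXAMPLES) == []
```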
Fixes #2175