@ryan-lempka
Contributor

@ryan-lempka ryan-lempka commented Nov 7, 2025

Overview

Parse tool-call arguments from JSON strings into structured JSON values before Jinja rendering. This fixes double-encoding under |tojson and lets templates iterate over arguments with |items.

Details

  • Normalize:
    • messages[*].tool_calls[*].function.arguments
    • messages[*].function_call.arguments
  • Normalization is isolated from messages() and applied only immediately before Jinja rendering

Where to start

  • oai.rs: normalize_tool_arguments_in_messages()

Tests

  • |tojson no longer double-encodes.
  • |items iteration works.
  • The legacy function_call path is parsed.
  • Malformed JSON passes through unchanged.

Related Issues

  • Fixes GitHub issue: #4161

Summary by CodeRabbit

  • Improvements

    • Improved consistency and reliability in processing AI tool function arguments within chat completion requests. Enhanced handling ensures proper normalization of tool arguments across both current and legacy formats, leading to more predictable behavior.
  • Tests

    • Added comprehensive test coverage for tool argument normalization across multiple scenarios, including edge cases.

@ryan-lempka ryan-lempka self-assigned this Nov 7, 2025
@ryan-lempka ryan-lempka requested a review from a team as a code owner November 7, 2025 05:00
@github-actions github-actions bot added the fix label Nov 7, 2025
@coderabbitai
Contributor

coderabbitai bot commented Nov 7, 2025

Walkthrough

Adds a new helper function normalize_tool_arguments_in_messages to deserialize tool argument strings into JSON objects within message processing. The function is integrated into NvCreateChatCompletionRequest::messages and includes comprehensive tests for normalization behavior, legacy function calls, and edge cases.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Tool argument normalization: lib/llm/src/preprocessor/prompt/template/oai.rs | New helper function normalize_tool_arguments_in_messages that traverses messages and deserializes tool argument strings into JSON objects/arrays for both tool_calls.function.arguments and function_call.arguments. Integrated into request message processing with test coverage for deserialization, iteration, legacy handling, and malformed JSON passthrough. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Deserialize logic correctness: Verify string-to-JSON conversion handles all supported formats and edge cases
  • Error handling: Confirm malformed JSON is handled gracefully per design
  • Integration point: Review how normalization fits into existing message processing pipeline
  • Test coverage: Validate test cases cover tool_calls, function_call, and error scenarios

Poem

🐰 Hops through messages with delight,
Arguments once tangled, now shining bright!
From strings to JSON, a transformation so neat,
Tool calls unwrapped—the preprocessing's complete!

Pre-merge checks

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'fix: deserialize tool call args' clearly and concisely describes the main change: deserializing tool call arguments from JSON strings. |
| Description check | ✅ Passed | PR description covers all required sections: overview clearly states the purpose, details specify exact paths being modified, starting points identified, tests outlined, and related issue linked with action keyword. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
lib/llm/src/preprocessor/prompt/template/oai.rs (2)

124-156: LGTM! Clean implementation with solid error handling.

The function correctly normalizes JSON string arguments into parsed objects for both current (tool_calls) and legacy (function_call) formats. The graceful handling of malformed JSON (leaving it unchanged) aligns with the PR objectives.

Minor style suggestion: The Result::Ok pattern can be simplified to just Ok:

-                        if let Result::Ok(parsed) = serde_json::from_str(s) {
+                        if let Ok(parsed) = serde_json::from_str(s) {
                             *args = parsed;
                         }

Apply the same simplification at line 149.


740-847: Good test coverage for the primary scenarios.

The tests validate the key behaviors: prevention of double-encoding, support for iteration, legacy format handling, and malformed JSON passthrough. The order-insensitive assertion at line 804 is a nice touch.

Consider adding a few edge-case tests to strengthen coverage:

  1. Arguments already parsed (not strings): Verify that arguments already in object form are left unchanged.
  2. Multiple tool_calls per message: Ensure all tool calls in a single message are normalized.
  3. Array-type arguments: Test with "arguments": "[1,2,3]" to confirm array deserialization works.

Example test for case 1:

#[test]
fn test_normalize_tool_arguments_already_object() {
    let mut messages = serde_json::Value::Array(vec![serde_json::json!({
        "role": "assistant",
        "tool_calls": [{
            "type": "function",
            "function": {
                "name": "f",
                "arguments": {"key": "value"}  // Already an object
            }
        }]
    })]);
    
    normalize_tool_arguments_in_messages(&mut messages);
    
    // Should remain unchanged
    assert_eq!(
        messages[0]["tool_calls"][0]["function"]["arguments"],
        serde_json::json!({"key": "value"})
    );
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f509493 and eaadff1.

📒 Files selected for processing (1)
  • lib/llm/src/preprocessor/prompt/template/oai.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
lib/llm/src/preprocessor/prompt/template/oai.rs (2)
lib/llm/src/preprocessor/prompt.rs (2)
  • messages (51-51)
  • model (50-50)
lib/llm/src/preprocessor/prompt/template/tokcfg.rs (1)
  • tojson (114-146)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: sglang (amd64)
  • GitHub Check: operator (amd64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: tests (.)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: clippy (.)
🔇 Additional comments (1)
lib/llm/src/preprocessor/prompt/template/oai.rs (1)

163-180: Integration looks correct.

The normalization is properly positioned: after serialization and before template rendering. The unconditional application matches the PR objectives, and the order of operations with may_be_fix_msg_content is appropriate since they operate on different message fields.

@rmccorm4
Contributor

rmccorm4 commented Nov 7, 2025

CC @2ez4bz

@indrajit96 indrajit96 self-requested a review November 7, 2025 19:24
@rmccorm4
Contributor

rmccorm4 commented Nov 7, 2025

@ryan-lempka thanks for fixing this! based on the source issue:

  • looks like this PR addresses the escaped json in bullet point 1
  • does this PR also address the dictionary arguments case in bullet point 2 below? Seems like it might from test_normalize_tool_arguments_items_loop ?

E.g. this line in Qwen3 Coder's template:
{%- for args_name, args_value in tool_call.arguments|items %}.

For such templates, the dynamo frontend will return a 500 error code as it fails to render the template entirely.

@ryan-lempka
Contributor Author

> @ryan-lempka thanks for fixing this! based on the source issue:
>
>   • looks like this PR addresses the escaped json in bullet point 1
>   • does this PR also address the dictionary arguments case in bullet point 2 below? Seems like it might from test_normalize_tool_arguments_items_loop ?
>
> E.g. this line in Qwen3 Coder's template:
> {%- for args_name, args_value in tool_call.arguments|items %}.
> For such templates, the dynamo frontend will return a 500 error code as it fails to render the template entirely.

@rmccorm4 yes, this PR aims to fix everything outlined in #4161. Everything is validated with unit tests. The example you give there is the unit test: test_normalize_tool_arguments_items_loop

@rmccorm4
Contributor

rmccorm4 commented Nov 7, 2025

Thanks @ryan-lempka , think it just needs a rebase with merge conflicts resolved from your other merged add_generation_prompt PR, and a quick additional input case mentioned by @indrajit96 - nice work!

@rmccorm4 rmccorm4 added the frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` label Nov 7, 2025

@2ez4bz 2ez4bz left a comment

LGTM! Not sure if my approval does anything, but doing it anyway :)

@rlempka

rlempka commented Nov 7, 2025

> Thanks @ryan-lempka , think it just needs a rebase with merge conflicts resolved from your other merged add_generation_prompt PR, and a quick additional input case mentioned by @indrajit96 - nice work!

@rmccorm4 rebase done and @indrajit96 test request added

@rmccorm4 rmccorm4 enabled auto-merge (squash) November 7, 2025 21:27
@rmccorm4
Contributor

rmccorm4 commented Nov 8, 2025

Pulled in #4198

@rmccorm4 rmccorm4 merged commit 51c4fe6 into main Nov 8, 2025
34 of 36 checks passed
@rmccorm4 rmccorm4 deleted the rlempka/fix-deserialize-tool-call-args branch November 8, 2025 02:20

Labels

fix frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` size/L

7 participants