Conversation

ryan-lempka (Contributor) commented Nov 6, 2025

Overview:

Add the generation prompt unless the last message is Assistant(_)
(defaults to true when no messages exist).

Added unit tests; local test results after the change was applied:

(venv) ubuntu@workstation:/workspace$ python test_tool_call_simple.py
ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='call-72ec0ccd-828a-4bb8-88de-12b123684ee4', function=Function(arguments='{"location":"San Francisco, CA","format":"celsius"}', name='get_current_weather'), type='function')], reasoning_content=None)
ChatCompletionMessage(content='The current weather in San Francisco, CA is88 degrees Celsius.', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None, reasoning_content=None)

Details:

Change the behavior of should_add_generation_prompt so that, instead of returning true only when the last message is User(_), it returns true whenever the last message is not Assistant(_). This fixes issues such as bug #4013, where a conversation ending in a Tool(_) message caused the model to leak header IDs into the returned content.
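For illustration, a minimal self-contained sketch of the before/after behavior. The Msg enum and function names here are hypothetical stand-ins, not the actual types in oai.rs:

// Hypothetical stand-ins for the real request message types.
enum Msg {
    User(String),
    Assistant(String),
    Tool(String),
}

// Before: the prompt was added only when the last message was User(_)
// (or when there were no messages), so a trailing Tool(_) got no prompt.
fn should_add_generation_prompt_old(messages: &[Msg]) -> bool {
    matches!(messages.last(), Some(Msg::User(_)) | None)
}

// After: the prompt is added unless the last message is Assistant(_),
// which still defaults to true for an empty conversation.
fn should_add_generation_prompt_new(messages: &[Msg]) -> bool {
    !matches!(messages.last(), Some(Msg::Assistant(_)))
}

fn main() {
    let tool_last = vec![
        Msg::User("What's the weather?".into()),
        Msg::Assistant("Calling the weather tool.".into()),
        Msg::Tool("88 C".into()),
    ];
    assert!(!should_add_generation_prompt_old(&tool_last)); // old logic missed this case
    assert!(should_add_generation_prompt_new(&tool_last));  // new logic adds the prompt
    assert!(should_add_generation_prompt_new(&[]));         // empty history defaults to true
}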

Where should the reviewer start?

lib/llm/src/preprocessor/prompt/template/oai.rs

Related Issues:

  • Fixes GitHub issue: #4013

Summary by CodeRabbit

  • Bug Fixes

    • Improved prompt generation logic in chat completions to better handle different conversation message types, including assistant-generated messages.
  • Tests

    • Expanded test coverage for prompt generation across various message type scenarios to ensure proper handling in different conversation states.

coderabbitai bot commented Nov 6, 2025

Walkthrough

Modified the should_add_generation_prompt logic for NvCreateChatCompletionRequest to return true when the last message is not an Assistant message, or when no last message exists. Previously, it returned true only for User messages or missing messages. Tests expanded to cover new scenarios for Assistant, User, and Tool message endings.

Changes

Cohort / File(s): Prompt generation logic and test expansion (lib/llm/src/preprocessor/prompt/template/oai.rs)
Summary: Modified should_add_generation_prompt control flow to treat Assistant messages as blocking prompt generation. Added test helpers for User, Assistant, and Tool message construction. Introduced tests for the add_after_user, add_after_tool, no_after_assistant, and add_when_empty scenarios.
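For reference, a hedged sketch of what those four tests could look like, using a hypothetical unit-variant Msg enum in place of the real message types and dummy_state helper in oai.rs:

// Hypothetical stand-ins; the real tests build actual request messages
// via helper constructors (user(), asst(), tool()) and a dummy_state wrapper.
enum Msg { User, Assistant, Tool }

fn should_add_generation_prompt(messages: &[Msg]) -> bool {
    !matches!(messages.last(), Some(Msg::Assistant))
}

#[test]
fn add_after_user() {
    assert!(should_add_generation_prompt(&[Msg::User]));
}

#[test]
fn add_after_tool() {
    // The bug-fix case: a trailing Tool message must get a prompt.
    assert!(should_add_generation_prompt(&[Msg::User, Msg::Assistant, Msg::Tool]));
}

#[test]
fn no_after_assistant() {
    assert!(!should_add_generation_prompt(&[Msg::User, Msg::Assistant]));
}

#[test]
fn add_when_empty() {
    assert!(should_add_generation_prompt(&[]));
}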

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Area of focus: The logic change to should_add_generation_prompt inverts the condition from checking for User messages to checking for non-Assistant messages; verify this behavior aligns with intended chat completion semantics
  • Test coverage verification: Ensure the new test scenarios comprehensively cover the revised logic paths, particularly the Assistant message case which now behaves differently

Poem

🐰 A prompt flows true, when Assistants stay,
User whispers welcome, tools at play,
But when the rabbit speaks last with care,
No new prompt appears—just quiet air! ✨

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 53.85%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title accurately describes the main change: fixing a bug in the should_add_generation_prompt function related to multi-turn conversations.
  • Description check ✅ Passed: The pull request description includes all required sections from the template: Overview, Details, Where should the reviewer start, and Related Issues with proper action keywords.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/llm/src/preprocessor/prompt/template/oai.rs (1)

726-748: LGTM! Comprehensive test coverage for the bug fix.

The tests clearly validate the new behavior, particularly add_after_tool which directly addresses the bug where Tool messages need a generation prompt.

Consider adding tests for additional edge cases to further strengthen coverage:

  • System message as the last message
  • Multi-turn scenarios (e.g., vec![user(), asst(), tool()])

Example:

#[test]
fn add_after_system() {
    let sys = Msg::System(Default::default());
    let s = dummy_state(vec![sys]);
    assert!(s.should_add_generation_prompt());
}

#[test]
fn add_after_multiturn_with_tool() {
    let s = dummy_state(vec![user(), asst(), tool()]);
    assert!(s.should_add_generation_prompt());
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eedfc3d and 9d318c5.

📒 Files selected for processing (1)
  • lib/llm/src/preprocessor/prompt/template/oai.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2700
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:19-28
Timestamp: 2025-08-25T22:04:45.205Z
Learning: The response_generator() method exists on multiple request types in the codebase: NvCreateChatCompletionRequest (for chat completions) and NvCreateCompletionRequest (for text completions). When making signature changes, it's important to distinguish between these different object types as they have separate implementations and call sites.
🧬 Code graph analysis (1)
lib/llm/src/preprocessor/prompt/template/oai.rs (1)
lib/llm/src/preprocessor/prompt.rs (2)
  • messages (51-51)
  • should_add_generation_prompt (59-59)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang (arm64)
  • GitHub Check: sglang (amd64)
  • GitHub Check: operator (amd64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: tests (.)
  • GitHub Check: clippy (.)
🔇 Additional comments (3)
lib/llm/src/preprocessor/prompt/template/oai.rs (3)

165-177: LGTM! Fix correctly addresses the Tool message bug.

The new logic properly handles the case where a Tool message is last—previously the generation prompt was only added for User messages, causing the model to leak header IDs when processing tool results. Now the prompt is added for any non-Assistant final message.


300-300: LGTM! Type alias improves test readability.


708-724: LGTM! Well-designed test helpers.

The helper functions appropriately use Default::default() since should_add_generation_prompt only examines message type, not content.
