Conversation

ryan-lempka (Contributor) commented Nov 6, 2025

Overview:

Add the generation prompt unless the last message is Assistant(_)
(defaults to true when no messages exist).

Added unit tests; local test results after the change was applied:

(venv) ubuntu@workstation:/workspace$ python test_tool_call_simple.py
ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='call-72ec0ccd-828a-4bb8-88de-12b123684ee4', function=Function(arguments='{"location":"San Francisco, CA","format":"celsius"}', name='get_current_weather'), type='function')], reasoning_content=None)
ChatCompletionMessage(content='The current weather in San Francisco, CA is88 degrees Celsius.', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None, reasoning_content=None)

Details:

Change the behavior of should_add_generation_prompt so that, instead of returning true only when the last message is User(_), it returns true whenever the last message is not Assistant(_). This fixes issues such as bug #4013, where a conversation ending in a Tool(_) message caused the model to leak header IDs into the returned content.
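For illustration, a minimal self-contained sketch of the before/after behavior. The Msg enum and function names here are hypothetical stand-ins, not the actual types in oai.rs:

// Hypothetical stand-ins for the real request message types.
enum Msg {
    User(String),
    Assistant(String),
    Tool(String),
}

// Before: the prompt was added only when the last message was User(_)
// (or when there were no messages), so a trailing Tool(_) got no prompt.
fn should_add_generation_prompt_old(messages: &[Msg]) -> bool {
    matches!(messages.last(), Some(Msg::User(_)) | None)
}

// After: the prompt is added unless the last message is Assistant(_),
// which still defaults to true for an empty conversation.
fn should_add_generation_prompt_new(messages: &[Msg]) -> bool {
    !matches!(messages.last(), Some(Msg::Assistant(_)))
}

fn main() {
    let tool_last = vec![
        Msg::User("What's the weather?".into()),
        Msg::Assistant("Calling the weather tool.".into()),
        Msg::Tool("88 C".into()),
    ];
    assert!(!should_add_generation_prompt_old(&tool_last)); // old logic missed this case
    assert!(should_add_generation_prompt_new(&tool_last));  // new logic adds the prompt
    assert!(should_add_generation_prompt_new(&[]));         // empty history defaults to true
}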

Where should the reviewer start?

lib/llm/src/preprocessor/prompt/template/oai.rs

Related Issues:

  • Fixes GitHub issue: #4013

Summary by CodeRabbit

  • Bug Fixes

    • Improved prompt generation logic in chat completions to better handle different conversation message types, including assistant-generated messages.
  • Tests

    • Expanded test coverage for prompt generation across various message type scenarios to ensure proper handling in different conversation states.

coderabbitai bot commented Nov 6, 2025

Walkthrough

Modified the should_add_generation_prompt logic for NvCreateChatCompletionRequest to return true when the last message is not an Assistant message, or when no last message exists. Previously, it returned true only for User messages or missing messages. Tests expanded to cover new scenarios for Assistant, User, and Tool message endings.

Changes

Cohort / File(s): Prompt generation logic and test expansion (lib/llm/src/preprocessor/prompt/template/oai.rs)
Summary: Modified should_add_generation_prompt control flow to treat Assistant messages as blocking prompt generation. Added test helpers for User, Assistant, and Tool message construction. Introduced tests for the add_after_user, add_after_tool, no_after_assistant, and add_when_empty scenarios.
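For reference, a hedged sketch of what those four tests could look like, using a hypothetical unit-variant Msg enum in place of the real message types and dummy_state helper in oai.rs:

// Hypothetical stand-ins; the real tests build actual request messages
// via helper constructors (user(), asst(), tool()) and a dummy_state wrapper.
enum Msg { User, Assistant, Tool }

fn should_add_generation_prompt(messages: &[Msg]) -> bool {
    !matches!(messages.last(), Some(Msg::Assistant))
}

#[test]
fn add_after_user() {
    assert!(should_add_generation_prompt(&[Msg::User]));
}

#[test]
fn add_after_tool() {
    // The bug-fix case: a trailing Tool message must get a prompt.
    assert!(should_add_generation_prompt(&[Msg::User, Msg::Assistant, Msg::Tool]));
}

#[test]
fn no_after_assistant() {
    assert!(!should_add_generation_prompt(&[Msg::User, Msg::Assistant]));
}

#[test]
fn add_when_empty() {
    assert!(should_add_generation_prompt(&[]));
}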

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Area of focus: The logic change to should_add_generation_prompt inverts the condition from checking for User messages to checking for non-Assistant messages; verify this behavior aligns with intended chat completion semantics
  • Test coverage verification: Ensure the new test scenarios comprehensively cover the revised logic paths, particularly the Assistant message case which now behaves differently

Poem

🐰 A prompt flows true, when Assistants stay,
User whispers welcome, tools at play,
But when the rabbit speaks last with care,
No new prompt appears—just quiet air! ✨

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 53.85%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title accurately describes the main change: fixing a bug in the should_add_generation_prompt function related to multi-turn conversations.
  • Description check ✅ Passed: The pull request description includes all required sections from the template: Overview, Details, Where should the reviewer start, and Related Issues with proper action keywords.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/llm/src/preprocessor/prompt/template/oai.rs (1)

726-748: LGTM! Comprehensive test coverage for the bug fix.

The tests clearly validate the new behavior, particularly add_after_tool which directly addresses the bug where Tool messages need a generation prompt.

Consider adding tests for additional edge cases to further strengthen coverage:

  • System message as the last message
  • Multi-turn scenarios (e.g., vec![user(), asst(), tool()])

Example:

#[test]
fn add_after_system() {
    let sys = Msg::System(Default::default());
    let s = dummy_state(vec![sys]);
    assert!(s.should_add_generation_prompt());
}

#[test]
fn add_after_multiturn_with_tool() {
    let s = dummy_state(vec![user(), asst(), tool()]);
    assert!(s.should_add_generation_prompt());
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eedfc3d and 9d318c5.

📒 Files selected for processing (1)
  • lib/llm/src/preprocessor/prompt/template/oai.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2700
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:19-28
Timestamp: 2025-08-25T22:04:45.205Z
Learning: The response_generator() method exists on multiple request types in the codebase: NvCreateChatCompletionRequest (for chat completions) and NvCreateCompletionRequest (for text completions). When making signature changes, it's important to distinguish between these different object types as they have separate implementations and call sites.
🧬 Code graph analysis (1)
lib/llm/src/preprocessor/prompt/template/oai.rs (1)
lib/llm/src/preprocessor/prompt.rs (2)
  • messages (51-51)
  • should_add_generation_prompt (59-59)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang (arm64)
  • GitHub Check: sglang (amd64)
  • GitHub Check: operator (amd64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: tests (.)
  • GitHub Check: clippy (.)
🔇 Additional comments (3)
lib/llm/src/preprocessor/prompt/template/oai.rs (3)

165-177: LGTM! Fix correctly addresses the Tool message bug.

The new logic properly handles the case where a Tool message is last—previously the generation prompt was only added for User messages, causing the model to leak header IDs when processing tool results. Now the prompt is added for any non-Assistant final message.


300-300: LGTM! Type alias improves test readability.


708-724: LGTM! Well-designed test helpers.

The helper functions appropriately use Default::default() since should_add_generation_prompt only examines message type, not content.
