28 changes: 28 additions & 0 deletions tests/unittest/llmapi/apps/_test_openai_completions.py
@@ -204,6 +204,34 @@ async def test_batch_completions_streaming(async_client: openai.AsyncOpenAI,
    assert texts[0] == texts[1]


@pytest.mark.asyncio(loop_scope="module")
@pytest.mark.parametrize("prompts",
                         [["Hello, my name is"] * 2, [[0, 0, 0, 0, 0]] * 2])
async def test_batch_completions_beam_search_streaming(
Collaborator:
Is this test executed in pre-merge or post-merge? I'm trying to follow the logic through the nested ways tests get invoked - will the e2e test at https://github.com/NVIDIA/TensorRT-LLM/blob/main/tests/integration/defs/test_e2e.py#L1624 end up calling this one?

Collaborator Author:
It's on the pre-merge pipeline.

1. e2e test function:
   def test_openai_completions_example(llm_root, llm_venv, backend: str)
2. Test list:
   - test_e2e.py::test_openai_completions_example[pytorch]
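
To make the invocation path concrete, here is a minimal sketch of what such an e2e wrapper typically looks like. This is an illustration only, not the real body of test_openai_completions_example; the exact use of the llm_root/llm_venv fixtures and the run_cmd helper is an assumption based on the reference above.

import os

# Hypothetical sketch: shells out to pytest on the unittest file inside the
# project virtualenv. The real wrapper lives in
# tests/integration/defs/test_e2e.py and may differ in detail.
def test_openai_completions_example(llm_root, llm_venv, backend: str):
    test_file = os.path.join(llm_root, "tests", "unittest", "llmapi", "apps",
                             "_test_openai_completions.py")
    llm_venv.run_cmd(["-m", "pytest", test_file, "-k", backend])
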

        async_client_with_beam_search: openai.AsyncOpenAI, model_name, prompts):
    # test beam search with streaming
    batch = await async_client_with_beam_search.completions.create(
        model=model_name,
        prompt=prompts,
        n=2,
        max_tokens=5,
        temperature=0.0,
        stream=True,
        extra_body=dict(use_beam_search=True),
    )
    texts = [""] * 4  # 2 prompts × 2 beams = 4 choices
Contributor:

⚠️ Potential issue | 🟡 Minor

Replace Unicode multiplication sign with ASCII 'x'.

The comment uses × (Unicode MULTIPLICATION SIGN) instead of ASCII x. For consistency and to avoid potential encoding issues, use the ASCII character.

Apply this diff:

-    texts = [""] * 4  # 2 prompts × 2 beams = 4 choices
+    texts = [""] * 4  # 2 prompts x 2 beams = 4 choices

Based on static analysis hints.

🧰 Tools
🪛 Ruff (0.14.5)

222-222: Comment contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF003)

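
As a standalone illustration (not part of the PR), a small script in the spirit of what RUF003 flags; the find_non_ascii helper is hypothetical:

import unicodedata

# Hypothetical helper illustrating Ruff's RUF003 check: surface non-ASCII
# characters hiding in a source line, with their Unicode names.
def find_non_ascii(line: str):
    return [(i, ch, unicodedata.name(ch, "UNKNOWN"))
            for i, ch in enumerate(line) if ord(ch) > 127]

print(find_non_ascii('texts = [""] * 4  # 2 prompts × 2 beams = 4 choices'))
# -> [(30, '×', 'MULTIPLICATION SIGN')]
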

    async for chunk in batch:
        assert len(chunk.choices) == 1
        choice = chunk.choices[0]
        texts[choice.index] += choice.text

    # Verify beam search produces different outputs for different beams
Collaborator:
LGTM; however, https://nvbugs/5625743 does not pass use_beam_search=True (the default is False).

So, ideally we could parametrize the test to also cover use_beam_search=False. For that case, in order not to hit this check, you could specify a very small (but non-zero) temperature. The assertion at L229 (texts[0] != texts[1]) would then need to be skipped, since decoding is greedy-like for temperature -> 0 (unless logits are tied). See the sketch below.
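
A minimal sketch of that parametrization, assuming the same fixtures as the test under review (illustrative only, not committed code; the test name and the 1e-4 temperature are placeholders):

@pytest.mark.asyncio(loop_scope="module")
@pytest.mark.parametrize("use_beam_search", [True, False])
async def test_batch_completions_streaming_modes(
        async_client_with_beam_search: openai.AsyncOpenAI, model_name,
        use_beam_search: bool):
    # Near-zero temperature keeps the sampling path valid when beam search
    # is off; the exact value may need tuning.
    batch = await async_client_with_beam_search.completions.create(
        model=model_name,
        prompt=["Hello, my name is"] * 2,
        n=2,
        max_tokens=5,
        temperature=0.0 if use_beam_search else 1e-4,
        stream=True,
        extra_body=dict(use_beam_search=use_beam_search),
    )
    texts = [""] * 4  # 2 prompts x 2 beams/samples = 4 choices
    async for chunk in batch:
        texts[chunk.choices[0].index] += chunk.choices[0].text
    if use_beam_search:
        # Near-greedy sampling may tie the two choices, so only assert
        # distinct outputs when beam search is on.
        assert texts[0] != texts[1]
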

    assert texts[0] != texts[1], "beam search should produce different outputs"
    # Verify the two copies of the same prompt produce the same beams
    assert texts[0] == texts[2], "same prompt should produce same first beam"
    assert texts[1] == texts[3], "same prompt should produce same second beam"


@pytest.mark.asyncio(loop_scope="module")
async def test_completion_stream_options(async_client: openai.AsyncOpenAI,
                                         model_name: str):