feat: add flags for multi_round pipeline to return logprob. by DragonFive · Pull Request #993 · jd-opensource/xllm

DragonFive · 2026-03-04T08:12:44Z

PR Description

Summary

This PR adds optional REC logprob output for pure-device multi-round beam results on top of main.

Background

main already has the beam ranking order fix (descending by beam logprob).
The remaining gap was REC multi-round output logprobs in generate_multi_round_output.

What changed

Added a new global flag:

FLAGS_output_rec_logprobs (default: false)

Updated REC multi-round output behavior in SequencesGroup::generate_multi_round_output:

If FLAGS_output_rec_logprobs == true:
- Return a full logprobs list aligned with token_ids
- Fill missing per-token logprobs with last_lps[beam_idx]
If FLAGS_output_rec_logprobs == false:
- Do not return out.logprobs
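
The two branches above can be sketched as follows. This is a minimal illustration, not the code from sequences_group.cpp: `BeamOutput` and `build_beam_output` are hypothetical names, and the flag is a plain `bool` stand-in for the real one declared in global_flags.h. `last_lp` plays the role of `last_lps[beam_idx]`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Stand-in for the real flag (assumption: the actual flag is wired through
// DECLARE/DEFINE macros in global_flags.h/.cpp). Default matches the PR: false.
bool FLAGS_output_rec_logprobs = false;

// Hypothetical, simplified output record; not the real xllm type.
struct BeamOutput {
  std::vector<int32_t> token_ids;
  std::optional<std::vector<float>> logprobs;  // absent when the flag is off
};

// Hypothetical helper that builds the output for one beam.
// `known` holds whatever per-token logprobs are available (entries may be
// missing, and it may be shorter than token_ids).
BeamOutput build_beam_output(const std::vector<int32_t>& token_ids,
                             const std::vector<std::optional<float>>& known,
                             float last_lp) {
  BeamOutput out;
  out.token_ids = token_ids;
  if (!FLAGS_output_rec_logprobs) {
    return out;  // flag off: no logprobs are returned
  }
  std::vector<float> lps;
  lps.reserve(token_ids.size());
  for (std::size_t i = 0; i < token_ids.size(); ++i) {
    const bool has_value = i < known.size() && known[i].has_value();
    // Missing entries fall back to the final beam logprob.
    lps.push_back(has_value ? *known[i] : last_lp);
  }
  out.logprobs = std::move(lps);
  // Invariant promised by the PR: the list is aligned with token_ids.
  assert(out.logprobs->size() == out.token_ids.size());
  return out;
}
```

With the flag enabled, a call such as `build_beam_output({1, 2, 3}, {}, -0.5f)` would return three logprobs, all equal to the final beam logprob, because no per-token values were supplied.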

Kept beam ranking logic unchanged.

Files changed

xllm/core/common/global_flags.h
xllm/core/common/global_flags.cpp
xllm/core/framework/request/sequences_group.cpp

Behavior matrix

output_rec_logprobs=false (default): no REC multi-round out.logprobs
output_rec_logprobs=true: token-aligned out.logprobs; missing per-token entries are filled with the final beam logprob (last_lps[beam_idx])

Compatibility / performance

Default is false, so existing behavior remains unchanged unless explicitly enabled.
Enabling the flag increases output payload and per-output construction work proportionally to token count.

Validation

Verified symbol wiring and code paths (DECLARE/DEFINE/usage).
Verified only target files were modified for this feature.
A full build and test run was not performed for this patch.

xllm/core/common/global_flags.cpp

gemini-code-assist

Code Review

This pull request introduces a new flag, output_rec_logprobs, to enable the output of token-aligned log probabilities for multi-round recommendation pipelines. While the feature is a good addition, the current implementation has a couple of issues. First, it uses an incorrect default value for log probabilities in an edge case, which could lead to misleading results. Second, there is a performance concern due to inefficient token decoding within a loop. I have provided specific comments and code suggestions to address these points.

xllm/core/framework/request/sequences_group.cpp

feat: add flags for multi_round pipeline to return logprobs.

a01291e

DragonFive requested review from DongheJin, JimHsiung, RobbieLeung, XuZhang99, liutongxuan, walsonyang and yq33victor as code owners March 4, 2026 08:12

XuZhang99 reviewed Mar 4, 2026

xllm/core/common/global_flags.cpp (resolved)

gemini-code-assist bot reviewed Mar 4, 2026

xllm/core/framework/request/sequences_group.cpp (outdated, resolved)

xllm/core/framework/request/sequences_group.cpp (outdated, resolved)

fix: fix some comment.

72fbde0

XuZhang99 approved these changes Mar 4, 2026

DragonFive wants to merge 2 commits into jd-opensource:main from DragonFive:feat/multiround-complete-logprobs-flag

2 participants
