[Refactor]: Phase1 for rebasing_additional_info#1394
tzhouam merged 18 commits into vllm-project:main
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
444f1f4 to 23b2308
vllm_omni/worker/gpu_model_runner.py
    try:
        if getattr(new_req_data, "additional_information", None) is not None:
            warnings.warn(
                "additional_information on request data is deprecated, use model_intermediate_buffer",
These deprecation warnings fire on every request with the old field name. In high-throughput scenarios that could get noisy — worth gating with a _compat_warned flag so they only fire once per process.
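A minimal sketch of the once-per-process gating suggested here. The helper name `warn_deprecated_field_once` is hypothetical, and `DeprecationWarning` is an assumed category since the snippet above is truncated before the category argument; only the `_compat_warned` flag name and the message text come from this thread.

```python
import warnings

# Module-level flag, named after the suggestion in the review comment.
_compat_warned = False


def warn_deprecated_field_once(new_req_data) -> bool:
    """Warn at most once per process when the old field is present.

    Returns True if the request carries the deprecated field.
    This helper is illustrative and not part of the PR.
    """
    global _compat_warned
    if getattr(new_req_data, "additional_information", None) is None:
        return False
    if not _compat_warned:
        # Flip the flag first so later requests skip the warnings call entirely.
        _compat_warned = True
        warnings.warn(
            "additional_information on request data is deprecated, "
            "use model_intermediate_buffer",
            DeprecationWarning,  # assumed category
            stacklevel=2,
        )
    return True
```

With the flag, every request after the first pays only for a `getattr` and a boolean check, regardless of how the warning filters are configured.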
@lishunyang12 Doesn't the default warnings.warn filter handle this by itself? I think it usually deduplicates by filename and line number.
Or is the concern that the filter only suppresses the output, while the call itself still runs on every request, which could be a problem in high-throughput situations?
Either way, I think it's better to add a flag for clarity.
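The question above can be checked with a small standalone experiment. This is a sketch, not code from the PR: `emit_deprecation` is a hypothetical stand-in for the per-request check, and the `"default"` filter action is set explicitly because Python's built-in filters ignore `DeprecationWarning` raised outside `__main__` unless configured otherwise.

```python
import warnings

calls = 0


def emit_deprecation():
    """Hypothetical stand-in for the per-request deprecation check."""
    global calls
    calls += 1
    warnings.warn("old field is deprecated", DeprecationWarning)


with warnings.catch_warnings(record=True) as shown:
    # "default" reports the first occurrence for each (module, line number).
    warnings.simplefilter("default")
    for _ in range(1000):
        emit_deprecation()

print(f"calls={calls}, warnings shown={len(shown)}")
```

The filter deduplicates what is reported, but `warnings.warn` is still entered on every call and has to consult its registry and filter list each time, which is the per-request cost an explicit flag avoids.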
cc: @tzhouam
PR Review
This PR implements Phase 1 of RFC #1351 to rebase
Summary of Changes
Strengths
Blocking Issues
Non-blocking Concerns
Verdict
Conditional Approve. The Phase 1 implementation is solid and well-aligned with the RFC. Please address the missing e2e test results before merging.
Qwen 2.5 omni may need similar modifications
PR Review: Phase 1 for rebasing additional_info
This PR implements Phase 1 of RFC #1351 to rebase
Summary of Changes
Strengths
Inline Comments Posted
See specific line-level comments for:
Non-blocking Concerns
Verdict
Conditional Approve. The Phase 1 implementation is solid and well-aligned with the RFC. Please address the typo and run the missing e2e tests before merging.
Note: Once Phase 2 and Phase 3 are complete and the old paths are removed, the dual-write overhead and alias naming will become moot.
@vllm-omni-reviewer
Qwen 2.5 doesn't use runtime_additional_info directly, so it doesn't need that change; its information is passed only through info_dict, which the current internal code should already handle. Qwen3 receives some information from info_dict and some from runtime_additional_info, so I had to add an if/else. Both models have "dead" additional_info arguments in their forward() calls, but I think we can remove them in the second stage.
This reverts commit 57814cb. Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
8847d35 to 69407da
@amy-why-3459 @R2-Y Please review this PR in case it affects the async chunk, since the runtime additional information will not be passed to the scheduler in the future design.
Signed-off-by: Zhou Taichang <tzhouam@connect.ust.hk>
tzhouam left a comment
I have added some comments here.
Please see if they help.
Also, please resolve the conflicts; then we can merge.
60a101d to 66ba84a
Purpose
#1351
Test Plan
Test Result