[bugfix]fixed block_size incorrect setting issue in dsv3.2 #7630
base: main
Changes from all commits
25c34fe
baf0dd3
e402ae1
c3ab361
367f004
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -1097,12 +1097,12 @@ def refresh_block_size(vllm_config): | |
| if not scheduler_config or not model_config: | ||
| return | ||
| # TODO(MengqingCao): Remove the model_type check, after resolving the hidden error in get_kv_cache_groups. | ||
| if ( | ||
| "qwen3_next" not in model_config.hf_text_config.model_type | ||
| and "qwen3_5" not in model_config.hf_text_config.model_type | ||
| and cache_config.block_size != 128 | ||
| ): | ||
| if model_config.is_hybrid: | ||
| # Hybrid attention+mamba models rely on the model-specific sizing | ||
| # logic rather than the generic platform default. | ||
| return | ||
|
Comment on lines +1100 to +1103
Contributor
    if cache_config.user_specified_block_size:
        # User specified --block-size; keep it.
        return
    if model_config.is_hybrid:
        # Hybrid attention+mamba models rely on the model-specific sizing
        # logic rather than the generic platform default.
        return
| if cache_config.block_size != 128: | ||
| if cache_config.enable_prefix_caching or scheduler_config.enable_chunked_prefill: | ||
| logger.info("Block size is set to 128 if prefix cache or chunked prefill is enabled.") | ||
| cache_config.block_size = 128 | ||
The `update_block_size_for_backend` method has been refactored to a `pass` statement with a `TODO`. While the `is_hybrid` model logic has been correctly moved to `refresh_block_size` (which is called by `check_and_update_config`), the critical check for `cache_config.user_specified_block_size` has been removed from this call path. This omission means that user-defined block sizes might be unintentionally overridden, leading to unexpected behavior. The `TODO` also highlights that the block size selection logic is not yet fully centralized, indicating an incomplete refactoring.
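
For illustration, here is a minimal, self-contained sketch of the guard order this comment argues for. The three dataclasses are hypothetical stand-ins that carry only the attributes named in this thread, not the real vLLM config classes, and `refresh_block_size_sketch` takes the configs as separate arguments instead of a single `vllm_config` to keep the example short.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the real config objects; only the
# attributes discussed in this review thread are modelled.
@dataclass
class CacheConfig:
    block_size: int = 16
    user_specified_block_size: bool = False
    enable_prefix_caching: bool = False


@dataclass
class SchedulerConfig:
    enable_chunked_prefill: bool = False


@dataclass
class ModelConfig:
    is_hybrid: bool = False


def refresh_block_size_sketch(cache_config, scheduler_config, model_config):
    """Illustrative guard order only, not the project's implementation."""
    if not scheduler_config or not model_config:
        return

    if cache_config.user_specified_block_size:
        # User specified --block-size; keep it.
        return

    if model_config.is_hybrid:
        # Hybrid attention+mamba models rely on model-specific sizing.
        return

    if cache_config.block_size != 128:
        if cache_config.enable_prefix_caching or scheduler_config.enable_chunked_prefill:
            cache_config.block_size = 128
```

With this ordering, a user-supplied block size of 64 stays at 64 even when prefix caching is enabled, while an unspecified block size on a non-hybrid model is raised to 128 once prefix caching or chunked prefill is turned on.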