Skip to content

Conversation

ZhaoJiangJiang
Copy link
Contributor

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the error handling for when both DBO and speculative decoding are enabled, moving the check to an earlier stage in the configuration validation. This is a good improvement for user experience. However, a critical typo was introduced in vllm_ascend/platform.py that will cause a runtime error and needs to be fixed.

if enforce_eager or compilation_config.level == CompilationLevel.NO_COMPILATION:
logger.info("Compilation disabled, using eager mode by default")
compilation_config.level = CompilationLevel.NO_COMPILATION
elif compilation_config.level != CompilationLevel.PIECEWISE:
logger.warning(
"NPU does not support %s compilation level. Setting level to NO_COMPILATION",
compilation_config.level)
compilation_config.level = CompilationLevel.NO_COMPILATION
compilation_config.level = CompilationLevel.NfvO_COMPILATION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There appears to be a typo in the compilation level. NfvO_COMPILATION should be NO_COMPILATION. This will likely cause an AttributeError at runtime.

Suggested change
compilation_config.level = CompilationLevel.NfvO_COMPILATION
compilation_config.level = CompilationLevel.NO_COMPILATION

Signed-off-by: 赵江江 <[email protected]>
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants