[cherry-pick] Upgrade CANN to 8.3.rc1 (#3945) #3962
wangxiyuan merged 1 commit into vllm-project:v0.11.0-dev
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Force-pushed from 452b4a0 to 922c8b9
Code Review
This pull request upgrades the CANN dependency from 8.2.rc1 to 8.3.rc1. The changes primarily involve updating version strings in Dockerfiles, documentation, and configuration files. Additionally, conditional logic based on the CANN version has been removed from the Python source code, making the 8.3-specific logic the default. While most changes are straightforward, I've identified an issue in one of the unit tests that appears to have been modified incorrectly, which could lead to it not properly testing the intended functionality.
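To illustrate the kind of change the review describes — a CANN version gate being removed so the 8.3-specific path becomes unconditional — here is a hypothetical sketch. The helper and path names are invented for illustration and do not appear in the actual diff:

```python
def cann_at_least(ver: str, major: int, minor: int) -> bool:
    # Hypothetical helper: compare only the numeric major.minor prefix,
    # e.g. "8.3.rc1" -> (8, 3).
    parts = ver.split(".")
    return (int(parts[0]), int(parts[1])) >= (major, minor)

def choose_path(cann_version: str) -> str:
    # Before the upgrade (sketch): branch on the installed CANN version.
    if cann_at_least(cann_version, 8, 3):
        return "cann83_path"
    return "legacy_path"

def choose_path_after() -> str:
    # After the upgrade: the 8.3 behavior is the default, no gate needed.
    return "cann83_path"
```

Dropping the gate removes a runtime branch and the maintenance burden of keeping the legacy path working, at the cost of requiring CANN 8.3.rc1 or newer.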
```diff
 self.assertEqual(attn_mask[0][-1], torch.tensor(True))
 self.assertEqual(attention_mask_builder._seq_len_cached, 1024)
 self.assertEqual(attention_mask_builder.attn_mask_cache.shape,
-                 (2048, 2048))
+                 (1024, 1024))
```
The assertions in this part of the test appear to be incorrect. The get_attn_mask method calls _update_attn_cache, which should update _seq_len_cached to 2048 when called with max_seq_len=2048 (since it was initialized with 1024). However, the test asserts that _seq_len_cached remains 1024. This contradicts the implementation of _update_attn_cache and the comment on line 73.
Additionally, attn_mask[0][-1] is asserted to be torch.tensor(True). For dtype=torch.float16, _generate_attn_mask creates a mask with float("-inf"). This assertion also seems incorrect.
If _update_attn_cache does not update as the test suggests, get_attn_mask would try to slice [:2048, :2048] from a (1024, 1024) tensor, which would raise an error. The test seems to be asserting an impossible state. It appears the original test logic was correct and should be restored.
```python
self.assertEqual(attn_mask[0][-1],
                 torch.tensor(float("-inf"), dtype=torch.float16))
self.assertEqual(attention_mask_builder._seq_len_cached, 2048)
self.assertEqual(attention_mask_builder.attn_mask_cache.shape,
                 (2048, 2048))
```

Force-pushed from 94f8ff1 to d21af27
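To make the reviewer's reasoning concrete, here is a minimal sketch of a grow-only attention-mask cache with the behavior described above. The class and method names mirror the test, but the body is an assumption for illustration, not the actual vllm-ascend implementation:

```python
import torch

class AttentionMaskBuilder:
    """Hypothetical sketch of a grow-only causal-mask cache; not the
    actual vllm-ascend implementation."""

    def __init__(self, max_seq_len: int, dtype: torch.dtype):
        self._seq_len_cached = max_seq_len
        self.attn_mask_cache = self._generate_attn_mask(max_seq_len, dtype)

    @staticmethod
    def _generate_attn_mask(max_seq_len: int, dtype: torch.dtype) -> torch.Tensor:
        # Mask the strictly upper triangle; float dtypes use -inf.
        masked = torch.ones(max_seq_len, max_seq_len, dtype=torch.bool).triu_(1)
        if dtype == torch.bool:
            return masked
        mask = torch.zeros(max_seq_len, max_seq_len, dtype=dtype)
        return mask.masked_fill_(masked, float("-inf"))

    def _update_attn_cache(self, seqlen: int, dtype: torch.dtype) -> None:
        # The cache only grows: a longer request regenerates it.
        if seqlen > self._seq_len_cached:
            self._seq_len_cached = seqlen
            self.attn_mask_cache = self._generate_attn_mask(seqlen, dtype)

    def get_attn_mask(self, max_seq_len: int, dtype: torch.dtype,
                      device: torch.device) -> torch.Tensor:
        self._update_attn_cache(max_seq_len, dtype)
        # Slicing [:2048, :2048] from a (1024, 1024) cache would raise,
        # which is why the modified test asserts an impossible state.
        return self.attn_mask_cache[:max_seq_len, :max_seq_len].to(device)
```

Under these assumptions, a builder initialized with `max_seq_len=1024` and queried with `max_seq_len=2048` must end up with `_seq_len_cached == 2048` and a `(2048, 2048)` cache, matching the original assertions.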
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Force-pushed from d21af27 to 338f3b4
…t#3962) This PR upgrades CANN from 8.2rc1 to 8.3rc1 and removes the CANN version check logic.

TODO: we noticed that the UT runs fail with the CANN 8.3 image, so the base image for UT is still 8.2. We'll fix it later.

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>