Refactor attention block smoothing for consistency #205
```diff
@@ -17,6 +17,35 @@
 import torch


+def block_smooth(
+    attention_mask: torch.Tensor,
+    key_len: int,
+    block_size: int,
+):
+    if block_size <= 0:
```
Suggested change:

```diff
-    if block_size <= 0:
+    if not isinstance(block_size, int) or block_size <= 0:
```
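The suggested guard can be sketched as a standalone check (the helper name `validate_block_size` is hypothetical; the PR inlines the condition in `block_smooth`). The point of the `isinstance` test is that a float such as `2.0` passes a bare `block_size <= 0` check but would break downstream integer block arithmetic:

```python
def validate_block_size(block_size) -> int:
    """Reject non-int or non-positive block sizes up front (hypothetical helper)."""
    # `2.0 <= 0` is False, so `block_size <= 0` alone lets float values
    # through; the isinstance test rejects them explicitly.
    if not isinstance(block_size, int) or block_size <= 0:
        raise ValueError(f"block_size must be a positive int, got {block_size!r}")
    return block_size
```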
**Copilot AI** · Nov 6, 2025

There is trailing whitespace on line 47. Remove the extra whitespace after the `return` statement.
**Copilot AI** · Nov 6, 2025

The `.to(torch.float)` conversion was removed from the `torch.topk` call. While this may be intentional to preserve the original dtype, it changes the existing behavior: if `attention_bias` is not already `torch.float`, this could affect numerical precision in the `topk` operation. Verify this is the intended behavior, or document why the dtype conversion was removed.

Suggested change:

```diff
-    attention_bias,
+    attention_bias.to(torch.float),
```
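The reviewer's concern can be reproduced in a few lines (illustrative values, not taken from the PR): scores that are distinct in `float32` can collapse to the same value in a lower-precision dtype such as `bfloat16`, at which point `torch.topk` can no longer separate them.

```python
import torch

# Two scores that are distinct in float32...
scores = torch.tensor([1.0, 1.001])

# ...but round to the same value in bfloat16 (spacing near 1.0 is 2**-7).
low = scores.to(torch.bfloat16)

# float32 topk unambiguously selects the larger score (index 1).
_, idx_fp32 = torch.topk(scores, k=1)

# After the bfloat16 round-trip the two entries compare equal, which is
# the precision hazard the review comment flags.
tied = bool(low[0] == low[1])
```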
**Copilot AI** · Nov 6, 2025

The newly introduced `block_smooth` function is missing a docstring. Add documentation explaining its purpose, parameters, and return value to maintain consistency with other functions in the module like `topk_mask` and `relu_mask`.
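One way the documented function could look. The docstring format follows the review's request; the body is a hypothetical block-wise smoothing (any attended position marks its whole block attended), since the PR diff above only shows the signature and the `block_size` guard:

```python
import torch


def block_smooth(
    attention_mask: torch.Tensor,
    key_len: int,
    block_size: int,
) -> torch.Tensor:
    """Smooth an attention mask at block granularity.

    Hypothetical sketch: if any position inside a block of ``block_size``
    keys is attended, the entire block is marked attended.

    Args:
        attention_mask: Boolean mask of shape ``(..., key_len)``.
        key_len: Number of key positions (size of the last dimension).
        block_size: Width of each smoothing block; must be a positive int.

    Returns:
        A boolean mask of shape ``(..., key_len)`` that is constant
        within each block.
    """
    if block_size <= 0:
        raise ValueError(f"block_size must be positive, got {block_size}")
    num_blocks = -(-key_len // block_size)  # ceiling division
    pad = num_blocks * block_size - key_len
    mask = attention_mask
    if pad:
        # Pad the key dimension with False so it divides evenly into blocks.
        mask = torch.cat([mask, mask.new_zeros(*mask.shape[:-1], pad)], dim=-1)
    blocks = mask.reshape(*mask.shape[:-1], num_blocks, block_size)
    # A block is attended if any position inside it is attended.
    smoothed = blocks.any(dim=-1, keepdim=True).expand_as(blocks)
    return smoothed.reshape(*mask.shape)[..., :key_len]
```

For example, a mask with a single attended key in the first block comes back with that whole block attended, while fully empty blocks stay empty.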