@LoserCheems
Collaborator

Introduce selectable masking strategies for attention mechanisms, enabling both top-k and ReLU flows. This allows experimentation with bias-driven sparsity. The top-k path is also normalized, and unsupported mask types are now rejected to prevent silent misuse.

Introduces selectable masking strategies to support both top-k and ReLU flows, enabling experimentation with bias-driven sparsity.
Normalizes the top-k path to use detached bias casting and rejects unsupported mask types to avoid silent misuse.
Copilot AI review requested due to automatic review settings November 6, 2025 05:03
@LoserCheems LoserCheems merged commit f147ff1 into main Nov 6, 2025
3 of 4 checks passed
Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the mask generation functionality to support multiple mask types. It renames dynamic_mask to topk_mask, introduces a new relu_mask function, and adds a type parameter to create_mask to select between different masking strategies.

  • Renamed dynamic_mask function to topk_mask for clarity
  • Added new relu_mask function implementing ReLU-based masking
  • Updated create_mask to support both "topk" and "relu" mask types via a new type parameter
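The dispatch described above can be sketched roughly as follows. This is a framework-free illustration using plain Python lists rather than tensors, not the code from this PR. The function bodies are assumptions: `topk_mask` is taken to keep the `window_size` largest bias entries per row, `relu_mask` is taken to keep strictly positive bias entries, and the `attention_mask` argument that the real functions accept is left out for brevity. Only the names `topk_mask`, `relu_mask`, `create_mask`, `window_size`, and the `type` values `"topk"` and `"relu"` come from the PR.

```python
from typing import List

Bias = List[List[float]]
Mask = List[List[bool]]


def topk_mask(attention_bias: Bias, window_size: int) -> Mask:
    """Assumed behavior: keep the window_size largest bias entries in each row."""
    mask = []
    for row in attention_bias:
        k = min(window_size, len(row))
        # sorted() is stable, so ties go to the lower index.
        order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        keep = set(order[:k])
        mask.append([i in keep for i in range(len(row))])
    return mask


def relu_mask(attention_bias: Bias) -> Mask:
    """Assumed behavior: keep entries where relu(bias) > 0, i.e. bias > 0."""
    return [[b > 0.0 for b in row] for row in attention_bias]


def create_mask(attention_bias: Bias, window_size: int, type: str = "topk") -> Mask:
    """Select a masking strategy; `type` mirrors the parameter name used in the PR."""
    if type == "topk":
        return topk_mask(attention_bias, window_size)
    if type == "relu":
        # relu_mask takes no window_size in this sketch, so none is forwarded.
        return relu_mask(attention_bias)
    # Reject unknown strategies instead of silently falling back to one.
    raise ValueError(f"Unsupported mask type: {type!r}")
```

Raising on an unknown `type` is one way to get the "reject unsupported mask types" behavior the description mentions; the exception class used in the actual change is not shown in this conversation.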

attention_mask = relu_mask(
    attention_bias=attention_bias,
    attention_mask=attention_mask,
    window_size=window_size,
Copilot AI Nov 6, 2025

The relu_mask function does not accept a window_size parameter, but one is passed here. Unless relu_mask takes arbitrary keyword arguments, this call will raise a TypeError (unexpected keyword argument) whenever type == 'relu' is used. Either remove window_size from this function call, or update the relu_mask signature to accept and use it.

Suggested change (delete this line):
    window_size=window_size,
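The failure mode flagged in this comment can be reproduced in isolation. The snippet below is a minimal sketch: `relu_mask` here is a hypothetical stand-in whose signature omits `window_size`, as the review describes, and its body is an assumption rather than the PR's implementation.

```python
def relu_mask(attention_bias, attention_mask=None):
    """Hypothetical stand-in with no window_size parameter."""
    return [[b > 0.0 for b in row] for row in attention_bias]


def call_with_window_size():
    """Mimic the flagged call; return the error message if Python rejects it."""
    try:
        relu_mask(attention_bias=[[0.5, -1.0]], attention_mask=None, window_size=2)
    except TypeError as exc:
        return str(exc)
    return None


def call_without_window_size():
    """Mimic the call after applying the suggested change."""
    return relu_mask(attention_bias=[[0.5, -1.0]], attention_mask=None)
```

With the extra keyword argument, Python raises a TypeError at call time, so the problem only surfaces once the 'relu' branch is actually exercised. A test that calls each supported mask type at least once would catch this class of mistake.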

9 participants