
Commit 231a3e8

Remove unneeded flex_decoding.patch (#4962)
The removed patch made the following change to Inductor's flex-decoding block sizing:

```diff
-max(
-    next_power_of_2(
-        V.graph.sizevars.size_hint(
-            seq_len_q,
-            fallback=torch._inductor.config.unbacked_symint_fallback,  # type: ignore[arg-type]
-        )
-        * gqa_shared_heads
-    ),
-    1 if torch.xpu.is_available() else 16,
+next_power_of_2(
+    V.graph.sizevars.size_hint(
+        seq_len_q,
+        fallback=torch._inductor.config.unbacked_symint_fallback,  # type: ignore[arg-type]
+    )
+    * gqa_shared_heads
```

On XPU, the results are equivalent: `max(next_power_of_2(x), 1) == next_power_of_2(x)`.

CI: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/17266774537/job/49000662056 (no regression)

Signed-off-by: Whitney Tsang <[email protected]>
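The equivalence holds because on XPU the upstream clamp floor is 1, and `next_power_of_2` of a positive size is always at least 1, so the `max` is a no-op. A minimal sketch illustrating this, using a local `next_power_of_2` assumed to match Triton-style semantics for positive inputs (not the actual Inductor helper):

```python
def next_power_of_2(n: int) -> int:
    # Smallest power of two >= n, for n >= 1 (local stand-in, an assumption,
    # not the Inductor/Triton implementation).
    return 1 << (n - 1).bit_length()

# For any positive size, clamping the result with max(..., 1) changes nothing,
# which is why the patched XPU branch behaved identically to upstream.
for seq_len in (1, 2, 3, 7, 64, 1000):
    p = next_power_of_2(seq_len)
    assert max(p, 1) == p  # max(next_power_of_2(x), 1) == next_power_of_2(x)
```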
1 parent bfbdc55 · commit 231a3e8

File tree: 2 files changed (+0, −27 lines)


scripts/patch-pytorch.sh

Lines changed: 0 additions & 1 deletion

```diff
@@ -36,4 +36,3 @@ echo "Applying PyTorch patches in $REPO_ROOT"
 
 # put your patch applies here
 apply_patch ./patch/flex_attn_143553.patch
-apply_patch ./patch/flex_decoding.patch
```

scripts/patch/flex_decoding.patch

Lines changed: 0 additions & 26 deletions
This file was deleted.
