Update CuTe namespace and enhance dependencies #262
Pull request overview
This PR updates the vendored CuTe/FlashAttention-4 integration to live under the flash_sparse_attn.ops.cute namespace, updates packaging metadata (Python >= 3.10 and new optional deps), and improves the subtree sync scripts by adding namespace-rewrite and a temporary-worktree flow for dirty repos.
Changes:
- Refactor CuTe Python sources to import via flash_sparse_attn.ops.cute instead of flash_attn.cute.
- Add a rewrite_cute_namespace.py script and integrate it into the subtree sync scripts (bash + PowerShell), including a temporary worktree mode.
- Update pyproject.toml to require Python >= 3.10 and add a cute optional-dependency extra.
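
For orientation, the namespace rewrite described above can be sketched in Python as follows. This is a minimal illustration, not the contents of rewrite_cute_namespace.py: the two namespace strings come from the PR description, while the function name `rewrite_imports`, the regular expression, and the choice to touch only `import`/`from` statements are assumptions.

```python
import re

# Namespaces taken from the PR description.
OLD_NS = "flash_attn.cute"
NEW_NS = "flash_sparse_attn.ops.cute"

# Match the old namespace only when it directly follows `import` or `from`
# at the start of a (possibly indented) line, and only as a complete dotted
# prefix, so that a sibling package such as `flash_attn.cute_extra` is left alone.
_IMPORT_RE = re.compile(
    r"^(\s*(?:from|import)\s+)" + re.escape(OLD_NS) + r"(?=[\s.,]|$)",
    re.MULTILINE,
)


def rewrite_imports(source: str) -> str:
    """Return `source` with vendored CuTe imports pointed at the local namespace.

    Hypothetical helper: it does not rewrite module names that appear inside
    string literals (for example, arguments to importlib.import_module).
    """
    return _IMPORT_RE.sub(lambda m: m.group(1) + NEW_NS, source)


if __name__ == "__main__":
    sample = (
        "from flash_attn.cute import utils\n"
        "import flash_attn.cute.softmax as sm\n"
    )
    print(rewrite_imports(sample))
```

Because rewritten lines no longer start with the old namespace, running the function a second time leaves the text unchanged, which is a useful property for a step that runs on every subtree sync.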
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| scripts/sync_cute_subtree.sh | Adds temporary-worktree sync path, CuTe import rewrite step, and improved reporting. |
| scripts/sync_cute_subtree.ps1 | PowerShell equivalent of the enhanced sync workflow with rewrite + temporary worktree support. |
| scripts/rewrite_cute_namespace.py | New helper to rewrite vendored CuTe Python imports to the local namespace. |
| pyproject.toml | Bumps minimum Python version and adds cute + enhanced dev optional dependencies. |
| flash_sparse_attn/ops/cute/__init__.py | Updates distribution version lookup and patches compile helper import path. |
| flash_sparse_attn/ops/cute/tile_scheduler.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/softmax.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/paged_kv.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/pack_gqa.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/mask.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/interface.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_fwd.py | Rewrites internal imports and updates SM90 lazy import to new namespace. |
| flash_sparse_attn/ops/cute/flash_fwd_sm90.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_fwd_sm100.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_fwd_sm120.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_fwd_combine.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_bwd.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_bwd_sm90.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_bwd_sm100.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_bwd_sm120.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_bwd_preprocess.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/flash_bwd_postprocess.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/compute_block_sparsity.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/block_sparsity.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/block_sparse_utils.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/block_info.py | Rewrites internal imports to the new CuTe namespace. |
| flash_sparse_attn/ops/cute/blackwell_helpers.py | Rewrites internal imports to the new CuTe namespace. |

    echo "Done."
    echo "Upstream source: $UPSTREAM_REPO"
    echo "Upstream cache used for subtree split: $UPSTREAM_REPO_FOR_SPLIT"
    echo "Synced commit range: $SYNC_START_HEAD -> $SYNC_END_HEAD"

The reported synced commit range can be inaccurate when syncing via a temporary worktree. In particular, if the rewrite commit is skipped (because there are existing local changes under $PREFIX) then SYNC_END_HEAD is still taken from the temporary worktree’s HEAD, even though that commit wasn’t cherry-picked back. Consider recomputing SYNC_START_HEAD/SYNC_END_HEAD from the current worktree after cherry-picking (or have invoke_temporary_worktree_sync update these vars based on what was actually applied).

    if [[ -n "$(git -C "$work_repo_root" status --porcelain -- "$PREFIX")" ]]; then
      ensure_git_identity "$work_repo_root"
      invoke_git -C "$work_repo_root" add -- "$PREFIX"
      invoke_git -C "$work_repo_root" commit -m "Rewrite vendored CuTe namespace to flash_sparse_attn.ops.cute"

REWRITE_COMMIT_MESSAGE is defined and used to identify the rewrite commit later, but the commit here hard-codes the message string. If the message ever changes, the filter logic can silently break. Use the shared REWRITE_COMMIT_MESSAGE value when creating the commit to keep these paths consistent.
Suggested change:

    - invoke_git -C "$work_repo_root" commit -m "Rewrite vendored CuTe namespace to flash_sparse_attn.ops.cute"
    + invoke_git -C "$work_repo_root" commit -m "$REWRITE_COMMIT_MESSAGE"

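
The reason for sharing one constant can be shown with a small language-neutral sketch, written here in Python. It assumes the later filter compares each commit subject against the rewrite message to decide which commits to cherry-pick back. The helper name `commits_to_cherry_pick` and the `(sha, subject)` tuples are hypothetical and are not taken from the sync scripts.

```python
# Single source of truth for the rewrite commit subject (string from the diff above).
REWRITE_COMMIT_MESSAGE = "Rewrite vendored CuTe namespace to flash_sparse_attn.ops.cute"


def commits_to_cherry_pick(commits):
    """Return the SHAs to apply, skipping the namespace-rewrite commit.

    `commits` is a list of (sha, subject) pairs. The comparison is an exact
    match against the shared constant, so a commit created with a different
    hard-coded subject would not be skipped.
    """
    return [sha for sha, subject in commits if subject != REWRITE_COMMIT_MESSAGE]


if __name__ == "__main__":
    history = [
        ("a1", "Sync upstream CuTe sources"),
        ("b2", REWRITE_COMMIT_MESSAGE),
        ("c3", "Rewrite vendored CuTe namespace"),  # drifted subject, not skipped
    ]
    print(commits_to_cherry_pick(history))
```

When the commit is created from the same constant that the filter reads, the two sides cannot drift apart. With a hard-coded string on one side, editing either copy changes which commits are skipped without raising any error.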

    --cache-dir <path> Local cache path used when upstream-repo is a URL.
    --skip-fetch Skip git fetch origin in the upstream cache.
    --keep-temp-branch Keep the temporary split branch for debugging.
    --no-temporary-worktree Fail instead of using a temporary worktree when the current tree is dirty.

The new --no-temporary-worktree option line in the usage output is mis-indented compared to the other options, which makes the help text harder to read. Align the indentation with the rest of the option list.
Suggested change (whitespace only):

    - --no-temporary-worktree Fail instead of using a temporary worktree when the current tree is dirty.
    + --no-temporary-worktree Fail instead of using a temporary worktree when the current tree is dirty.


    Write-Host "Done."
    Write-Host "Upstream source: $UpstreamRepo"
    Write-Host "Upstream cache used for subtree split: $upstreamRepoForSplit"
    Write-Host "Synced commit range: $($syncResult.StartHead) -> $($syncResult.EndHead)"

The printed synced commit range can be misleading when using the temporary-worktree path. Invoke-CoreSync returns StartHead/EndHead from the temporary worktree, but the current worktree may not end up at EndHead if you skip cherry-picking the rewrite commit (and instead apply the rewrite without committing). Consider recomputing EndHead from the current repo after cherry-picking (or have Invoke-TemporaryWorktreeSync return the actual applied range).
Suggested change:

    - Write-Host "Done."
    - Write-Host "Upstream source: $UpstreamRepo"
    - Write-Host "Upstream cache used for subtree split: $upstreamRepoForSplit"
    - Write-Host "Synced commit range: $($syncResult.StartHead) -> $($syncResult.EndHead)"
    + # Recompute the effective end commit from the current repository after syncing,
    + # since the temporary worktree's EndHead may not match the final HEAD.
    + $printedEndHead = Get-GitOutput -Repo $repoRoot -Arguments @("rev-parse", "HEAD")
    + Write-Host "Done."
    + Write-Host "Upstream source: $UpstreamRepo"
    + Write-Host "Upstream cache used for subtree split: $upstreamRepoForSplit"
    + Write-Host "Synced commit range: $($syncResult.StartHead) -> $printedEndHead"


    if ($prefixStatus) {
        Ensure-GitIdentity -Repo $WorkRepoRoot
        Invoke-Git -Repo $WorkRepoRoot -Arguments @("add", "--", $Prefix)
        Invoke-Git -Repo $WorkRepoRoot -Arguments @("commit", "-m", "Rewrite vendored CuTe namespace to flash_sparse_attn.ops.cute")

$RewriteCommitMessage is used to detect the rewrite commit later, but the commit message here is hard-coded. If either string changes, the skip logic in Invoke-TemporaryWorktreeSync can stop matching. Use $RewriteCommitMessage when creating the commit to keep the behavior consistent.
Suggested change:

    - Invoke-Git -Repo $WorkRepoRoot -Arguments @("commit", "-m", "Rewrite vendored CuTe namespace to flash_sparse_attn.ops.cute")
    + Invoke-Git -Repo $WorkRepoRoot -Arguments @("commit", "-m", $RewriteCommitMessage)

Summary
Root Cause
Changes
flash_sparse_attn.ops.cute.
sync_cute_subtree script for better error handling and temporary worktree support.
Reproduction
Tests
Compatibility
flash_attn.cute imports.
Checklist