@Wisdawn Wisdawn commented Apr 2, 2025

Add Wheel & Docs: Flash Attention 2.7.4.post1 for Py3.12 / CUDA 12.1 / PyTorch 2.5.1

This PR adds support for Python 3.12 users by providing a pre-compiled wheel for flash-attn v2.7.4.post1 and updating the documentation accordingly.

Wheel Details:

  • Package: flash-attn
  • Version: 2.7.4.post1
  • Target Environment:
    • Python: 3.12.x (cp312)
    • CUDA Toolkit: 12.1
    • PyTorch: 2.5.1+cu121
    • OS: Windows x64
  • Build Context: Successfully built using the Visual Studio 2022 LTSC v17.4.x toolchain.

Wheel File Location:

  • The .whl file (flash_attn-2.7.4.post1-cp312-cp312-win_amd64.whl) has been uploaded as a binary asset to the release tagged v2.7.4.post1_py312_cu121_torch251 on my fork (Wisdawn/flash-attention-windows). It is not included directly in this PR's file changes due to GitHub's file size limits.
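Since the wheel is distributed as a release asset rather than through PyPI, users should verify its SHA256 checksum (published in the README's "Security" section) before installing. A minimal verification sketch in Python; the `expected` value is a placeholder, not the real checksum:

```python
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Placeholder -- paste the actual checksum from the README's "Security" section.
wheel_path = "flash_attn-2.7.4.post1-cp312-cp312-win_amd64.whl"
expected = "<sha256 from the README>"

if os.path.exists(wheel_path):
    actual = sha256_of(wheel_path)
    print("Checksum OK -- safe to pip install" if actual == expected
          else "Checksum MISMATCH -- do not install")
else:
    print(f"{wheel_path} not found -- download it from the Releases page first")
```

If the digests match, the wheel can be installed with `pip install <wheel_path>`.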

README.md Updates in this PR:

  • Adds the Py3.12 wheel details to the "Available Wheels" section.
  • Adds prerequisites for the Py3.12 wheel under "Requirements".
  • Updates the "Installation" section to direct users to download wheels from the Releases page.
  • Adds SHA256 checksum for the new wheel under "Security".
  • Adds a detailed example build process for Py3.12/CUDA 12.1 under "Instructions for Building New Wheels".
  • Adds specific troubleshooting steps relevant to the Py3.12/CUDA 12.1 build under "Troubleshooting".
  • Adds a "Contributing Wheels" section explaining the PR process.
  • Minor updates to "Known Issues" and the main title.
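Because the wheel only works in the exact environment listed above (cp312, CUDA 12.1, PyTorch 2.5.1+cu121), a quick pre-install check can save users a failed import. A sketch of such a check; the helper names are illustrative, not part of the package:

```python
import sys

# Target environment for this wheel, per the PR description.
REQUIRED_PY = (3, 12)
REQUIRED_TORCH = "2.5.1+cu121"

def python_matches(version_info=sys.version_info, required=REQUIRED_PY) -> bool:
    """True if the interpreter's major.minor matches the wheel's cp312 tag."""
    return (version_info[0], version_info[1]) == required

def torch_matches(required=REQUIRED_TORCH) -> bool:
    """True if the installed PyTorch build string matches the wheel's target."""
    try:
        import torch  # may be absent in a fresh environment
    except ImportError:
        return False
    return torch.__version__ == required

if __name__ == "__main__":
    print("Python 3.12:", python_matches())
    print("PyTorch 2.5.1+cu121:", torch_matches())
```

Both checks should print `True` before attempting `pip install` on the wheel; a mismatch in either usually manifests later as a DLL load failure when importing `flash_attn`.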

Hoping this contribution helps other Windows users! Let me know if any changes are needed.
