Windows dependencies are a problem right now. Note for future Windows attempts #30

Windows is a tricky beast for running this model. I decided to test it on the only box I have with the capacity: a Windows 11 machine.

This is as far as I can get:
- Downloaded the 7-8GB SpikingBrain-7B model weights
- Installed PyTorch 2.6.0 with CUDA 12.4 support
- Set up all CUDA paths and the Triton compiler
- Model loads successfully, and I can see GPU utilization at 30-90%
- Triton CUDA kernels compile correctly for my RTX 3080 Ti
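
For anyone retracing these steps, a small script along the following lines can confirm what is actually installed before attempting a model load. This is a minimal sketch, not part of the project: the helper names are hypothetical, and the distribution names `torch`, `triton`, and `flash-attn` are assumptions that may differ on your machine (Windows Triton builds in particular may be published under another name).

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe_environment(
    dist_names: Iterable[str] = ("torch", "triton", "flash-attn"),
) -> Dict[str, Optional[str]]:
    """Map each distribution name (assumed names, adjust as needed) to its version."""
    return {name: installed_version(name) for name in dist_names}


def cuda_summary() -> Optional[dict]:
    """Report what PyTorch sees of CUDA, or None if PyTorch cannot be imported."""
    try:
        import torch
    except (ImportError, OSError):  # OSError covers DLL load failures on Windows
        return None
    if not torch.cuda.is_available():
        return {"available": False}
    return {
        "available": True,
        "cuda_runtime": torch.version.cuda,
        "device": torch.cuda.get_device_name(0),
    }


if __name__ == "__main__":
    for name, version in describe_environment().items():
        print(f"{name}: {version or 'not installed'}")
    print("CUDA:", cuda_summary())
```

Pasting the output of a script like this into a bug report makes it easier for others to compare setups.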

Current blocker:
The model requires flash-attn (Flash Attention 2.7.3), which doesn't officially support Windows. This is blocking the forward pass.
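
One thing that may be worth trying is detecting whether `flash_attn` is importable and, if not, asking Transformers for a different attention backend. The sketch below only picks the string. Whether this model's custom code honors `attn_implementation`, or has any non-flash code path at all, is an assumption I have not verified, and `pick_attention_backend` is a hypothetical helper.

```python
import importlib.util
from typing import Optional


def flash_attn_available() -> bool:
    """True if the flash_attn module can be found, without importing it."""
    return importlib.util.find_spec("flash_attn") is not None


def pick_attention_backend(flash_available: Optional[bool] = None) -> str:
    """Choose a value for the Transformers `attn_implementation` argument.

    Falls back to PyTorch's scaled-dot-product attention ("sdpa") when
    flash-attn is missing. ASSUMPTION: the model supports that fallback.
    """
    if flash_available is None:
        flash_available = flash_attn_available()
    return "flash_attention_2" if flash_available else "sdpa"


if __name__ == "__main__":
    backend = pick_attention_backend()
    print("attn_implementation =", backend)
    # Hypothetical usage; model_path is a placeholder for your local weights:
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained(
    #     model_path, attn_implementation=backend, trust_remote_code=True
    # )
```

If the model code imports `flash_attn` unconditionally, this will not help, and a Linux environment remains the more likely route.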

The answer may be WSL2 on Windows 11 for folks like me who are (for the moment) stuck with their best hardware in a Windows environment. I'll update this comment if that works, as it may help others.
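
If you go the WSL2 route, it can help to confirm from inside Python that you really are running under WSL2 and not native Windows or WSL1. A minimal sketch, assuming the stock Microsoft kernel, whose release string contains "microsoft" and ends in "WSL2" (custom kernels may not follow this pattern); `wsl_flavor` is a hypothetical helper name.

```python
import platform
from typing import Optional


def wsl_flavor(release: Optional[str] = None) -> Optional[str]:
    """Classify a kernel release string as "wsl2", "wsl1", or None.

    With no argument, inspects the running kernel via platform.uname().
    ASSUMPTION: stock WSL kernels embed "microsoft" in the release string,
    and WSL2 kernels additionally embed "WSL2".
    """
    if release is None:
        release = platform.uname().release
    lowered = release.lower()
    if "microsoft" not in lowered:
        return None
    return "wsl2" if "wsl2" in lowered else "wsl1"


if __name__ == "__main__":
    flavor = wsl_flavor()
    print("Running under:", flavor or "native OS (no WSL detected)")
```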
