diff --git a/README.md b/README.md
index ad27db2..f57ef4e 100644
--- a/README.md
+++ b/README.md
@@ -118,32 +118,29 @@ The following table shows the backward pass performance comparison between Flash
 
 ## Installation
 
-### Prerequisites
+### Requirements
 
-- **Python**: 3.8 or later
-- **PyTorch**: 2.0.0 or later
-- **CUDA**: 11.8 or later
+- **Linux**: Ubuntu 22.04 or later
 - **NVIDIA GPU**: Compute Capability 8.0 or higher
 - **C++ Compiler**: GCC 7+
+- **CUDA**: 11.8 or later
+- **Python**: 3.9 or later
+- **PyTorch**: 2.5.1 or later
 
-### CUDA Environment Setup
+### Install
 
-Ensure your CUDA environment is properly configured:
+You can install Flash-DMA via pre-compiled wheels:
 
 ```bash
-# Check CUDA installation
-nvcc --version
-
-# Set CUDA_HOME if needed
-export CUDA_HOME=/usr/local/cuda
+pip install flash-dmattn --no-build-isolation
 ```
 
-### Install from Source
+Alternatively, you can compile and install from source:
 
 ```bash
 git clone https://github.com/SmallDoges/flash-dmattn.git
 cd flash-dmattn
-MAX_JOBS=4 pip install . --no-build-isolation
+pip install . --no-build-isolation
 ```
 
diff --git a/README_zh.md b/README_zh.md
index f8264b6..466d431 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -118,32 +118,29 @@ Flash-DMA 是一个高性能的注意力实现，将 Flash Attention 的内存
 
 ## 安装
 
-### 先决条件
+### 依赖
 
-- **Python**: 3.8 或更高版本
-- **PyTorch**: 2.0.0 或更高版本
-- **CUDA**: 11.8 或更高版本
+- **Linux**: Ubuntu 22.04 或更高版本
 - **NVIDIA GPU**: 计算能力 8.0 或更高
 - **C++ 编译器**: GCC 7+
+- **CUDA**: 11.8 或更高版本
+- **Python**: 3.9 或更高版本
+- **PyTorch**: 2.5.1 或更高版本
 
-### CUDA 环境设置
+### 安装
 
-确保您的 CUDA 环境已正确配置:
+您可以通过预编译的轮子安装 Flash-DMA:
 
 ```bash
-# 检查 CUDA 安装
-nvcc --version
-
-# 如需要，设置 CUDA_HOME
-export CUDA_HOME=/usr/local/cuda
+pip install flash-dmattn --no-build-isolation
 ```
 
-### 从源码安装
+或者，您可以从源代码编译和安装:
 
 ```bash
 git clone https://github.com/SmallDoges/flash-dmattn.git
 cd flash-dmattn
-MAX_JOBS=4 pip install . --no-build-isolation
+pip install . --no-build-isolation
 ```
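Reviewer note: a quick way to exercise both install paths this patch documents. The `MAX_JOBS` cap is taken from the command the patch removes and is still honored by PyTorch's C++/CUDA extension builder if build memory is tight; the final import check assumes the Python module name matches the wheel name (`flash_dmattn`), which this diff does not itself confirm.

```bash
# Wheel install, as documented by the patch
pip install flash-dmattn --no-build-isolation

# Source build; MAX_JOBS (dropped from the documented command by this patch)
# can still be set to cap parallel compile jobs during compilation
git clone https://github.com/SmallDoges/flash-dmattn.git
cd flash-dmattn
MAX_JOBS=4 pip install . --no-build-isolation

# Smoke test -- assumes the import name matches the wheel name
python -c "import flash_dmattn"
```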