About the environment #9

@sove45

Following the README, I ran:
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
conda install pytorch-cluster pytorch-scatter pytorch-sparse -c pyg -y
(uni3r) zl@zl-8-3090:/home/dhw/zjb_workspace/uni3r/Uni3R-main$ conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
conda install pytorch-cluster pytorch-scatter pytorch-sparse -c pyg -y
Channels:

All requested packages already installed.

Channels:

All requested packages already installed.

pip install -r requirements.txt
After these commands, when I then ran

pip install flash-attn --no-build-isolation
the following error occurred:
(1.3.0)
Building wheels for collected packages: flash-attn
DEPRECATION: Building 'flash-attn' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the --use-pep517 option, (possibly combined with --no-build-isolation), or adding a pyproject.toml file to the source tree of 'flash-attn'. Discussion can be found at pypa/pip#6334
Building wheel for flash-attn (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [296 lines of output]

  torch.__version__  = 2.1.0
  
  
  /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/__init__.py:85: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
  !!
  
          ********************************************************************************
          Requirements should be satisfied by a PEP 517 installer.
          If you are using pip, you can try `pip install --use-pep517`.
          ********************************************************************************
  
  !!
    dist.fetch_build_eggs(dist.setup_requires)
  running bdist_wheel
  Guessing wheel URL:  https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
  Precompiled wheel not found. Building from source...
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-cpython-310
  creating build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/benchmark_attn.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/__init__.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/benchmark_split_kv.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/generate_kernels.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/benchmark_flash_attention_fp8.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/setup.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/test_util.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/test_flash_attn.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/test_kvcache.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/padding.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/flash_attn_interface.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/test_attn_kvcache.py -> build/lib.linux-x86_64-cpython-310/hopper
  copying hopper/benchmark_mla_decode.py -> build/lib.linux-x86_64-cpython-310/hopper
  creating build/lib.linux-x86_64-cpython-310/flash_attn
  copying flash_attn/flash_blocksparse_attn_interface.py -> build/lib.linux-x86_64-cpython-310/flash_attn
  copying flash_attn/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn
  copying flash_attn/flash_attn_triton_og.py -> build/lib.linux-x86_64-cpython-310/flash_attn
  copying flash_attn/bert_padding.py -> build/lib.linux-x86_64-cpython-310/flash_attn
  copying flash_attn/flash_attn_triton.py -> build/lib.linux-x86_64-cpython-310/flash_attn
  copying flash_attn/flash_attn_interface.py -> build/lib.linux-x86_64-cpython-310/flash_attn
  copying flash_attn/flash_blocksparse_attention.py -> build/lib.linux-x86_64-cpython-310/flash_attn
  creating build/lib.linux-x86_64-cpython-310/flash_attn/ops
  copying flash_attn/ops/activations.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops
  copying flash_attn/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops
  copying flash_attn/ops/rms_norm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops
  copying flash_attn/ops/fused_dense.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops
  copying flash_attn/ops/layer_norm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops
  creating build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/pretrained.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/benchmark.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/library.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/distributed.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/generation.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/testing.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  copying flash_attn/utils/torch.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils
  creating build/lib.linux-x86_64-cpython-310/flash_attn/layers
  copying flash_attn/layers/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/layers
  copying flash_attn/layers/patch_embed.py -> build/lib.linux-x86_64-cpython-310/flash_attn/layers
  copying flash_attn/layers/rotary.py -> build/lib.linux-x86_64-cpython-310/flash_attn/layers
  creating build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/bwd_ref.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/bwd_prefill_split.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/fwd_prefill.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/fwd_decode.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/fwd_ref.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/bwd_prefill.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/interface_fa.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/utils.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/bwd_prefill_onekernel.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/fp8.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/bench.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/train.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/bwd_prefill_fused.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  copying flash_attn/flash_attn_triton_amd/test.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd
  creating build/lib.linux-x86_64-cpython-310/flash_attn/modules
  copying flash_attn/modules/mha.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules
  copying flash_attn/modules/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules
  copying flash_attn/modules/block.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules
  copying flash_attn/modules/mlp.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules
  copying flash_attn/modules/embedding.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules
  creating build/lib.linux-x86_64-cpython-310/flash_attn/losses
  copying flash_attn/losses/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/losses
  copying flash_attn/losses/cross_entropy.py -> build/lib.linux-x86_64-cpython-310/flash_attn/losses
  creating build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/bert.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/opt.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/gptj.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/gpt.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/falcon.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/vit.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/baichuan.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/llama.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/bigcode.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/btlm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  copying flash_attn/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models
  creating build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/fast_math.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/flash_bwd.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/flash_bwd_preprocess.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/hopper_helpers.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/block_info.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/utils.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/blackwell_helpers.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/flash_fwd.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/named_barrier.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/interface.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/tile_scheduler.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/pack_gqa.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/flash_fwd_sm100.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/seqlen_info.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/flash_bwd_postprocess.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/softmax.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/mma_sm100_desc.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/ampere_helpers.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/pipeline.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  copying flash_attn/cute/mask.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute
  creating build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  copying flash_attn/ops/triton/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  copying flash_attn/ops/triton/cross_entropy.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  copying flash_attn/ops/triton/k_activations.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  copying flash_attn/ops/triton/linear.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  copying flash_attn/ops/triton/mlp.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  copying flash_attn/ops/triton/layer_norm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  copying flash_attn/ops/triton/rotary.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
  running build_ext
  /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/utils/cpp_extension.py:414: UserWarning: The detected CUDA version (12.6) has a minor version mismatch with the version that was used to compile PyTorch (12.1). Most likely this shouldn't be a problem.
    warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
  /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no g++ version bounds defined for CUDA version 12.6
    warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
  building 'flash_attn_2_cuda' extension
  creating /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310
  creating /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc
  creating /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn
  creating /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src
  Emitting ninja build file /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/build.ninja...
  Compiling objects...
  Using envvar MAX_JOBS (7) as the number of workers...
  [1/73] c++ -MMD -MF /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o.d -pthread -B /home/dhw/anaconda3/envs/uni3r/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/dhw/anaconda3/envs/uni3r/include -fPIC -O2 -isystem /home/dhw/anaconda3/envs/uni3r/include -fPIC -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  FAILED: [code=1] /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o
  c++ -MMD -MF /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o.d -pthread -B /home/dhw/anaconda3/envs/uni3r/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/dhw/anaconda3/envs/uni3r/include -fPIC -O2 -isystem /home/dhw/anaconda3/envs/uni3r/include -fPIC -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp: In function ‘std::vector<at::Tensor> flash::mha_fwd(at::Tensor&, const at::Tensor&, const at::Tensor&, std::optional<at::Tensor>&, std::optional<at::Tensor>&, float, float, bool, int, int, float, bool, std::optional<at::Generator>)’:
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:489:13: error: invalid initialization of reference of type ‘const c10::optional<at::Generator>&’ from expression of type ‘std::optional<at::Generator>’
    489 |             gen_, at::cuda::detail::getDefaultCUDAGenerator());
        |             ^~~~
  In file included from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/DeprecatedTypeProperties.h:9,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:33,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/utils/variadic.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/detail/static.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/python.h:3,
                   from /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:6:
  /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Generator.h:167:75: note: in passing argument 1 of ‘T* at::get_generator_or_default(const c10::optional<at::Generator>&, const at::Generator&) [with T = at::CUDAGeneratorImpl]’
    167 | static inline T* get_generator_or_default(const c10::optional<Generator>& gen, const Generator& default_gen) {
        |                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp: In function ‘std::vector<at::Tensor> flash::mha_varlen_fwd(at::Tensor&, const at::Tensor&, const at::Tensor&, std::optional<at::Tensor>&, const at::Tensor&, const at::Tensor&, std::optional<at::Tensor>&, std::optional<const at::Tensor>&, std::optional<at::Tensor>&, std::optional<at::Tensor>&, int, int, float, float, bool, bool, int, int, float, bool, std::optional<at::Generator>)’:
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:729:13: error: invalid initialization of reference of type ‘const c10::optional<at::Generator>&’ from expression of type ‘std::optional<at::Generator>’
    729 |             gen_, at::cuda::detail::getDefaultCUDAGenerator());
        |             ^~~~
  In file included from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/DeprecatedTypeProperties.h:9,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:33,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/utils/variadic.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/detail/static.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/python.h:3,
                   from /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:6:
  /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Generator.h:167:75: note: in passing argument 1 of ‘T* at::get_generator_or_default(const c10::optional<at::Generator>&, const at::Generator&) [with T = at::CUDAGeneratorImpl]’
    167 | static inline T* get_generator_or_default(const c10::optional<Generator>& gen, const Generator& default_gen) {
        |                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp: In function ‘std::vector<at::Tensor> flash::mha_bwd(const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, std::optional<at::Tensor>&, std::optional<at::Tensor>&, std::optional<at::Tensor>&, std::optional<at::Tensor>&, float, float, bool, int, int, float, bool, std::optional<at::Generator>, std::optional<at::Tensor>&)’:
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:937:9: error: invalid initialization of reference of type ‘const c10::optional<at::Generator>&’ from expression of type ‘std::optional<at::Generator>’
    937 |         gen_, at::cuda::detail::getDefaultCUDAGenerator());
        |         ^~~~
  In file included from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/DeprecatedTypeProperties.h:9,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:33,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/utils/variadic.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/detail/static.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/python.h:3,
                   from /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:6:
  /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Generator.h:167:75: note: in passing argument 1 of ‘T* at::get_generator_or_default(const c10::optional<at::Generator>&, const at::Generator&) [with T = at::CUDAGeneratorImpl]’
    167 | static inline T* get_generator_or_default(const c10::optional<Generator>& gen, const Generator& default_gen) {
        |                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp: In function ‘std::vector<at::Tensor> flash::mha_varlen_bwd(const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, std::optional<at::Tensor>&, std::optional<at::Tensor>&, std::optional<at::Tensor>&, const at::Tensor&, const at::Tensor&, std::optional<at::Tensor>&, int, int, float, float, bool, bool, int, int, float, bool, std::optional<at::Generator>, std::optional<at::Tensor>&)’:
  /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:1166:9: error: invalid initialization of reference of type ‘const c10::optional<at::Generator>&’ from expression of type ‘std::optional<at::Generator>’
   1166 |         gen_, at::cuda::detail::getDefaultCUDAGenerator());
        |         ^~~~
  In file included from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/DeprecatedTypeProperties.h:9,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:33,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/utils/variadic.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/detail/static.h:3,
                   from /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/python.h:3,
                   from /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/flash_api.cpp:6:
  /home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/ATen/core/Generator.h:167:75: note: in passing argument 1 of ‘T* at::get_generator_or_default(const c10::optional<at::Generator>&, const at::Generator&) [with T = at::CUDAGeneratorImpl]’
    167 | static inline T* get_generator_or_default(const c10::optional<Generator>& gen, const Generator& default_gen) {
        |                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
  [2/73] /usr/local/cuda-12.6/bin/nvcc  -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src/flash_bwd_hdim192_bf16_causal_sm80.cu -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim192_bf16_causal_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  [3/73] /usr/local/cuda-12.6/bin/nvcc  -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src/flash_bwd_hdim128_fp16_causal_sm80.cu -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_fp16_causal_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  [4/73] /usr/local/cuda-12.6/bin/nvcc  -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src/flash_bwd_hdim128_bf16_causal_sm80.cu -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_bf16_causal_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  [5/73] /usr/local/cuda-12.6/bin/nvcc  -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src/flash_bwd_hdim192_bf16_sm80.cu -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim192_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  [6/73] /usr/local/cuda-12.6/bin/nvcc  -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  [7/73] /usr/local/cuda-12.6/bin/nvcc  -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src -I/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/cutlass/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/TH -I/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-12.6/include -I/home/dhw/anaconda3/envs/uni3r/include/python3.10 -c -c /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.cu -o /tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/setup.py", line 486, in run
      urllib.request.urlretrieve(wheel_url, wheel_filename)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/urllib/request.py", line 241, in urlretrieve
      with contextlib.closing(urlopen(url, data)) as fp:
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/urllib/request.py", line 216, in urlopen
      return opener.open(url, data, timeout)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/urllib/request.py", line 525, in open
      response = meth(req, response)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/urllib/request.py", line 634, in http_response
      response = self.parent.error(
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/urllib/request.py", line 563, in error
      return self._call_chain(*args)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/urllib/request.py", line 496, in _call_chain
      result = func(*args)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/urllib/request.py", line 643, in http_error_default
      raise HTTPError(req.full_url, code, msg, hdrs, fp)
  urllib.error.HTTPError: HTTP Error 404: Not Found
  
  During handling of the above exception, another exception occurred:
  
  Traceback (most recent call last):
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
      subprocess.run(
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/subprocess.py", line 526, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v', '-j', '7']' returned non-zero exit status 1.
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 35, in <module>
    File "/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/setup.py", line 526, in <module>
      setup(
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/__init__.py", line 108, in setup
      return distutils.core.setup(**attrs)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 970, in run_commands
      self.run_command(cmd)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/dist.py", line 945, in run_command
      super().run_command(command)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/tmp/pip-install-5k1ggor4/flash-attn_d78571d173624348b3e7e08df20a99e4/setup.py", line 503, in run
      super().run()
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/command/bdist_wheel.py", line 373, in run
      self.run_command("build")
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/dist.py", line 945, in run_command
      super().run_command(command)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/dist.py", line 945, in run_command
      super().run_command(command)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 93, in run
      _build_ext.run(self)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
      self.build_extensions()
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
      build_ext.build_extensions(self)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 479, in build_extensions
      self._build_extensions_serial()
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 505, in _build_extensions_serial
      self.build_extension(ext)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 254, in build_extension
      _build_ext.build_extension(self, ext)
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 560, in build_extension
      objects = self.compiler.compile(
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "/home/dhw/anaconda3/envs/uni3r/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
Running setup.py clean for flash-attn
Failed to build flash-attn
error: failed-wheel-build-for-install

× Failed to build installable wheels for some pyproject.toml based projects
╰─> flash-attn
How can I resolve this?
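
For anyone triaging this, here is a minimal sketch of how the key signals in the log above could be pulled out. Read in order, the tracebacks suggest that `setup.py` first tried to download a prebuilt wheel and got a 404, and that the source build it fell back to then failed under ninja. The helper name `summarize_flash_attn_log` and the shortened `sample` text are hypothetical and only for illustration; the actual compiler error sits earlier in the 296 lines of output and is not captured here.

```python
import re


def summarize_flash_attn_log(log: str) -> dict:
    """Extract a few diagnostic signals from a pip build log (illustrative helper)."""
    # Path of the nvcc binary used for compilation, and the CUDA version in that path.
    nvcc = re.search(r"(/\S*cuda-(\d+\.\d+)/bin/nvcc)", log)
    # Torch version as printed by flash-attn's setup.py at the top of the build.
    torch_ver = re.search(r"torch\.__version__\s*=\s*(\S+)", log)
    return {
        "prebuilt_wheel_404": "HTTP Error 404" in log,
        "source_build_failed": "ninja: build stopped" in log,
        "nvcc_path": nvcc.group(1) if nvcc else None,
        "nvcc_cuda_version": nvcc.group(2) if nvcc else None,
        "torch_version": torch_ver.group(1) if torch_ver else None,
    }


# Shortened excerpt of the log above (assumption: trimmed for readability).
sample = """\
torch.__version__  = 2.1.0
[4/73] /usr/local/cuda-12.6/bin/nvcc -c flash_bwd_hdim128_bf16_causal_sm80.cu
ninja: build stopped: subcommand failed.
urllib.error.HTTPError: HTTP Error 404: Not Found
"""

summary = summarize_flash_attn_log(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In this log the summary would show torch 2.1.0 (installed with `pytorch-cuda=12.1`) next to an nvcc from `/usr/local/cuda-12.6`. Whether that difference is the cause of the compile failure is not established by the output shown here.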
