Closed
Description
Git commit
Operating systems
Linux
GGML backends
CUDA
Problem description & steps to reproduce
On a fresh VM, building with the CUDA backend (-DGGML_CUDA=ON) fails with the error shown in the log output below. A CPU-only build works fine.
First Bad Commit
Unsure which commit introduced this, but the same build succeeded about 3 days ago.
Compile command
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt
cd llama.cpp
cmake -B build -DGGML_CUDA=ON -DLLAMA_CURL=OFF
cmake --build build --config Release -j 128
Relevant log output
[ 39%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
[ 39%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
[ 41%] Linking CXX shared library ../../bin/libggml-cpu.so
[ 41%] Built target ggml-cpu
/workspace/llama.cpp/ggml/src/ggml-cuda/conv2d.cu(104): error: more than one conversion function from "half" to a built-in type applies:
function "__half::operator float() const" (declared at line 217 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator short() const" (declared at line 235 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator unsigned short() const" (declared at line 238 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator int() const" (declared at line 241 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator unsigned int() const" (declared at line 244 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator long long() const" (declared at line 247 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator unsigned long long() const" (declared at line 250 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator __nv_bool() const" (declared at line 254 of /usr/local/cuda/include/cuda_fp16.hpp)
acc += (input_val * kernel_val);
^
detected during:
instantiation of "void conv2d_kernel<T,Layout>(const float *, const T *, float *, conv_params) [with T=half, Layout=whcn_layout]" at line 116
instantiation of "void conv2d_cuda(const float *, const T *, float *, conv_params, cudaStream_t) [with T=half]" at line 120
/workspace/llama.cpp/ggml/src/ggml-cuda/conv2d.cu(104): error: more than one conversion function from "half" to a built-in type applies:
function "__half::operator float() const" (declared at line 217 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator short() const" (declared at line 235 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator unsigned short() const" (declared at line 238 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator int() const" (declared at line 241 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator unsigned int() const" (declared at line 244 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator long long() const" (declared at line 247 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator unsigned long long() const" (declared at line 250 of /usr/local/cuda/include/cuda_fp16.hpp)
function "__half::operator __nv_bool() const" (declared at line 254 of /usr/local/cuda/include/cuda_fp16.hpp)
acc += (input_val * kernel_val);
^
detected during:
instantiation of "void conv2d_kernel<T,Layout>(const float *, const T *, float *, conv_params) [with T=half, Layout=whcn_layout]" at line 116
instantiation of "void conv2d_cuda(const float *, const T *, float *, conv_params, cudaStream_t) [with T=half]" at line 120
2 errors detected in the compilation of "/workspace/llama.cpp/ggml/src/ggml-cuda/conv2d.cu".
gmake[2]: *** [ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make:230: ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d.cu.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:1817: ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
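Reading the log, the failing expression at conv2d.cu(104) appears to multiply a float (input_val) by a half (kernel_val) when T=half. The compiler then has to convert the half to some built-in type, and every conversion operator listed above is an equally good candidate, so it reports an ambiguity. The sketch below is a host-side C++ analogue of that situation and is not the llama.cpp code: HalfLike and multiply_accumulate are hypothetical names invented for illustration, and the stand-in type declares only three of the conversions. It shows that naming the target type explicitly removes the ambiguity.

```cpp
// Hypothetical stand-in for CUDA's __half. Like the real type in the log,
// it offers several implicit conversions to built-in types.
struct HalfLike {
    float value;
    operator float() const { return value; }
    operator int() const { return static_cast<int>(value); }
    operator bool() const { return value != 0.0f; }
};

// Analogue of the accumulation step reported at conv2d.cu(104).
inline float multiply_accumulate(float acc, float input_val, HalfLike kernel_val) {
    // The original form does not compile, because float * HalfLike could use
    // any of the conversion operators above:
    //     acc += (input_val * kernel_val);
    // Requesting float explicitly selects operator float() unambiguously.
    acc += input_val * static_cast<float>(kernel_val);
    return acc;
}
```

For example, multiply_accumulate(1.0f, 2.0f, HalfLike{1.5f}) evaluates 1 + 2 * 1.5 in float. In the real kernel the analogous change would presumably be an explicit conversion of kernel_val, for instance with __half2float(kernel_val) or a float cast, but that is an assumption drawn from the log rather than a confirmed fix.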