Description
ERROR: CUDA kernel mul_mat_vec has no device code compatible with CUDA arch 520. ggml-cuda.cu was compiled for: __CUDA_ARCH_LIST__
Recently I've been trying to use my Quadro GP100, a Pascal graphics card, with KoboldCpp, and I ran into the error above.
I don't think this has anything to do with the graphics driver or the CUDA toolkit: I have CUDA 10.2, 11.7, 11.8, 12.1, and 12.4 installed, along with a driver that supports CUDA 12.4.
Furthermore, I have a Turing-generation CUDA graphics card in the same machine alongside the Pascal one. It has been running models entirely on the GPU with KoboldCpp versions 1.60.1 and (cuda12.exe) 1.79.1 through the current 1.83, all without any issues. I'm still running it right now in another KoboldCpp instance.
So I am wondering whether something broke compatibility with Pascal CUDA GPUs from version 1.79.1 onwards. I have tested versions 1.79.1, 1.81.1, 1.82.4, and 1.83, and all of them throw the same incompatibility error.
In one sense this is okay for me, since I have checked that every version from 1.63 up to 1.78 works fine on the Pascal card. But I guess it means goodbye to the current DeepSeek models on my Pascal card, since those failed on my Turing card until I came here and got the latest version at the time, which claimed to support them.
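To make the mismatch concrete, the sketch below pulls the two relevant numbers out of the log further down and compares them. It assumes the arch number in the error follows the usual `__CUDA_ARCH__` convention (major * 100 + minor * 10); the function names are illustrative and not part of KoboldCpp.

```python
import re

def arch_from_compute_capability(cc: str) -> int:
    """Convert a compute capability such as "6.0" to the arch number 600.

    Assumption: arch = major * 100 + minor * 10, as in __CUDA_ARCH__.
    """
    major, minor = cc.split(".")
    return int(major) * 100 + int(minor) * 10

def parse_log(lines):
    """Return (device compute capability, arch reported by the kernel error)."""
    device_cc = None
    kernel_arch = None
    for line in lines:
        m = re.search(r"compute capability (\d+\.\d+)", line)
        if m:
            device_cc = m.group(1)
        m = re.search(r"compatible with CUDA arch (\d+)", line)
        if m:
            kernel_arch = int(m.group(1))
    return device_cc, kernel_arch

# Two lines copied from the log in this report (error path shortened).
log = [
    "Device 0: Quadro GP100, compute capability 6.0, VMM: yes",
    "mmv.cu:51: ERROR: CUDA kernel mul_mat_vec has no device code "
    "compatible with CUDA arch 520.",
]

device_cc, kernel_arch = parse_log(log)
device_arch = arch_from_compute_capability(device_cc)
print(f"device arch: {device_arch}, arch seen by kernel: {kernel_arch}")
print("match:", device_arch == kernel_arch)
```

If that assumption holds, the kernel was running code built for arch 520 (compute capability 5.2) rather than 600, which would point at the set of architectures included in the build rather than at my driver or toolkit.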
--- Additional information ---
Operating system: Windows 10 21H2, with no guest OSes or VMs running on it (Hyper-V is also disabled); the error still persists.
Here is the command-line output from version 1.79.1. The errors shown on versions 1.82.4 and 1.83 differ only slightly, mainly near the top of the output.
Welcome to KoboldCpp - Version 1.79.1
For command line arguments, please refer to --help
Note: KoboldCpp has detected that a significant amount of GPU VRAM (6451.0 MB) is currently used by another application.
For best results, you may wish to close that application and then restart KoboldCpp.
Auto Selected CUDA Backend...
Attempting to use CuBLAS library for faster prompt ingestion. A compatible CuBLAS will be required.
Initializing dynamic library: koboldcpp_cublas.dll
Namespace(benchmark=None, blasbatchsize=512, blasthreads=8, chatcompletionsadapter=None, config=None, contextsize=4096, debugmode=0, draftamount=8, draftmodel=None, flashattention=False, forceversion=0, foreground=False, gpulayers=800, highpriority=False, hordeconfig=None, hordegenlen=0, hordekey='', hordemaxctx=0, hordemodelname='', hordeworkername='', host='localhost', ignoremissing=False, launch=False, lora=None, mmproj=None, model='', model_param='X:/qwen2.5-v1.gguf', multiplayer=False, multiuser=1, noavx2=False, noblas=False, nocertify=False, nofastforward=False, nommap=False, nomodel=False, noshift=False, onready='', password=None, port=5001, port_param=5002, preloadstory=None, prompt='', promptlimit=100, quantkv=0, quiet=False, remotetunnel=False, ropeconfig=[0.0, 10000.0], sdclamped=0, sdclipg='', sdclipl='', sdconfig=None, sdlora='', sdloramult=1.0, sdmodel='', sdquant=False, sdt5xxl='', sdthreads=8, sdvae='', sdvaeauto=False, showgui=False, skiplauncher=False, smartcontext=False, ssl=None, tensor_split=None, threads=8, unpack='', useclblast=None, usecpu=False, usecublas=['normal', '0'], usemlock=False, usevulkan=None, whispermodel='')
Loading model: X:\qwen2.5-v1.gguf
The reported GGUF Arch is: qwen2
Arch Category: 5
Identified as GGUF model: (ver 6)
Attempting to Load...
Using automatic RoPE scaling for GGUF. If the model has custom RoPE settings, they'll be used directly instead!
It means that the RoPE values written above will be replaced by the RoPE values indicated after loading.
System Info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | AMX_INT8 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
CUBLAS: Warning, you are running Qwen2 without Flash Attention and may observe incoherent output.
Initializing CUDA/HIP, please wait, the following step may take a few minutes for first launch...
ggml_cuda_init: found 1 CUDA devices:
Device 0: Quadro GP100, compute capability 6.0, VMM: yes
llama_load_model_from_file: using device CUDA0 (Quadro GP100) - 15205 MiB free
llama_model_loader: loaded meta data with 27 key-value pairs and 579 tensors from X:\qwen2.5-v1.[output garbled]
llm_load_vocab: special tokens cache size = 22
llm_load_vocab: token to piece cache size = 0.9310 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = qwen2
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 152064
llm_load_print_meta: n_merges = 151387
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 131072
llm_load_print_meta: n_embd = 5120
llm_load_print_meta: n_layer = 48
llm_load_print_meta: n_head = 40
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 5
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 13824
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 2
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 1000000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 131072
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: ssm_dt_b_c_rms = 0
llm_load_print_meta: model type = 14B
llm_load_print_meta: model ftype = unknown, may not work (guessed)
llm_load_print_meta: model params = 14.77 B
llm_load_print_meta: model size = 8.37 GiB (4.87 BPW)
llm_load_print_meta: general.name = sakura-14b-qwen2.5-v1.0
llm_load_print_meta: BOS token = 151643 '<|endoftext|>'
llm_load_print_meta: EOS token = 151643 '<|endoftext|>'
llm_load_print_meta: EOT token = 151645 '<|im_end|>'
llm_load_print_meta: PAD token = 151643 '<|endoftext|>'
llm_load_print_meta: LF token = 148848 'ÄĬ'
llm_load_print_meta: FIM PRE token = 151659 '<|fim_prefix|>'
llm_load_print_meta: FIM SUF token = 151661 '<|fim_suffix|>'
llm_load_print_meta: FIM MID token = 151660 '<|fim_middle|>'
llm_load_print_meta: FIM PAD token = 151662 '<|fim_pad|>'
llm_load_print_meta: FIM REP token = 151663 '<|repo_name|>'
llm_load_print_meta: FIM SEP token = 151664 '<|file_sep|>'
llm_load_print_meta: EOG token = 151643 '<|endoftext|>'
llm_load_print_meta: EOG token = 151645 '<|im_end|>'
llm_load_print_meta: EOG token = 151662 '<|fim_pad|>'
llm_load_print_meta: EOG token = 151663 '<|repo_name|>'
llm_load_print_meta: EOG token = 151664 '<|file_sep|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: tensor 'token_embd.weight' (q4_K) (and 0 others) cannot be used with preferred buffer type CUDA_Host, using CPU instead
(This is not an error, it just means some tensors will use CPU instead.)
llm_load_tensors: offloading 48 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 49/49 layers to GPU
llm_load_tensors: CPU_Mapped model buffer size = 417.66 MiB
llm_load_tensors: CUDA0 model buffer size = 8148.38 MiB
...........................................................................................
Automatic RoPE Scaling: Using model internal value.
llama_new_context_with_model: n_seq_max = 1
llama_new_context_with_model: n_ctx = 4224
llama_new_context_with_model: n_ctx_per_seq = 4224
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 1000000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: n_ctx_per_seq (4224) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_kv_cache_init: CUDA0 KV buffer size = 792.00 MiB
llama_new_context_with_model: KV self size = 792.00 MiB, K (f16): 396.00 MiB, V (f16): 396.00 MiB
llama_new_context_with_model: CUDA_Host output buffer size = 0.58 MiB
llama_new_context_with_model: CUDA0 compute buffer size = 378.25 MiB
llama_new_context_with_model: CUDA_Host compute buffer size = 18.26 MiB
llama_new_context_with_model: graph nodes = 1686
llama_new_context_with_model: graph splits = 2
Load Text Model OK: True
Embedded KoboldAI Lite loaded.
Embedded API docs loaded.
Starting Kobold API on port 5002 at http://localhost:5002/api/
Starting OpenAI Compatible API on port 5002 at http://localhost:5002/v1/
Please connect to custom endpoint at http://localhost:5002
IPv6 Socket Failed to Bind. IPv6 will be unavailable.
Processing Prompt [BLAS] (2024 / 2024 tokens)
Generating (1 / 320 tokens)
d:\a\koboldcpp\koboldcpp\ggml\src\ggml-cuda\mmv.cu:51: ERROR: CUDA kernel mul_mat_vec has no device code compatible with CUDA arch 520. ggml-cuda.cu was compiled for: CUDA_ARCH_LIST
(the error line above is printed 128 times in a row)
CUDA error: unspecified launch failure
current device: 0, in function ggml_backend_cuda_synchronize at d:\a\koboldcpp\koboldcpp\ggml\src\ggml-cuda\ggml-cuda.cu:2287
cudaStreamSynchronize(cuda_ctx->stream())
d:\a\koboldcpp\koboldcpp\ggml\src\ggml-cuda\ggml-cuda.cu:72: CUDA error
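
As a sanity check that the failure is not simply a lack of VRAM, the buffer sizes in the log above can be added up and compared with the free memory reported for the card. This is a rough sketch: the KV-cache formula (context length × layers × K/V width × 2 bytes for f16) is my assumption about how the figure is derived, although it does reproduce the 396.00 MiB per K and V printed in the log.

```python
MIB = 1024 * 1024

# Values copied from the log above.
n_ctx = 4224          # llama_new_context_with_model: n_ctx
n_layer = 48          # llm_load_print_meta: n_layer
n_embd_k_gqa = 1024   # llm_load_print_meta: n_embd_k_gqa
n_embd_v_gqa = 1024   # llm_load_print_meta: n_embd_v_gqa
bytes_per_f16 = 2     # K and V are stored as f16

# Assumed formula: one K and one V vector per layer per context position.
k_mib = n_ctx * n_layer * n_embd_k_gqa * bytes_per_f16 / MIB
v_mib = n_ctx * n_layer * n_embd_v_gqa * bytes_per_f16 / MIB
kv_mib = k_mib + v_mib

model_mib = 8148.38    # CUDA0 model buffer size
compute_mib = 378.25   # CUDA0 compute buffer size
free_mib = 15205       # reported free on the Quadro GP100

total_mib = model_mib + kv_mib + compute_mib

print(f"K: {k_mib:.2f} MiB, V: {v_mib:.2f} MiB, KV total: {kv_mib:.2f} MiB")
print(f"GPU buffers total: {total_mib:.2f} MiB of {free_mib} MiB free")
print(f"headroom: {free_mib - total_mib:.2f} MiB")
```

With these numbers the GPU buffers add up to roughly 9.3 GiB against about 14.8 GiB free, so the model fits comfortably, and the crash only appears once generation reaches the mul_mat_vec kernel.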