Closed
Git commit
Operating systems
Linux, Other? (Please let us know in description)
GGML backends
Vulkan
Problem description & steps to reproduce
I have followed all the instructions and tried every existing solution I could find for building the Vulkan backend for Android by cross-compilation, but I cannot get it to work. The CLI just aborts without any explanation.
My phone is a Redmi Note 13 Pro 5G with a Qualcomm CPU and an Adreno GPU.
I cross-compile on Linux. I also tried cross-compiling on Windows and hit exactly the same issue.
NDK versions 26 and 28 give the same result.
I have attached the log output below. Thank you in advance!
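Since the abort comes with no message, one way to get more detail might be to reproduce it over adb and then dump the device's crash log. This is a minimal sketch, not a verified procedure: the helper names `device_cmd` and `run_and_collect` are hypothetical, and the binary name `llama-cli`, the `bin/` and `lib/` layout under the install prefix, and the short test prompt are assumptions based on the push target and model path in this report.

```shell
#!/bin/sh
# Assumed locations on the device (taken from the adb push target and the
# model path that appears in the log; adjust if your layout differs).
DEVICE_DIR=/data/local/tmp/install-android
MODEL=/data/local/tmp/Qwen2-VL-2B-Instruct-Q4_K_L.gguf

# Hypothetical helper: print the command line to run on the device.
# LD_LIBRARY_PATH points the dynamic linker at the pushed shared libraries.
device_cmd() {
    printf 'cd %s && LD_LIBRARY_PATH=%s/lib ./bin/llama-cli -m %s -p hi -n 8' \
        "$DEVICE_DIR" "$DEVICE_DIR" "$MODEL"
}

# Hypothetical helper: reproduce the abort, then dump the crash buffer.
run_and_collect() {
    adb logcat -c               # clear old log entries first
    adb shell "$(device_cmd)"   # run the CLI on the device
    adb logcat -d -b crash      # dump the crash buffer and exit
}
```

Calling `run_and_collect` with the phone attached should print whatever the system recorded for the aborted process, which may include a backtrace that the console output does not show.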
First Bad Commit
No response
Compile command
cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=latest -DCMAKE_C_FLAGS=-march=armv8.4a+dotprod -DGGML_VULKAN=ON -DGGML_VULKAN_CHECK_RESULTS=OFF -DGGML_VULKAN_DEBUG=ON -DGGML_VULKAN_MEMORY_DEBUG=ON -DGGML_VULKAN_SHADER_DEBUG_INFO=ON -DGGML_VULKAN_PERF=OFF -DGGML_VULKAN_VALIDATE=OFF -DGGML_VULKAN_RUN_TESTS=OFF -DVK_USE_PLATFORM_ANDROID_KHR=ON -B build-android
cmake --build build-android --config Release -j8
cmake --install build-android --prefix install-android --config Release
adb push install-android /data/local/tmp/
Relevant log output
ggml_vk_instance_init()
ggml_vulkan: Found 1 Vulkan devices:
ggml_vk_print_gpu_info(0)
ggml_vulkan: 0 = Adreno (TM) 710 (Qualcomm Technologies Inc. Adreno Vulkan Driver) | uma: 1 | fp16: 1 | warp size: 64 | matrix cores: none
build: 4520 (2139667e) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
llama_model_load_from_file_impl: using device Vulkan0 (Adreno (TM) 710) - 7301 MiB free
llama_model_loader: loaded meta data with 37 key-value pairs and 338 tensors from /data/local/tmp/Qwen2-VL-2B-Instruct-Q4_K_L.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2vl
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2 VL 2B Instruct
llama_model_loader: - kv 3: general.finetune str = Instruct
llama_model_loader: - kv 4: general.basename str = Qwen2-VL
llama_model_loader: - kv 5: general.size_label str = 2B
llama_model_loader: - kv 6: general.license str = apache-2.0
llama_model_loader: - kv 7: general.base_model.count u32 = 1
llama_model_loader: - kv 8: general.base_model.0.name str = Qwen2 VL 2B
llama_model_loader: - kv 9: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 10: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2-VL-2B
llama_model_loader: - kv 11: general.tags arr[str,2] = ["multimodal", "image-text-to-text"]
llama_model_loader: - kv 12: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 13: qwen2vl.block_count u32 = 28
llama_model_loader: - kv 14: qwen2vl.context_length u32 = 32768
llama_model_loader: - kv 15: qwen2vl.embedding_length u32 = 1536
llama_model_loader: - kv 16: qwen2vl.feed_forward_length u32 = 8960
llama_model_loader: - kv 17: qwen2vl.attention.head_count u32 = 12
llama_model_loader: - kv 18: qwen2vl.attention.head_count_kv u32 = 2
llama_model_loader: - kv 19: qwen2vl.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 20: qwen2vl.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 21: general.file_type u32 = 15
llama_model_loader: - kv 22: qwen2vl.rope.dimension_sections arr[i32,4] = [16, 24, 24, 0]
llama_model_loader: - kv 23: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 24: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 25: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 26: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 27: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 28: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 29: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 31: tokenizer.chat_template str = {% set image_count = namespace(value=...
llama_model_loader: - kv 32: general.quantization_version u32 = 2
llama_model_loader: - kv 33: quantize.imatrix.file str = /models_out/Qwen2-VL-2B-Instruct-GGUF...
llama_model_loader: - kv 34: quantize.imatrix.dataset str = /training_dir/calibration_datav3.txt
llama_model_loader: - kv 35: quantize.imatrix.entries_count i32 = 196
llama_model_loader: - kv 36: quantize.imatrix.chunks_count i32 = 128
llama_model_loader: - type f32: 141 tensors
llama_model_loader: - type q8_0: 1 tensors
llama_model_loader: - type q4_K: 168 tensors
llama_model_loader: - type q6_K: 28 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 988.60 MiB (5.37 BPW)
load: special tokens cache size = 14
ggml_vk_get_device(0)
Initializing new vk_device
load: token to piece cache size = 0.9309 MB
print_info: arch = qwen2vl
print_info: vocab_only = 0
print_info: n_ctx_train = 32768
print_info: n_embd = 1536
ggml_vk_find_queue_family_index()
print_info: n_layer = 28
ggml_vk_find_queue_family_index()
print_info: n_head = 12
print_info: n_head_kv = 2
print_info: n_rot = 128
print_info: n_swa = 0
print_info: n_embd_head_k = 128
print_info: n_embd_head_v = 128
print_info: n_gqa = 6
print_info: n_embd_k_gqa = 256
print_info: n_embd_v_gqa = 256
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: n_ff = 8960
print_info: n_expert = 0
print_info: n_expert_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 8
print_info: rope scaling = linear
print_info: freq_base_train = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 32768
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 0
print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 1.5B
print_info: model params = 1.54 B
print_info: general.name = Qwen2 VL 2B Instruct
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 148848 'ÄĬ'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: max token length = 256
ggml_vk_create_queue()
ggml_vk_load_shaders(Vulkan0)
ggml_vulkan: Compiling shadersggml_vk_create_pipeline(ggml_vk_create_pipeline(Vulkan0, matmul_f32_f32_m, main, 3ggml_vk_create_pipeline(, 56, (64Vulkan0, matmul_f32_f32_l, main, 3ggml_vk_create_pipeline(, 56, (128,64,Vulkan0ggml_vk_create_pipeline(,ggml_vk_create_pipeline(Vulkan0ggml_vk_create_pipeline(, matmul_f32_f32_aligned_s, main, matmul_f32_f32_s, Vulkan0ggml_vk_create_pipeline(1128,1), specialization_constants, 1Vulkan0, , , 3), specialization_constants, 1, main, 56, (32Vulkan0matmul_f32_f16_l, matmul_f32_f32_aligned_m, , main, 3, 0, 0, 00, 0, 3, mainmatmul_f32_f16_m, , 0)
, 56, (128,128,1), specialization_constants, 1, 0, 0Vulkan0, 0)
main, 3, 56, (56, (32,32,1), specialization_constants, 1, 0, 0, 0)
, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
)
, matmul_f32_f32_aligned_l, main, 3, 56,64,64,1), specialization_constants, 1, 0, 0, 0)
32,1), specialization_constants, 32, 0, , (128,128,1), specialization_constants, 128, 0, 0, 0)
0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_f32_f16_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f32_f16_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f32_f16_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f32_f16_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f16acc_l, main, 3, 56, (128,128,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f16acc_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_f16_l, main, 3, 56, (128,128,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_f16acc_l, main, 3, 56, (128,128,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_f16acc_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_l, main, 3, 56, (128,128,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_aligned_l, main, 3, 56, (128,128,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_f16_f32_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_0_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_0_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_q4_0_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_0_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_0_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_0_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_1_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_1_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_1_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_1_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_1_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_1_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_q5_0_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_0_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_0_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_0_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_0_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_0_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_1_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_1_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_1_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_1_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_q5_1_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_1_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q8_0_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q8_0_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q8_0_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q8_0_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q8_0_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q8_0_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q2_k_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q2_k_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_q2_k_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q2_k_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q2_k_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q2_k_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q3_k_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q3_k_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q3_k_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q3_k_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q3_k_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q3_k_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_q4_k_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_k_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_k_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_k_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_k_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_k_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q4_k_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_k_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_k_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_k_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_q5_k_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q5_k_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q6_k_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q6_k_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q6_k_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q6_k_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q6_k_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_q6_k_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_iq4_nl_f32_f16acc_l, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_iq4_nl_f32_f16acc_m, main, 3, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
.ggml_vk_create_pipeline(Vulkan0, matmul_iq4_nl_f32_f16acc_s, main, 3, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_iq4_nl_f32_f16acc_aligned_l, main, 3, 56, (64,64,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_iq4_nl_f32_f16acc_aligned_m, main, 3, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_iq4_nl_f32_f16acc_aligned_s, main, 3, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_id_f32_f32_l, main, 4, 56, (128,128,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_id_f32_f32_m, main, 4, 56, (64,64,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_id_f32_f32_s, main, 4, 56, (32,32,1), specialization_constants, 1, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_id_f32_f32_aligned_l, main, 4, 56, (128,128,1), specialization_constants, 128, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_id_f32_f32_aligned_m, main, 4, 56, (64,64,1), specialization_constants, 64, 0, 0, 0)
ggml_vk_create_pipeline(Vulkan0, matmul_id_f32_f32_aligned_s, main, 4, 56, (32,32,1), specialization_constants, 32, 0, 0, 0)
Aborted
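
To narrow down where the abort happens, it may help to pull out the last pipeline name that the debug output mentions before `Aborted`. The following is a minimal sketch that assumes the log was saved to a file; the helper name `last_pipeline` and the file name `run.log` are hypothetical. Because the shaders appear to be compiled from several threads (the output is interleaved in places), the last name printed is only a hint and not necessarily the pipeline that failed.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the last pipeline mentioned in a
# log produced with GGML_VULKAN_DEBUG=ON.
# Usage: last_pipeline run.log
last_pipeline() {
    # Debug lines look like:
    #   ggml_vk_create_pipeline(Vulkan0, <name>, main, ...
    # Keep only the prefix up to <name>, take the last match, strip the prefix.
    grep -o 'ggml_vk_create_pipeline(Vulkan0, [A-Za-z0-9_][A-Za-z0-9_]*' "$1" \
        | tail -n 1 \
        | sed 's/.*, //'
}
```

On the log above this reports `matmul_id_f32_f32_aligned_s`, the last `matmul_id` pipeline logged before the process aborted.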