Conversation

makaveli10

Convert 1bitLLM/bitnet_b1_58-large to TQ2_0:

```
python convert_hf_to_gguf.py 1bitLLM/bitnet_b1_58-large --remote --outtype f32
./build_vulkan/bin/llama-quantize 1bitLLM-bitnet_b1_58-large-f32.gguf TQ2_0
```
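For context on what llama-quantize produces here: TQ2_0 is ggml's ternary quantization type, storing each weight as a 2-bit value in {0, 1, 2} with one f16 scale per 256-element block. Below is a minimal C sketch of that idea; the struct, field order, element ordering, and function names are illustrative assumptions for this comment, not ggml's actual API:

```c
#include <stdint.h>
#include <string.h>

#define QK_K 256 // ggml's super-block size

// Illustrative TQ2_0-style block: 256 ternary weights packed 4 per byte,
// plus one f16 scale. (Assumption: layout simplified vs. ggml's block_tq2_0.)
typedef struct {
    uint8_t  qs[QK_K / 4]; // 2 bits per weight, values in {0, 1, 2}
    uint16_t d;            // per-block scale as IEEE half-precision bits
} block_tq2_0_sketch;

// Minimal IEEE f16 -> f32 conversion, enough for a sketch.
static float f16_to_f32(uint16_t h) {
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp  = (h >> 10) & 0x1F;
    uint32_t mant = h & 0x3FF;
    uint32_t bits = (exp == 0)  ? sign                                 // zero/subnormal -> 0
                  : (exp == 31) ? (sign | 0x7F800000u | (mant << 13))  // inf/NaN
                  :               (sign | ((exp + 112u) << 23) | (mant << 13));
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

// Dequantize one block: q in {0,1,2} maps to (q - 1) * d, i.e. {-d, 0, +d}.
void dequantize_block_tq2_0(const block_tq2_0_sketch *x, float y[QK_K]) {
    const float d = f16_to_f32(x->d);
    for (int i = 0; i < QK_K / 4; ++i) {
        for (int s = 0; s < 4; ++s) {
            const int q = (x->qs[i] >> (2 * s)) & 0x3;
            y[4 * i + s] = d * (float)(q - 1);
        }
    }
}
```

This is where the ~2.06 bits-per-weight figure comes from: 512 bits of packed trits plus 16 scale bits per 256 weights is 2.0625 bpw.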

Run inference on Vulkan with:

```
./build_vulkan/bin/llama-cli -m ggml-model-TQ2_0.gguf -if -p "Hello" -ngl 999
```
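As a follow-up sanity check, the same build can report raw throughput without an interactive session (a minimal invocation, assuming llama-bench was built alongside llama-cli and that the quantized file kept the default output name from the step above):

```
./build_vulkan/bin/llama-bench -m ggml-model-TQ2_0.gguf -ngl 999
```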

makaveli10 and others added 23 commits July 30, 2025 16:58
This fixes the vkDeviceLostError on Mali
@makaveli10 changed the title from "Integrate TQ2_0 into vulkan" to "Draft: Integrate TQ2_0 into vulkan" on Sep 29, 2025
@zoq mentioned this pull request on Oct 9, 2025